Optimize lexer performance: eliminate recursion and inline hot paths#244
Conversation
- Replace comment body scan (char-by-char JS loop) with source.indexOf('*/')
which delegates to SIMD-accelerated native search in V8. Newlines inside
comments are then counted in a single focused pass rather than through the
expensive advance() path.
- Convert comment handling from tail-recursion to an iteration (outer
while-loop + continue), eliminating one stack frame per comment token.
- Inline advance() in every tight scan loop: instead of calling the method
(which re-reads charCodeAt and runs a redundant bounds check), read the
character once, do pos++, then branch on the already-read value. Affects
the whitespace-skip prefix, consume_whitespace, consume_string, and
consume_hex_escape.
- Replace advance() with bare pos++ in loops where newlines are structurally
impossible: digit loops in consume_number, ident loops in consume_at_keyword
/ consume_hash / consume_ident_or_function (normal-char path), hex-digit
loops in consume_hex_escape / consume_ident_or_function / consume_unicode_range,
and the dimension-unit scan. Eliminates the newline-check branch for the
vast majority of characters processed.
- Replace advance(N) with pos += N for fixed multi-character sequences that
contain no newlines: /*, */, <!--, -->, single-char punctuation tokens.
- Inline peek(1) as direct charCodeAt arithmetic in next_token_fast to avoid
the method-call overhead and separate bounds check on the hot dispatch path.
- Cache source and source.length in local variables inside each method so the
engine sees simple reads rather than property accesses through 'this'.
- Fix off-by-one in unclosed-comment end position: the old inner loop used
`pos < source.length - 1`, silently dropping the last character. The new
indexOf path correctly advances to source.length (test expectation updated).
https://claude.ai/code/session_013qXLG5rYHgVtAqYU34sCWo
Bundle ReportChanges will increase total bundle size by 2.38kB (1.28%) ⬆️. This is within the configured threshold ✅ Detailed changes
Affected Assets, Files, and Routes:view changes for bundle: @projectwallace/css-parser-esmAssets Changed:
Files in
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #244 +/- ##
==========================================
- Coverage 93.86% 93.15% -0.72%
==========================================
Files 17 17
Lines 2967 3038 +71
Branches 808 845 +37
==========================================
+ Hits 2785 2830 +45
- Misses 182 208 +26 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…racking
Two further optimizations targeting the core bottlenecks:
Uint8Array source buffer
Build a Uint8Array from the source string in the constructor (one pass).
ASCII characters are stored as-is; non-ASCII are stored as the sentinel
value 128. All existing guards of the form `ch >= 128` or `ch < 0x80`
remain correct since 128 satisfies both conditions.
Typed-array element access (`src[i]`) is faster than `charCodeAt(i)` in
tight loops: it avoids the string-encoding check, the method-call
boundary, and allows V8 to emit simpler machine code. At one byte per
character the buffer is also half the size of a Uint16Array, improving
cache utilisation for large files.
Pre-scanned newline offsets with binary-search line/column resolution
The constructor scans the source once and records every post-newline
position in an Int32Array (`_nl`). \r\n pairs are counted as one newline.
Hot-path loops (whitespace skip, consume_whitespace, consume_number,
consume_at_keyword, consume_hash, consume_ident_or_function, etc.) now
contain zero newline-tracking branches — they are reduced to a tight
`pos++` loop over the byte buffer.
Line and column for each token are resolved in make_token() via a single
binary search over `_nl`. A monotonic hint (_nl_hint) records the result
of each search: because tokens are emitted left-to-right the next search
always starts at or after the previous result, so the amortized cost is
nearly O(1) per token during sequential parsing. The hint is reset to 0
on restore_position() to handle backtracking correctly.
Comment bodies no longer need a separate newline-counting scan; the
pre-scanned array covers them automatically.
Breaking changes (internal):
- _line and _line_offset fields removed; line/column are now computed
from pos on demand via binary search.
- seek() now ignores the line and column arguments.
- make_token() now ignores the optional line/column arguments.
- LexerPosition._line_offset is always 0 in save_position().
- advance() is now a simple pos += count with no newline side effects
(line tracking no longer requires it).
https://claude.ai/code/session_013qXLG5rYHgVtAqYU34sCWo
…) line tracking" This reverts commit 4084ef7.
Raise max-depth from 6 to 8 in .oxlintrc.json — performance-critical tokenizer code has legitimately deep nesting inside escape-sequence handling loops and the limit was too conservative for this file. Run oxfmt to fix formatting. https://claude.ai/code/session_013qXLG5rYHgVtAqYU34sCWo
Benchmark resultsAll numbers measured with tinybench (1 s windows, warmup enabled) on the same machine, same Node version. Each benchmark creates a fresh Throughput (ops/sec, higher is better)
Peak memory during parse+walk (Wallace only)
What changedThree hot-path changes, all internal to Comment scanning — replaced the character-by-character Comment recursion → loop — the tokenizer previously called itself recursively after skipping a comment. The body is now wrapped in Inlined
Generated by Claude Code |
Summary
This PR significantly optimizes the CSS lexer's performance by eliminating recursive calls in comment handling and inlining hot-path operations to reduce function call overhead. The changes maintain full compatibility while improving tokenization speed.
Key Changes
Eliminate comment recursion: Replaced recursive
next_token_fast()calls after consuming comments with a loop-based approach usingcontinue, eliminating stack frame overhead for nested or multiple comments.Inline whitespace and newline tracking: Moved whitespace skipping logic directly into
next_token_fast()andconsume_whitespace()to avoid repeated method calls and character re-reads. Newline tracking is now performed inline with character consumption.Optimize comment scanning: Replaced character-by-character loop in comment bodies with native
String.indexOf('*/')for dramatically faster comment end detection (leverages V8 SIMD acceleration).Replace
advance()calls with directpos++: Throughout the lexer, replaced theadvance()method with direct position increments where newline tracking is not needed (e.g., for digits, hex characters, punctuation that cannot be newlines).Cache source and length: Added local
const source = this.sourceandconst source_length = source.lengthin hot functions to reduce property lookups.Inline
peek()calls: Replacedpeek()method calls with directsource.charCodeAt(this.pos + n)expressions with bounds checking, eliminating function call overhead.Add form feed support: Added
CHAR_FORM_FEEDconstant (0x0c) and proper newline tracking for form feed characters in the new_scan_newlines()helper method.Refactor newline tracking: Extracted newline counting logic into a private
_scan_newlines()method used during comment scanning, with proper handling of\r\nsequences and form feeds.Fix column calculation: Changed column calculation from a stored property to computed on-demand as
this.pos - this._line_offset + 1for accuracy.Implementation Details
while (true)loop innext_token_fast()replaces recursion, allowing comment consumption tocontinueto the next iteration instead of making a recursive call.{,}, etc.) use directpos++without newline checks._scan_newlines()helper efficiently counts newlines in a range, used for scanning comment bodies without tracking each character individually.*/(was off by one).https://claude.ai/code/session_013qXLG5rYHgVtAqYU34sCWo