perf(dicer): avoid regex when parsing headers #211

Merged: mcollina merged 3 commits into main from perf/header-parser-fast-path on May 24, 2026
@mcollina
Member

Summary

  • avoid regex-based header field parsing in Dicer's HeaderParser
  • scan CRLF-delimited header lines directly instead of using buffer.split(/\r\n/g)
  • preserve folded-header handling and existing malformed-header behavior
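The direct-scan idea in the summary can be sketched as follows. This is a minimal illustration, not the code from the PR: `scanHeaderLines` and `parseHeaders` are hypothetical names, the input is assumed to be a complete header block held as a string, and malformed lines are simply skipped here, which may differ from Dicer's actual malformed-header behavior.

```javascript
// Hypothetical helper: walk the block with indexOf('\r\n') instead of
// block.split(/\r\n/g), unfolding continuation lines as we go.
function scanHeaderLines (block) {
  const lines = []
  let start = 0
  while (start < block.length) {
    let end = block.indexOf('\r\n', start)
    if (end === -1) end = block.length
    const line = block.slice(start, end)
    const first = line.charCodeAt(0)
    if ((first === 32 || first === 9) && lines.length > 0) {
      // Folded header: a line starting with space or tab continues the previous one.
      lines[lines.length - 1] += line
    } else if (line.length > 0) {
      lines.push(line)
    }
    start = end + 2
  }
  return lines
}

// Hypothetical helper: split each line on the first colon without a regex.
function parseHeaders (block) {
  const headers = Object.create(null)
  for (const line of scanHeaderLines(block)) {
    const colon = line.indexOf(':')
    if (colon <= 0) continue // assumption: malformed lines are skipped in this sketch
    const name = line.slice(0, colon).toLowerCase()
    const value = line.slice(colon + 1).trim()
    if (headers[name] === undefined) headers[name] = []
    headers[name].push(value)
  }
  return headers
}
```

Avoiding `split` with a regex means no intermediate array of substrings is built by the regex engine; each line is sliced only once, when it is consumed.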

Performance

Local HeaderParser microbenchmark on a 3-header block:

  • before: ~1.78M ops/sec
  • after: ~2.66M ops/sec

A local @platformatic/flame run also confirmed HeaderParser._parseHeader remains a primary hotspot, but its self-time dropped in the targeted multipart workload.

Validation

  • npm test
  • npm run lint

Member

@ivan-tymoshenko left a comment

lgtm

@marcopiraccini

lgtm

@mcollina
Member Author

Added a follow-up optimization in 6d81635: HeaderParser now scans for the header terminator directly instead of routing header parsing through StreamSearch. I also added coverage for CRLFCRLF split across chunks, including after maxHeaderSize is reached.

Validation:

  • npm test
  • npm run lint
  • npm run bench:dicer
  • remote CI checks are passing
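One way to detect a CRLFCRLF terminator that is split across chunks is a small state machine that carries its progress from one chunk to the next. The sketch below is illustrative only: `TerminatorScanner` is a hypothetical name, it is not the code from 6d81635, and it does not model maxHeaderSize.

```javascript
// Hypothetical scanner: remembers how many bytes of "\r\n\r\n" have been
// matched so far, so the terminator can straddle chunk boundaries.
class TerminatorScanner {
  constructor () {
    this.matched = 0 // number of terminator bytes matched so far (0..3)
  }

  // Returns the index just past the terminator within `chunk`,
  // or -1 if the terminator has not been completed yet.
  push (chunk) {
    for (let i = 0; i < chunk.length; i++) {
      const byte = chunk[i]
      // Even positions of the pattern expect CR (13), odd positions expect LF (10).
      const expected = (this.matched & 1) === 0 ? 13 : 10
      if (byte === expected) {
        if (++this.matched === 4) {
          this.matched = 0
          return i + 1
        }
      } else {
        // A stray CR can still start a new match; anything else resets.
        this.matched = byte === 13 ? 1 : 0
      }
    }
    return -1
  }
}

const scanner = new TerminatorScanner()
scanner.push(Buffer.from('A: b\r\n\r')) // -1, terminator incomplete
scanner.push(Buffer.from('\nbody')) // 1, body starts at index 1 of this chunk
```

Because the only state is a counter, nothing from the previous chunk has to be buffered to complete the match.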

@mcollina
Member Author

Added another follow-up optimization in 978c38d, based on a full Busboy multipart flamegraph (Busboy -> Multipart -> Dicer -> HeaderParser), not just the Dicer micro/profile path.

Full-profile result on the local multipart workload:

  • before this follow-up: 7,794 parses / 7s baseline from the full Busboy flamegraph
  • after HeaderParser direct scan: 8,610 parses / 7s
  • after parse/decode fast paths: 10,397 parses / 7s

Changes in the latest commit:

  • skip Buffer/TextDecoder-style UTF-8 decoding for ASCII strings
  • add fast paths for common form-data; name=...; filename=... parsing
  • avoid parseParams() for simple per-part content-type values without parameters

Final full flamegraph artifacts are local at:

  • /tmp/busboy-full-flame-final/cpu-profile-2026-05-23T10-15-52-214Z.html
  • /tmp/busboy-full-flame-final/cpu-profile-2026-05-23T10-15-52-214Z.md

Validation:

  • npm test
  • npm run lint
  • remote CI checks are passing
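Two of the fast paths described above can be sketched roughly as follows. Both helpers are hypothetical and are not taken from 978c38d: the first assumes header text arrives as a latin1 ("binary") string holding raw bytes, and the second assumes the caller falls back to a full parameter parser whenever it gets `null`.

```javascript
// Hypothetical helper: only pay for UTF-8 decoding when a non-ASCII byte is present.
function decodeUtf8FromBinaryString (text) {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 0x7f) {
      // Slow path: reinterpret the latin1 code units as UTF-8 bytes.
      return Buffer.from(text, 'latin1').toString('utf8')
    }
  }
  // Fast path: for pure ASCII, bytes and code points coincide.
  return text
}

// Hypothetical helper: a content-type value with no ';' has no parameters,
// so it can be normalized without running a parameter parser.
function simpleContentType (value) {
  if (value.indexOf(';') !== -1) return null // assumption: caller uses the full parser
  return value.trim().toLowerCase()
}
```

Most multipart field names and filenames are plain ASCII, so a check like this keeps the common case free of buffer allocation.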

@mcollina merged commit 77b3cb2 into main on May 24, 2026
17 checks passed
@mcollina deleted the perf/header-parser-fast-path branch on May 24, 2026 12:46