Skip to content

fix(importer): bound runaway loops on corrupt count fields in GP3-5 binary parser#2670

Open
kaizenman wants to merge 1 commit intoCoderLine:developfrom
kaizenman:fix/gp35-corrupt-input-runaway-loops
Open

fix(importer): bound runaway loops on corrupt count fields in GP3-5 binary parser#2670
kaizenman wants to merge 1 commit intoCoderLine:developfrom
kaizenman:fix/gp35-corrupt-input-runaway-loops

Conversation

@kaizenman
Copy link
Copy Markdown

@kaizenman kaizenman commented Apr 24, 2026

Issues

Fixes #2677

Proposed changes

Gp3To5Importer.readBend and readTremoloBarEffect read a pointCount integer from the input stream and then loop pointCount times, reading 9 bytes per iteration. Neither function validates that pointCount * 9 fits in the remaining stream:

const pointCount: number = IOHelper.readInt32LE(this.data);
if (pointCount > 0) {
    for (let i: number = 0; i < pointCount; i++) {
        // 9 bytes read per iteration
        IOHelper.readInt32LE(this.data);
        IOHelper.readInt32LE(this.data);
        GpBinaryHelpers.gpReadBool(this.data);
        ...
    }
}

When the parser becomes misaligned mid-stream (e.g. on a corrupted file where an earlier field shifts the cursor onto random data), pointCount is read from those random bytes and can be up to ~2^31. The loop then iterates hundreds of millions of times. ByteBuffer.readByte() returns -1 silently at EOF rather than throwing, so the loop never terminates from the input side; meanwhile each iteration allocates a BendPoint object — V8 heap grows until OOM-crash.

This is a practical denial-of-service vector against any browser page that runs AlphaTab on user-supplied input: a single ~3 KB corrupt .gp5 can OOM-crash the tab in 10–30 seconds, taking down all sibling tabs in the same Chromium renderer process.

Reproducer

test-data/guitarpro5/corrupted-bend-point-count.gp5 — a 3.8 KB GP5 file. The file's readBeatEffects flag byte mid-stream causes readTremoloBarEffect to be invoked on garbage, where the pointCount field reads as 587,530,544 against a remaining stream of 1,413 bytes — a 6-orders-of-magnitude mismatch.

Pre-fix: Node OOM-crashes after ~10 s with V8 reporting Reached heap limit.
Post-fix: a typed UnsupportedFormatError is thrown immediately.

How other parsers handle this fixture

For context — the same fixture exposes downstream defects in other parsers as well (out of scope for this PR but useful as a sanity check that the file is genuinely malformed):

  • PyGuitarPro rejects the file at parse time with a typed GPException annotated with track / measure / voice / beat coordinates. Python's struct.unpack raises on insufficient bytes, which acts as an automatic safety net at every read — the equivalent of which ByteBuffer.readByte lacks (it returns -1 silently at EOF).
  • Guitar Pro 5.2 desktop hangs on the file.
  • TuxGuitar parses the file successfully but throws ArithmeticException: / by zero during playback (likely a separate downstream bug on a zero-valued field that survived the malformed parse).

This PR only addresses the AlphaTab parse-time DoS. The other parsers' downstream behaviors are not affected.

Fix

Add a bounds check before entering the loop. The check is not a magic cap (MAX_BEND_POINTS = 60) but a mathematical constraint of the format: pointCount × bytes_per_iteration must fit in the remaining stream. Anything larger is impossible by conservation of bytes.

const pointCount: number = IOHelper.readInt32LE(this.data);
Gp3To5Importer._requireFits(this.data, pointCount, 9, 'bend point count');
if (pointCount > 0) {
    for (let i = 0; i < pointCount; i++) { ... }
}

private static _requireFits(data: IReadable, count: number, bytesPerItem: number, fieldName: string): void {
    const remaining = data.length - data.position;
    if (count < 0 || count * bytesPerItem > remaining) {
        throw new UnsupportedFormatError(
            `${fieldName}=${count} (${count * bytesPerItem} bytes) exceeds remaining ${remaining} bytes`
        );
    }
}

A real bend has at most ~30 points (30 × 9 = 270 bytes) — well below any stream's remaining capacity, so the check never fires on valid input. On corrupt or crafted input where pointCount is millions or negative, the check fires immediately, before any allocation, and surfaces a typed error the caller can handle.

Coverage

The same loop pattern appears in two functions; both are fixed:

Function File Bytes per iteration
readBend (Gp3To5Importer.ts:1334) bend points 9 (i32 + i32 + bool)
readTremoloBarEffect (Gp3To5Importer.ts:1036) whammy points 9 (i32 + i32 + bool)

This covers GP3 / GP4 / GP5 (all three formats share Gp3To5Importer). GP6 (GpxImporter, BCFZ container) and GP7/8 (Gp7To8Importer + GpifParser, XML-based) do not have analogous binary count-driven loops in this importer path; if similar issues exist elsewhere they would be addressed separately.

Scope

Four other count-driven loops in Gp3To5Importer follow the same pattern and are theoretically vulnerable to the same DoS class (barCount, trackCount, beatCount headers). They are tracked separately in #2678, where the architectural choice between (a) extending _requireFits to those loops and (b) making ByteBuffer.readByte() throw at EOF can be made independently of merging this PR.

Tests

Gp5ImporterTest > corrupted-bend-point-count — loads the fixture and asserts UnsupportedFormatError is thrown. Pre-fix the test would never finish (Node OOM); post-fix it completes in <100 ms.

All 173 GP importer tests pass (16 GP3 + 21 GP4 + 40 GP5 + 42 GP7 + 26 GP8 + 28 GPX).

Note

One of three independent fixes in the same area. The others address distinct root causes:

Checklist

  • I consent that this change becomes part of alphaTab under it's current or any future open source license
  • Changes are implemented
  • New tests were added

Further details

  • This is a breaking change
  • This change will require update of the documentation/website

@kaizenman kaizenman marked this pull request as draft April 24, 2026 23:38
@kaizenman kaizenman force-pushed the fix/gp35-corrupt-input-runaway-loops branch from 9827cb8 to 598fb34 Compare April 24, 2026 23:46
…inary parser

Gp3To5Importer.readBend and readTremoloBarEffect read a pointCount
integer from the input stream and then loop pointCount times, reading
9 bytes per iteration. Neither validates that pointCount * 9 fits in
the remaining stream.

When the parser becomes misaligned mid-stream (corrupt or crafted
input) pointCount can be read from random bytes and reach ~2^31.
ByteBuffer.readByte returns -1 silently at EOF rather than throwing,
so the loop never terminates from the input side; each iteration
allocates a BendPoint object and V8 OOM-crashes after ~10s.

Reject counts on the basis that count * bytesPerItem must fit in the
remaining stream — a mathematical constraint of the format, not a
magic cap. Real bends have ~30 points (270 bytes), well below any
stream's remaining capacity, so the check never fires on valid input.

Fixture: a 3.8 KB GP5 from a real-world corpus where pointCount reads
as 587530544 against 1413 bytes remaining (6 orders of magnitude
mismatch). Pre-fix Node OOM-crashes; post-fix throws
UnsupportedFormatError in <100ms.
@kaizenman kaizenman force-pushed the fix/gp35-corrupt-input-runaway-loops branch from 598fb34 to 988c629 Compare April 24, 2026 23:53
@kaizenman kaizenman marked this pull request as ready for review April 24, 2026 23:54
@kaizenman
Copy link
Copy Markdown
Author

Note on scope

This PR adds bounds checks to the two count-driven loops empirically observed to OOM-crash on real-world corrupted files (readBend and readTremoloBarEffect). Four other count-driven loops in Gp3To5Importer follow the same pattern and are theoretically vulnerable to the same DoS:

  • for i = 0; i < this._barCount; i++ — header bar count (line 336, 607)
  • for i = 0; i < this._trackCount; i++ — header track count (line 452)
  • for i = 0; i < beatCount; i++ — voice beat count (line 654)

A crafted (vs naturally-corrupted) file declaring barCount = 2_000_000_000 would still bypass this PR.

A more holistic alternative would be to make ByteBuffer.readByte() throw EndOfReaderError at EOF instead of returning -1 silently. That would terminate any past-EOF runaway loop after a single iteration, regardless of which count field drove it. The trade-off is contract-changing and would need careful audit of existing callers that may rely on the -1 sentinel.

I went with targeted bounds checks for this PR because:

  1. They're guaranteed not to break any existing behavior on valid input.
  2. They cover every case actually seen in a 1M-file corpus sweep.
  3. The change is small and easy to review.

Happy to extend to the remaining four loops with the same _requireFits helper, or rework as the holistic readByte-throws change, depending on preference.

@Danielku15
Copy link
Copy Markdown
Member

Thanks for the fixes, can you open counterpart bug reports following the template so we have proper tracking of these items? Also: the issue and PR templates are not optional, please be sure to update your description accordingly.

@Danielku15
Copy link
Copy Markdown
Member

@kaizenman On the issues please use the issue template: https://github.com/CoderLine/alphaTab/issues/new?template=bug_report_form.yml I know it can be a bit cumbersome, but it really helps the project to track things properly (problem&background, expectations, how to reproduce, what version etc.)

@kaizenman
Copy link
Copy Markdown
Author

Sorry about that — my fault for filing the first round via the API instead of the template URL, which is why labels and assignment didn't apply.

I've now re-filed all four through the bug report form so they pick up state-needs-triage and the assignment automatically:

The old four are closed with a "Replaced by #…" pointer. The PR descriptions are updated to reference the new issue numbers. Let me know if anything else still needs adjusting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

readBend/readTremoloBarEffect OOM-crash on unbounded count fields (DoS)

2 participants