perf: cache source lines, stream file bytes, and trim hot-path allocations by Thorium · Pull Request #1535 · ionide/FsAutoComplete

Thorium · 2026-06-09T14:19:33Z

Summary

Reduces allocations and redundant work in several FsAutoComplete hot paths. Unlike a "looks faster" change set, every optimization here was validated one-by-one with real BenchmarkDotNet A/B measurements (pre-optimization baseline vs. this branch), measuring both time and allocations. Changes that turned out to be empirically pointless or to break the public API for marginal gain were deliberately excluded (see below).

All changes are body/implementation-only and backward compatible — no public signature (.fsi) changes.

What's included (with evidence)

1. `RoslynSourceTextFile.Lines` — lazily cache the line array

Previously every access recomputed `sourceText.Lines |> Seq.toArray |> Array.map (_.ToString())`. Now the array is built once, on first access, and reused.

Repeated access on one file instance (2000-line file):

| Reads | Before | After |
|---|---|---|
| 1× | 249 µs / 431 KB | one-time build, then free |
| 10× | 2,291 µs / 4,306 KB | ~free after first |
| 100× | 22,487 µs / 43,063 KB | ~free after first |

Repeat reads (the common case over a file's lifetime) go from O(lines) each to effectively free; first read costs the same as before.
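A minimal sketch of the caching pattern, assuming a simplified wrapper over a plain string. `CachedLinesFile` and `BuildCount` are hypothetical names used only for illustration; the real `RoslynSourceTextFile` wraps a Roslyn `SourceText` and is not reproduced here.

```fsharp
open System

/// Hypothetical stand-in for a source-text wrapper (not the real RoslynSourceTextFile).
type CachedLinesFile(text: string) =
    // Counts how often the line array is actually built (illustration only).
    let mutable builds = 0

    let buildLines () =
        builds <- builds + 1
        text.Split([| "\r\n"; "\n" |], StringSplitOptions.None)

    // `lazy` defers the work until first access and memoizes the result.
    let lines = lazy (buildLines ())

    /// O(lines) on first access; later accesses return the cached array.
    member _.Lines: string[] = lines.Value

    member _.BuildCount = builds

let file = CachedLinesFile("let x = 1\nlet y = 2\r\nlet z = 3")
let first = file.Lines
let second = file.Lines // same array instance, no rebuild
```

One trade-off of this shape is that every caller receives the same array instance, so it only fits when callers treat the array as read-only.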

2. `FileSystem` file-content read — stream bytes via chunked `ISourceText.CopyTo`

Replaces `file.Source.ToString() |> Encoding.UTF8.GetBytes` (which materializes the whole file as an intermediate string) with a chunked copy.

| File size | Time before → after | Alloc before → after |
|---|---|---|
| 1,000 lines | 178 µs → 68 µs (−62%) | 332 KB → 356 KB (+7%) |
| 20,000 lines | 1,733 µs → 1,191 µs (−31%) | 7,067 KB → 5,544 KB (−22%) |

Clear time win at both sizes and an allocation win on large files. Small files show a slight allocation bump from MemoryStream buffer doubling — addressed in a follow-up (see below).
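The chunked approach can be sketched roughly as follows. This is an illustration, not the code in this branch: `ITextLike` is a hypothetical two-member stand-in for `ISourceText`, and the 4096-char chunk size and the `ofString` adapter are assumptions.

```fsharp
open System
open System.IO
open System.Text

/// Hypothetical minimal text abstraction; the real ISourceText has more members.
type ITextLike =
    abstract Length: int
    abstract CopyTo: sourceIndex: int * destination: char[] * destinationIndex: int * count: int -> unit

/// Encode text as UTF-8 chunk by chunk, without one large intermediate string.
let toUtf8Bytes (text: ITextLike) : byte[] =
    let chunkSize = 4096 // assumed chunk size
    let chars = Array.zeroCreate<char> chunkSize
    let bytes = Array.zeroCreate<byte> (Encoding.UTF8.GetMaxByteCount chunkSize)
    // A stateful encoder keeps surrogate pairs intact across chunk boundaries.
    let encoder = Encoding.UTF8.GetEncoder()
    use ms = new MemoryStream()
    let mutable pos = 0
    while pos < text.Length do
        let count = min chunkSize (text.Length - pos)
        text.CopyTo(pos, chars, 0, count)
        pos <- pos + count
        // Flush the encoder's internal state only on the final chunk.
        let flush = (pos >= text.Length)
        let written = encoder.GetBytes(chars, 0, count, bytes, 0, flush)
        ms.Write(bytes, 0, written)
    ms.ToArray()

/// Adapter so the sketch can be exercised with an ordinary string.
let ofString (s: string) =
    { new ITextLike with
        member _.Length = s.Length
        member _.CopyTo(sourceIndex, destination, destinationIndex, count) =
            s.CopyTo(sourceIndex, destination, destinationIndex, count) }

let utf8 = toUtf8Bytes (ofString "let x = 1")
```

Using an `Encoder` rather than calling `Encoding.UTF8.GetBytes` per chunk matters when a surrogate pair straddles two chunks; encoding each chunk independently would corrupt that character.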

3. `CompilerProjectOption.SourceFilesTagged` — fuse two passes into one

Avoids an extra `List.map`/`Array.toList` pass when tagging source-file paths. Time is dominated by `normalizePath` and stays within noise; the win is allocation:

| Path | Allocation |
|---|---|
| `TransparentCompiler` (list) | −50% |
| `BackgroundCompiler` (array) | −37% |
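As a rough illustration of what fusing the passes means for the list-producing path, here is a sketch under assumptions: `TaggedPath` and this simplified `normalizePath` are hypothetical stand-ins, and the actual change in this branch may be shaped differently.

```fsharp
/// Hypothetical tagged-path wrapper standing in for the real tagged string type.
type TaggedPath = TaggedPath of string

/// Hypothetical normalizer; the real normalizePath does more work.
let normalizePath (p: string) = TaggedPath(p.Replace('\\', '/'))

// Two passes: Array.toList allocates an intermediate list, then List.map allocates another.
let tagTwoPass (files: string[]) : TaggedPath list =
    files |> Array.toList |> List.map normalizePath

// One pass: fold from the back so the result list is built directly in order.
let tagOnePass (files: string[]) : TaggedPath list =
    Array.foldBack (fun f acc -> normalizePath f :: acc) files []
```

Both functions return the same list; the one-pass version simply skips the intermediate list, which is consistent with a win that shows up in allocation rather than in time.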

4. `processFSIArgs` — O(n²) `Array.append`-in-fold → O(n) `ResizeArray`

At its real call site (`SetFSIAdditionalArguments`, a small user-configured arg list) the impact is negligible (realistic N≈8: ~800 B saved, time within noise). Included as a correctness/scaling improvement; it is dramatic only at unrealistic sizes (N=1000: 25× faster, 47× less allocation).
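The complexity difference can be seen in a small sketch. The trimming step is a placeholder for whatever per-argument work the real `processFSIArgs` does, and both function names are hypothetical.

```fsharp
// Quadratic: each Array.append copies everything accumulated so far,
// so n arguments cause 0 + 1 + ... + (n - 1) element copies.
let collectQuadratic (args: string[]) : string[] =
    args |> Array.fold (fun acc a -> Array.append acc [| a.Trim() |]) [||]

// Linear: ResizeArray (System.Collections.Generic.List) appends in amortized O(1).
let collectLinear (args: string[]) : string[] =
    let acc = ResizeArray<string>(args.Length)
    for a in args do
        acc.Add(a.Trim())
    acc.ToArray()
```

For N≈8 the quadratic version copies only a few dozen references in total, which fits the negligible difference reported at the real call site.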

5. OTel tag `source.text` → `source.length`

The trace tag previously boxed the entire file contents as a string. Replaced with the integer length, eliminating that per-trace allocation.
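A minimal sketch of the new tag shape, assuming the trace is a `System.Diagnostics.Activity` and the source is available as a string. The helper name `tagSourceLength` is hypothetical, and the real code operates on the server's own source-text type.

```fsharp
open System.Diagnostics

/// Tag the span with the text length only, rather than the text itself.
let tagSourceLength (activity: Activity) (source: string) =
    // An Activity can be null when no listener is sampling the trace.
    if not (isNull activity) then
        activity.SetTag("source.length", box source.Length) |> ignore
```

The tag value is now a single boxed integer, so its cost no longer scales with file size.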

6. Completion retry only re-reads the file when content is actually stale

`getCompletions` now takes a `rereadFile` flag, so the document is re-read only when the error indicates stale content (line-lookup failure or trigger-char mismatch), not on every retry. This avoids redundant I/O on the hot completion path.
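The retry policy can be sketched as below. Everything here is hypothetical apart from the idea of a `rereadFile` flag: the error cases, the `retryCompletions` helper, and the simplified `getCompletions` signature are assumptions made for illustration.

```fsharp
/// Hypothetical error cases; the real code distinguishes errors differently.
type CompletionError =
    | LineLookupFailed
    | TriggerCharMismatch
    | Other of string

/// True only for errors that suggest the in-memory document is out of date.
let indicatesStaleContent err =
    match err with
    | LineLookupFailed
    | TriggerCharMismatch -> true
    | Other _ -> false

/// Retry up to `attemptsLeft` times, re-reading the file only after a stale-content error.
let rec retryCompletions
    (getCompletions: bool -> Result<string list, CompletionError>)
    (rereadFile: bool)
    (attemptsLeft: int)
    =
    match getCompletions rereadFile with
    | Ok items -> Ok items
    | Error err when attemptsLeft > 1 ->
        retryCompletions getCompletions (indicatesStaleContent err) (attemptsLeft - 1)
    | Error err -> Error err
```

The first attempt would typically be started with `rereadFile = false`, so a successful completion never touches the disk.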

What was deliberately excluded (also evidence-based)

`Lexer.tokenizeLine` define-parsing (`Array.fold` → for-loop): allocations identical (−1%), time within noise; `FSharpSourceTokenizer` creation/scan dominates. No measurable benefit; dropped.
`LoadedProject` lazy-cache of `SourceFilesTagged`: required adding a public record field (`_sourceFilesTagged: Lazy<…>`) to `AdaptiveServerState.fsi`, a source/binary-breaking public-API change that leaks an implementation detail into the signature, for a benefit that measured as marginal (the underlying computation is cheap, per item 3 above). Dropped: compatibility cost outweighed the gain.

Follow-up (not in this PR)

Pre-sizing the byte buffer in item 2 (`new MemoryStream(length)`) measured as a large further win (1k lines: 68 µs → 14 µs, 356 KB → 170 KB; 20k lines: 1,191 µs → 725 µs, 5,544 KB → 2,866 KB) and removes the small-file allocation bump. Can be added here or as a separate PR.
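To illustrate why pre-sizing helps, the sketch below writes the same data in chunks into a default `MemoryStream` and into one created with an initial capacity, then compares final capacities. The `writeChunked` helper and the sizes are assumptions for illustration, not the follow-up's code.

```fsharp
open System.IO

/// Write `data` in chunks and report the stream's final capacity.
let writeChunked (ms: MemoryStream) (data: byte[]) (chunkSize: int) =
    let mutable pos = 0
    while pos < data.Length do
        let count = min chunkSize (data.Length - pos)
        ms.Write(data, pos, count)
        pos <- pos + count
    ms.Capacity

let data = Array.zeroCreate<byte> 10000

// Default stream: the buffer is reallocated and copied each time it runs out of room.
let grownCapacity = writeChunked (new MemoryStream()) data 4096

// Pre-sized stream: a single buffer of the right size, no regrowth.
let preSizedCapacity = writeChunked (new MemoryStream(data.Length)) data 4096
```

If the capacity passed in is a character count rather than a byte count, non-ASCII text can still exceed it and trigger growth, so the initial capacity acts as a hint rather than a guarantee.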

Methodology

BenchmarkDotNet v0.14, .NET 8.0, Release/optimized toolchain, [<MemoryDiagnoser>], on an i9-13900H. Each optimization exercised through its real code path (or, for the private/algorithmic ones, an exact head-to-head reproduction of the old vs. new body). Numbers reported correspond to the code in this branch.

Let's retry performance optimizations

8f986dc

Thorium wants to merge 1 commit into ionide:main from Thorium:perf-opt-2
