Skip to content

perf: eliminate AsciiString ref-count overhead in rendering, logic, and UI hot paths#2663

Open
githubawn wants to merge 4 commits intoTheSuperHackers:mainfrom
githubawn:perf/const-asciiString
Open

perf: eliminate AsciiString ref-count overhead in rendering, logic, and UI hot paths#2663
githubawn wants to merge 4 commits intoTheSuperHackers:mainfrom
githubawn:perf/const-asciiString

Conversation

@githubawn
Copy link
Copy Markdown

@githubawn githubawn commented Apr 29, 2026

This refactor eliminates global lock contention in the AsciiString system by replacing pass-by-value with const AsciiString& references. Boosting performance from 14 to 87 FPS (1% low) on AOD Cobalt Rush. Inspired by #1942.

tracy

AsciiString is bottlenecked by heavy lock contention in RtlEnterCriticalSection, specifically during releaseBuffer and object construction. This forces getClosestWaypointOnPath to stall as it repeatedly creates and destroys temporary string labels, turning a spatial query into a series of expensive global synchronization events.

Optimization Targets

Our optimizations target the high-frequency "Hot Paths" where string operations directly impact game performance. This includes Terrain Rendering (getTerrainNameAt, getName, getTexture), where strings are accessed per tile every frame; Unit Templates and UI (getUpgradeCameoName, getNthName, getNthTag), which are evaluated constantly for HUD refreshes; and Map Logic (getSourceFilename, Waypoint, and Trigger getters), which are processed every simulation tick. We also standardized Infrastructure calls (FileStream::open, TerrainVisual::load) to ensure data flows through the engine without redundant copies.

Rationale for Limited Scope

We intentionally limited this scope to ensure internal consistency and completeness. As the maintainer (xezon) noted, changing a function to return a reference is "incomplete" if the call sites still copy the result into a value-type variable. By focusing on these specific subsystems, we were able to verify and update every single call site to use const&, achieving 100% efficiency for these paths.

This change was generated with AI assistance. All generated code has been reviewed, tested, and verified for correctness.
Process and Methodology: The work followed a data-driven diagnostic path:
We used the Tracy Profiler to pinpoint high contention on TheAsciiStringCriticalSection during unit-heavy simulation steps.
Based on the profile, we systematically replaced high-frequency AsciiString value-passing with const AsciiString& references across TerrainLogic, Waypoint, and PolygonTrigger classes.
We performed a secondary audit to eliminate hidden copies in the file-system abstraction and internal search loops, ensuring a completely zero-copy string path for these logical systems.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR replaces pass-by-value AsciiString parameters and return types with const AsciiString& references across TerrainLogic, Waypoint, PolygonTrigger, WorldHeightMap, and related classes in both the Generals/ and GeneralsMD/ trees, eliminating redundant ref-count increments and lock contention on TheAsciiStringCriticalSection. The previously-flagged FileInputStream::open signature mismatches in both DataChunk.cpp files have been correctly resolved in this revision.

Confidence Score: 5/5

PR is safe to merge — all const-ref changes are correct, no dangling references, and previously flagged linker errors are resolved.

Every returned reference points to a member field or the static AsciiString::TheEmptyString; backing storage is either a fixed-size C array member (m_textureClasses, m_upgradeCameoUpgradeNames) or a stable instance member (m_filenameString, m_name, m_pathLabel*). No local-variable references are returned. The Waypoint constructor still copies its reference arguments into owned members via the initializer list. The DataChunk.cpp signature mismatches flagged in previous reviews are resolved. No P0 or P1 issues found in this diff.

No files require special attention.

Important Files Changed

Filename Overview
Core/GameEngineDevice/Source/W3DDevice/GameClient/WorldHeightMap.cpp getTerrainNameAt now returns const AsciiString& — safe because it returns either AsciiString::TheEmptyString (static) or m_textureClasses[i].name which is a field of a fixed-size C array member, so no dangling-reference risk.
Generals/Code/GameEngine/Source/GameLogic/Map/TerrainLogic.cpp All parameter and local-variable changes to const AsciiString& are correct; Waypoint constructor still copies into members via the initializer list, and all returned references are to member fields or TheEmptyString (static).
GeneralsMD/Code/GameEngine/Source/GameLogic/Map/TerrainLogic.cpp Mirror of Generals changes — same analysis applies; no issues found.
Generals/Code/GameEngine/Source/Common/System/DataChunk.cpp Both CachedFileInputStream::open and FileInputStream::open implementations now match the updated header signatures — previous linker-error issue is resolved.
GeneralsMD/Code/GameEngine/Source/Common/System/DataChunk.cpp Same fix as Generals counterpart — both open() implementations correctly updated to const AsciiString&.
Generals/Code/GameEngine/Include/Common/ThingTemplate.h getNthName, getNthTag, and getUpgradeCameoName return const AsciiString& safely — the backing storage is either a fixed-size C array member or AsciiString::TheEmptyString.
Core/GameEngineDevice/Source/W3DDevice/GameClient/Water/W3DWaterTracks.cpp const AsciiString& fileName bound to TheTerrainLogic->getSourceFilename() (returns ref to m_filenameString member); used immediately before any mutation — safe.
Generals/Code/GameEngine/Include/GameLogic/TerrainLogic.h All virtual function signatures updated consistently; Waypoint getter return types changed to const AsciiString& matching member fields.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Caller passes AsciiString\n(previously by value — copy + lock)"] -->|"Before PR"| B["AsciiString copy ctor\nAcquires TheAsciiStringCriticalSection\nIncrements refcount\nReleases lock"]
    B --> C["Function body executes"]
    C --> D["AsciiString destructor\nAcquires lock again\nDecrements refcount\nReleases lock"]

    A2["Caller passes const AsciiString&\n(after PR — zero copy)"] -->|"After PR"| C2["Function body executes\ndirectly on original string"]
    C2 --> D2["No destructor / no lock\nacquired for temporary"]

    style B fill:#f88,stroke:#c00
    style D fill:#f88,stroke:#c00
    style C2 fill:#8f8,stroke:#080
    style D2 fill:#8f8,stroke:#080
Loading

Reviews (7): Last reviewed commit: "const consistency" | Re-trigger Greptile

@xezon
Copy link
Copy Markdown

xezon commented Apr 29, 2026

The lock in AsciiString needs to be replaced with an atomic write.

@xezon
Copy link
Copy Markdown

xezon commented Apr 29, 2026

Which threads are contending?

Copy link
Copy Markdown

@xezon xezon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull needs more info where exactly the contention was observed with evidence.

Comment thread GeneralsMD/Code/GameEngine/Include/GameLogic/TerrainLogic.h
@xezon xezon added Minor Severity: Minor < Major < Critical < Blocker Performance Is a performance concern Gen Relates to Generals ZH Relates to Zero Hour labels Apr 29, 2026
Comment thread Core/GameEngine/Source/Common/System/AsciiString.cpp
Comment thread Core/GameEngine/Source/Common/System/AsciiString.cpp Outdated
@Caball009
Copy link
Copy Markdown

Caball009 commented Apr 29, 2026

Can the replacement of the mutex / lock with atomic be done in a separate PR? That's much cleaner.

@githubawn
Copy link
Copy Markdown
Author

githubawn commented Apr 29, 2026

Can the replacement of the mutex / lock with atomic be done in a separate PR? That's much cleaner.

I'm okay with this.

@githubawn githubawn force-pushed the perf/const-asciiString branch from b8f53bd to 38d6e0d Compare April 29, 2026 21:23
@Mauller
Copy link
Copy Markdown

Mauller commented Apr 29, 2026

So when i had a quick look at cobalt rush i eliminated a lot of the allocations by simply making the "getPathLabel" functions in TerrainLogic return a constant reference to an ascii string.

The main problem after that is the compareNoCase function.

/// Get the waypoint's first path label
const AsciiString& getPathLabel1() const { return m_pathLabel1;  }
/// Get the waypoint's second path label
const AsciiString& getPathLabel2() const { return m_pathLabel2;  }
/// Get the waypoint's third path label
const AsciiString& getPathLabel3() const { return m_pathLabel3;  }

Before the change
image
image

After the change
image
image

@githubawn githubawn changed the title perf(logic): Eliminate AsciiString ref-count overhead in TerrainLogic and PolygonTriggers perf: eliminate AsciiString ref-count overhead in rendering, logic, and UI hot paths Apr 29, 2026
@Mauller
Copy link
Copy Markdown

Mauller commented Apr 30, 2026

I do not think a lot of the changes are necessary as they are not in performance critical paths. Especially the filename load / save path name handling.

Making them const references can be detrimental to path handling when we want to manipulate the path in the asciistring.

@xezon
Copy link
Copy Markdown

xezon commented Apr 30, 2026

Which thread does main thread content with?

#endif

AsciiString getNthName(size_t i) const
const AsciiString& getNthName(size_t i) const
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still copies with this.

public:
/// return the bridge template name
AsciiString getBridgeTemplateName() { return m_templateName; }
const AsciiString& getBridgeTemplateName() { return m_templateName; }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still copies made of this

@githubawn
Copy link
Copy Markdown
Author

Went back to the drawing board instead of manually deciding what should be part of this refactor, generated a script that goes over the entire codebase. This "programmatic const-ref" approach increases performance by another 10 FPS in our benchmark as a downstream side effect of catching every redundant copy missed by manual analysis.

To ensure safety, the script automatically skips virtual/override methods, variables modified via assignment or member mutators (like .format()), variadic signatures, and complex or debug-only locals, guaranteeing that headers and implementations remain perfectly synchronized and binary-safe. Tested a few matches and didn't notice any regressions.

The script: refactor_ascii_strings.ps1.txt

The results: resultsmain.txt

Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Gen Relates to Generals Minor Severity: Minor < Major < Critical < Blocker Performance Is a performance concern ZH Relates to Zero Hour

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants