perf: eliminate AsciiString ref-count overhead in rendering, logic, and UI hot paths#2663
perf: eliminate AsciiString ref-count overhead in rendering, logic, and UI hot paths#2663githubawn wants to merge 4 commits intoTheSuperHackers:mainfrom
Conversation
|
| Filename | Overview |
|---|---|
| Core/GameEngineDevice/Source/W3DDevice/GameClient/WorldHeightMap.cpp | getTerrainNameAt now returns const AsciiString& — safe because it returns either AsciiString::TheEmptyString (static) or m_textureClasses[i].name which is a field of a fixed-size C array member, so no dangling-reference risk. |
| Generals/Code/GameEngine/Source/GameLogic/Map/TerrainLogic.cpp | All parameter and local-variable changes to const AsciiString& are correct; Waypoint constructor still copies into members via the initializer list, and all returned references are to member fields or TheEmptyString (static). |
| GeneralsMD/Code/GameEngine/Source/GameLogic/Map/TerrainLogic.cpp | Mirror of Generals changes — same analysis applies; no issues found. |
| Generals/Code/GameEngine/Source/Common/System/DataChunk.cpp | Both CachedFileInputStream::open and FileInputStream::open implementations now match the updated header signatures — previous linker-error issue is resolved. |
| GeneralsMD/Code/GameEngine/Source/Common/System/DataChunk.cpp | Same fix as Generals counterpart — both open() implementations correctly updated to const AsciiString&. |
| Generals/Code/GameEngine/Include/Common/ThingTemplate.h | getNthName, getNthTag, and getUpgradeCameoName return const AsciiString& safely — the backing storage is either a fixed-size C array member or AsciiString::TheEmptyString. |
| Core/GameEngineDevice/Source/W3DDevice/GameClient/Water/W3DWaterTracks.cpp | const AsciiString& fileName bound to TheTerrainLogic->getSourceFilename() (returns ref to m_filenameString member); used immediately before any mutation — safe. |
| Generals/Code/GameEngine/Include/GameLogic/TerrainLogic.h | All virtual function signatures updated consistently; Waypoint getter return types changed to const AsciiString& matching member fields. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["Caller passes AsciiString\n(previously by value — copy + lock)"] -->|"Before PR"| B["AsciiString copy ctor\nAcquires TheAsciiStringCriticalSection\nIncrements refcount\nReleases lock"]
B --> C["Function body executes"]
C --> D["AsciiString destructor\nAcquires lock again\nDecrements refcount\nReleases lock"]
A2["Caller passes const AsciiString&\n(after PR — zero copy)"] -->|"After PR"| C2["Function body executes\ndirectly on original string"]
C2 --> D2["No destructor / no lock\nacquired for temporary"]
style B fill:#f88,stroke:#c00
style D fill:#f88,stroke:#c00
style C2 fill:#8f8,stroke:#080
style D2 fill:#8f8,stroke:#080
Reviews (7): Last reviewed commit: "const consistency" | Re-trigger Greptile
|
The lock in AsciiString needs to be replaced with an atomic write. |
|
Which threads are contending? |
xezon
left a comment
There was a problem hiding this comment.
This pull needs more info where exactly the contention was observed with evidence.
|
Can the replacement of the mutex / lock with atomic be done in a separate PR? That's much cleaner. |
I'm okay with this. |
b8f53bd to
38d6e0d
Compare
|
So when i had a quick look at cobalt rush i eliminated a lot of the allocations by simply making the "getPathLabel" functions in TerrainLogic return a constant reference to an ascii string. The main problem after that is the compareNoCase function. /// Get the waypoint's first path label
const AsciiString& getPathLabel1() const { return m_pathLabel1; }
/// Get the waypoint's second path label
const AsciiString& getPathLabel2() const { return m_pathLabel2; }
/// Get the waypoint's third path label
const AsciiString& getPathLabel3() const { return m_pathLabel3; } |
|
I do not think a lot of the changes are necessary as they are not in performance critical paths. Especially the filename load / save path name handling. Making them const references can be detrimental to path handling when we want to manipulate the path in the asciistring. |
|
Which thread does main thread content with? |
| #endif | ||
|
|
||
| AsciiString getNthName(size_t i) const | ||
| const AsciiString& getNthName(size_t i) const |
| public: | ||
| /// return the bridge template name | ||
| AsciiString getBridgeTemplateName() { return m_templateName; } | ||
| const AsciiString& getBridgeTemplateName() { return m_templateName; } |
|
Went back to the drawing board instead of manually deciding what should be part of this refactor, generated a script that goes over the entire codebase. This "programmatic const-ref" approach increases performance by another 10 FPS in our benchmark as a downstream side effect of catching every redundant copy missed by manual analysis. To ensure safety, the script automatically skips virtual/override methods, variables modified via assignment or member mutators (like .format()), variadic signatures, and complex or debug-only locals, guaranteeing that headers and implementations remain perfectly synchronized and binary-safe. Tested a few matches and didn't notice any regressions. The script: refactor_ascii_strings.ps1.txt The results: resultsmain.txt Thoughts? |




This refactor eliminates global lock contention in the AsciiString system by replacing pass-by-value with const AsciiString& references. Boosting performance from 14 to 87 FPS (1% low) on AOD Cobalt Rush. Inspired by #1942.
AsciiString is bottlenecked by heavy lock contention in RtlEnterCriticalSection, specifically during releaseBuffer and object construction. This forces getClosestWaypointOnPath to stall as it repeatedly creates and destroys temporary string labels, turning a spatial query into a series of expensive global synchronization events.
Optimization Targets
Our optimizations target the high-frequency "Hot Paths" where string operations directly impact game performance. This includes Terrain Rendering (getTerrainNameAt, getName, getTexture), where strings are accessed per tile every frame; Unit Templates and UI (getUpgradeCameoName, getNthName, getNthTag), which are evaluated constantly for HUD refreshes; and Map Logic (getSourceFilename, Waypoint, and Trigger getters), which are processed every simulation tick. We also standardized Infrastructure calls (FileStream::open, TerrainVisual::load) to ensure data flows through the engine without redundant copies.
Rationale for Limited Scope
We intentionally limited this scope to ensure internal consistency and completeness. As the maintainer (xezon) noted, changing a function to return a reference is "incomplete" if the call sites still copy the result into a value-type variable. By focusing on these specific subsystems, we were able to verify and update every single call site to use const&, achieving 100% efficiency for these paths.
This change was generated with AI assistance. All generated code has been reviewed, tested, and verified for correctness.
Process and Methodology: The work followed a data-driven diagnostic path:
We used the Tracy Profiler to pinpoint high contention on TheAsciiStringCriticalSection during unit-heavy simulation steps.
Based on the profile, we systematically replaced high-frequency AsciiString value-passing with const AsciiString& references across TerrainLogic, Waypoint, and PolygonTrigger classes.
We performed a secondary audit to eliminate hidden copies in the file-system abstraction and internal search loops, ensuring a completely zero-copy string path for these logical systems.