fix(doc-collector): two SPA-robustness fixes (recoverable nav error, page-state repopulation) by gololdf1sh · Pull Request #34 · testomatio/explorbot

gololdf1sh · 2026-05-20T12:27:50Z

Summary

Two independent fixes that came out of running explorbot docs collect against a real-world SPA (Testomat.io beta). Each is in its own commit.

Note: Initial version of this PR included a third fix to ConfigParser.loadEnv (walk parent dirs to find .env). It regressed action-result-diff.test.ts on CI in ways the diff alone doesn't fully explain. Dropped from this PR — will resubmit separately with a CLI-level bootstrap so library semantics are unchanged.

1. `fix(explorer)`: treat "navigating and changing the content" as recoverable

Playwright throws `page.content: Unable to retrieve content because the page is navigating and changing the content` on heavy SPAs whose client-side router rewrites the DOM mid-action (Ember, React Router, etc.). The existing regex covered net::ERR_ABORTED, screenshot timeouts, and font-wait errors, so this phrase fell through to FATAL_BROWSER_ERRORS and killed the whole crawl on the first race. It is now added to RECOVERABLE_NAVIGATION_ERRORS so the explorer retries instead.
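A minimal sketch of the classification idea, for reviewers who want to see its shape. Only the constant name `RECOVERABLE_NAVIGATION_ERRORS` and the new phrase come from this PR; the exact alternatives for the screenshot and font cases, the helper `isRecoverableNavigationError`, and the retry wrapper `withNavigationRetry` are assumptions made for illustration and may not match the real code.

```typescript
// Hypothetical reconstruction: the real pattern in explorbot may differ.
// Only the last alternative is the phrase added by this PR.
const RECOVERABLE_NAVIGATION_ERRORS =
  /net::ERR_ABORTED|screenshot.*timeout|waiting for fonts|navigating and changing the content/i;

// Hypothetical helper: true when the error message matches a known
// navigation race that is worth retrying.
function isRecoverableNavigationError(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return RECOVERABLE_NAVIGATION_ERRORS.test(message);
}

// Hypothetical retry wrapper: re-runs the action on recoverable errors and
// rethrows anything else immediately so fatal errors still abort the crawl.
async function withNavigationRetry<T>(
  action: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await action();
    } catch (error) {
      if (!isRecoverableNavigationError(error)) throw error;
      lastError = error;
    }
  }
  throw lastError;
}
```

The regex carries no `g` flag, so `test` keeps no state between calls and the helper is safe to call repeatedly.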

2. `fix(doc-collector)`: repopulate page state when framenavigated stripped it

After navigation, the framenavigated handler overwrites the rich ActionResult (html / links / aria) with a stripped WebPageState carrying only { url, title, statusCode }. doc-collector then reads getCurrentState() and gets state.html === undefined, state.links === []. Two consequences:

- Documentarian receives empty html → page docs degrade to a near-empty stub.
- extractNextPaths sees an empty links array → subtree crawl stops at the entry page even when many followable links exist.

Targeted fixes:

- In the main collect loop: if state.html is falsy, force capturePageState before passing to the AI documenter.
- In extractNextPaths: if state.links is empty but state.html is present, fall back to extractLinks(state.html).
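The two fallbacks can be sketched roughly as follows. This is an illustration under assumptions, not the code in the PR: the `WebPageState` shape is inferred from the fields named above, `extractLinks` here is a naive stand-in for the project's own function, and `ensureFullState` and `linksForTraversal` are hypothetical helper names.

```typescript
// Assumed shape, inferred from the fields mentioned in this PR.
interface WebPageState {
  url: string;
  title?: string;
  statusCode?: number;
  html?: string;
  links?: string[];
}

// Stand-in for the project's extractLinks: a naive href scan, for illustration only.
function extractLinks(html: string): string[] {
  const links: string[] = [];
  const hrefPattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    links.push(match[1]);
  }
  return links;
}

// Hypothetical helper for fix 1: if the state was stripped (no html),
// re-capture it before handing it to the documenter.
async function ensureFullState(
  state: WebPageState,
  capturePageState: () => Promise<WebPageState>,
): Promise<WebPageState> {
  return state.html ? state : capturePageState();
}

// Hypothetical helper for fix 2: prefer captured links, otherwise derive
// them from html when it is available.
function linksForTraversal(state: WebPageState): string[] {
  if (state.links && state.links.length > 0) return state.links;
  return state.html ? extractLinks(state.html) : [];
}
```

Passing `capturePageState` in as a callback keeps the sketch self-contained; in the real collector it would presumably be a method on the explorer.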

Repro (combined effect)

Running explorbot docs collect /projects/{slug}/runs/{id} on Testomat.io beta:

| Before | After |
| --- | --- |
| Crash on first action with the "navigating and changing the content" error | Crawl completes |
| When it didn't crash: "Pages documented: 1" | "Pages documented: 2-3" (entry + linked subpages) |

…overable

Playwright throws "page.content: Unable to retrieve content because the page is navigating and changing the content" on heavy SPAs whose client-side router rewrites the DOM mid-action (Ember, React Router, etc.). The explorer was catching only net::ERR_ABORTED / screenshot-timeout / waiting-for-fonts as recoverable; this new phrase fell through to FATAL_BROWSER_ERRORS and killed the whole crawl on the first navigation race.

Add the phrase to RECOVERABLE_NAVIGATION_ERRORS so the explorer re-queues the action instead of aborting.

Repro: collect docs against a Testomat.io page hosted in beta (Ember-based SPA). Without the fix, ~30% of pages fail with the fatal error on the first action. With the fix, those pages complete normally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…d it

After a navigation completes, ExplorBot's framenavigated handler overwrites the full ActionResult (with html/links/aria) with a stripped-down WebPageState that has only { url, title, statusCode }. The doc-collector then reads getCurrentState() and gets a state with state.html === undefined and state.links === [].

Consequences:

- Documentarian receives empty html -> page documentation degrades to a near-empty stub.
- extractNextPaths() sees an empty links array -> the subtree crawl stops at the entry page even when many followable links exist.

Two targeted fixes:

1. In the main collect loop, if state.html is falsy, force a capturePageState (with screenshots if configured). This is cheap compared to the AI documentation step that follows.
2. In extractNextPaths, if state.links is empty but state.html is present, fall back to extractLinks(state.html) so subtree traversal still finds child paths.

Repro: collect against a Testomat.io project page. Before: "Pages documented: 1". After: full subtree (3-7 pages depending on the entry).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gololdf1sh and others added 2 commits May 20, 2026 15:50

gololdf1sh force-pushed the fix/doc-collector-spa-robustness branch from b45fdaa to 5c37bcd Compare May 20, 2026 12:50

gololdf1sh changed the title ~~fix(doc-collector): three SPA-robustness fixes (env lookup, recoverable nav error, page-state repopulation)~~ fix(doc-collector): two SPA-robustness fixes (recoverable nav error, page-state repopulation) May 20, 2026

gololdf1sh force-pushed the fix/doc-collector-spa-robustness branch from 16857af to 5c37bcd Compare May 20, 2026 12:56

gololdf1sh wants to merge 2 commits into testomatio:main from gololdf1sh:fix/doc-collector-spa-robustness
