fix(cursor): intercept fabricated Questions skipped AskQuestion result in headless mode (#784)#801
Merged
tiann merged 3 commits intoJun 5, 2026
Conversation
…ult in headless mode (tiann#784) When cursor-agent runs under `--print --output-format stream-json` (HAPI's current Cursor remote launcher), the CLI returns a synthetic `Questions skipped by the user, continue with the information you already have` response for the `AskQuestion` tool in ~zero seconds with no error flag, because there is no IDE surface to render the question. The underlying model can interpret this as legitimate user consent and act on it. This patch intercepts the synthetic result in `cli/src/cursor/utils/cursorEventConverter.ts` and rewrites the `tool_call`/completed event to a structured `no_input_surface` failure (`status: 'failed'`, which downstream becomes `is_error: true`). Detection has two strategies: 1. String match - any `tool_call`/completed payload whose serialized form contains the synthetic-skip marker is rewritten. This is robust to wherever cursor-agent stuffs the marker inside the `tool_call` object. 2. Timing + name heuristic (defense in depth) - any completion that arrives within 500 ms of its 'started' event with a trivial result, for a tool call named `AskQuestion`, `askQuestion`, `ask_question`, or the converter's `unknown` fallback, is also rewritten. This catches the case where cursor-agent changes the synthetic-string text in a future release. The converter tracks per-call timestamps in a bounded `Map` (`<= 1024` entries, oldest evicted on overflow) and clears entries when the corresponding 'completed' event arrives. A small test-only reset hook isolates state between Vitest cases. This is a transitional safety patch. It auto-deletes when tiann#781's ACP launcher replaces the stream-json launcher and `cursor/ask_question` becomes a proper bidirectional ACP method where fabrication is structurally impossible. Scope is intentionally tiny: only `cli/src/cursor/utils/cursorEventConverter.ts`, its colocated Vitest file, and a section in `docs/guide/cursor.md`. No changes to `cursorRemoteLauncher.ts`, ACP code, web normalizer, or permission UI. Refs: tiann#781 (long-term resolution via ACP migration) Closes: tiann#784
There was a problem hiding this comment.
Findings
- [Major] Synthetic marker check fails unrelated tool results —
containsSyntheticSkipMarker(opts.toolCall)scans the whole completed tool payload before checking the tool type/result. Because this PR also adds the literal marker todocs/guide/cursor.md:53, a normal Cursorread_filecompletion for that file would include the marker inreadToolCall.result.content, hitcli/src/cursor/utils/cursorEventConverter.ts:177, and be emitted asstatus: 'failed'withno_input_surface. That corrupts unrelated tool results.
Suggested fix:function containsSyntheticSkipMarker(value: unknown): boolean { if (typeof value === 'string') return value.includes(SYNTHETIC_SKIP_MARKER); if (Array.isArray(value)) return value.some(containsSyntheticSkipMarker); if (value && typeof value === 'object') { return Object.values(value as Record<string, unknown>).some(containsSyntheticSkipMarker); } return false; } if (ASK_QUESTION_TOOL_NAMES.has(opts.name) && containsSyntheticSkipMarker(opts.result)) { return true; }
Summary
- Review mode: initial
- Found one correctness regression in the intercept scope. Residual risk: the synthetic-string path needs a test proving non-AskQuestion tool outputs containing the marker are passed through.
Testing
- Not run (automation)
HAPI Bot
…ann#801 review) Address regression flagged by the HAPI auto-review bot on tiann#801: `containsSyntheticSkipMarker` previously stringified the entire `tool_call` payload and matched the literal marker substring. Because this PR also adds that exact marker to `docs/guide/cursor.md` (to document the intercept), a Cursor `read_file` of that documentation page would surface the marker inside `readToolCall.result.content` and be rewritten as a `no_input_surface` failure, corrupting an unrelated, legitimate result. The intercept is now gated on the tool name resolving to an AskQuestion-shaped call (`AskQuestion`, `askQuestion`, `ask_question`, or the converter's `unknown` fallback for unnamed function-shaped tools). `read_file` / `write_file` tool calls - which have explicit `read_file` and `write_file` names from `extractToolName` - no longer fall under the intercept, regardless of what their payload contains. The marker check itself now walks values recursively (string / array / object), guarded by a `WeakSet` against cycles, instead of relying on `JSON.stringify`. Slightly tidier; behaviour is otherwise unchanged for the AskQuestion path. Regression tests added: - `read_file` result whose `content` contains the marker -> passes through with `status: 'completed'` and no `no_input_surface`. - `write_file` whose serialized `args` contain the marker -> same. - A non-AskQuestion function tool (`MyCustomTool`) whose result quotes the marker -> same. All 846 cli tests pass (17 in this file). `bun run typecheck` exits 0.
There was a problem hiding this comment.
Findings
- [Major] Marker match still scans AskQuestion arguments —
shouldRewriteAsNoInputSurfacenow gates by tool name, butcontainsSyntheticSkipMarker(opts.toolCall)still searches the entire function payload. Cursor completed function payloads include the originalargumentsalongside the result, so a legitimate AskQuestion whose prompt quotes the documented synthetic string will be rewritten asno_input_surfaceeven if the user actually answered. Scope marker detection to the extracted result, and make function result extraction returnfunction.result. Evidencecli/src/cursor/utils/cursorEventConverter.ts:207.
Suggested fix:function extractToolResult(toolCall: Record<string, unknown>): unknown { if (toolCall.readToolCall && typeof toolCall.readToolCall === 'object') { const r = toolCall.readToolCall as Record<string, unknown>; return r.result ?? r; } if (toolCall.writeToolCall && typeof toolCall.writeToolCall === 'object') { const w = toolCall.writeToolCall as Record<string, unknown>; return w.result ?? w; } if (toolCall.function && typeof toolCall.function === 'object') { const fn = toolCall.function as Record<string, unknown>; return fn.result ?? {}; } return {}; } if (containsSyntheticSkipMarker(opts.result)) { return true; }
Summary
- Review mode: follow-up after new commits
- The prior read/write/non-AskQuestion false positive is covered by the new gating and regression tests, but the marker path still false-positives when the AskQuestion prompt itself quotes the marker. Residual risk: focused tests were not executed in this environment.
Testing
- Not run:
bun test cli/src/cursor/utils/cursorEventConverter.test.tsfailed becausebunis not installed.
HAPI Bot
…PR tiann#801 review-2) Address second Major finding from the HAPI auto-review bot on tiann#801: After the previous fix gated the intercept on the tool name, the marker check still recursed into the entire `tool_call` object - which includes `function.arguments`, the agent's own prompt text. A legitimate AskQuestion whose prompt quotes the synthetic-skip marker (e.g. an agent debugging this exact bug, or any prompt that pastes the marker verbatim) would have been rewritten as `no_input_surface` even when the operator actually answered. Changes: 1. `extractToolResult` now extracts the cursor-side response from function-shaped tool calls. Previously it returned `{}` for anything that wasn't `readToolCall` or `writeToolCall`. It now returns `function.result` when present, otherwise every field of `function` except `name` and `arguments`. This excludes the agent's input from what downstream sees as the tool result, and as a side effect surfaces the actual cursor response for function-shaped tools (which was previously lost - see the tiann#784 incident note about HAPI storing `output: {}` for AskQuestion in the message DB). 2. `shouldRewriteAsNoInputSurface` now searches only the extracted `result`, not the whole `tool_call`. The bot's exact recommendation. 3. Test added: an AskQuestion whose `arguments` quote the marker but whose `result` is a real user answer, with elapsed time past the 500 ms threshold so the timing heuristic does not apply. Asserts the tool_result passes through with `status: 'completed'` and the operator's actual answer. All 847 cli tests pass (18 in `cursorEventConverter.test.ts`). `bun run typecheck` exits 0. The widened `extractToolResult` scope is necessary for the marker check to actually find the synthetic string (it lives inside `function.result` or a sibling field), and is the bot's explicit recommendation. It also removes the long-standing data-loss bug where AskQuestion responses were surfaced to the message DB as opaque `{}` - regardless of fabrication.
There was a problem hiding this comment.
Findings
None.
Summary
- Review mode: follow-up after new commits
- No high-confidence issues found in the latest diff. The follow-up scopes the synthetic-marker check to the extracted Cursor function result, avoiding the prior
function.argumentsfalse positive. Residual risk: this environment cannot execute Bun-based validation.
Testing
- Not run:
bun test cli/src/cursor/utils/cursorEventConverter.test.tsandbun typecheckfailed becausebunis not installed in this environment.
HAPI Bot
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
When
cursor-agentruns under--print --output-format stream-json(HAPI's current Cursor remote launcher), the CLI returns a syntheticQuestions skipped by the user, continue with the information you already haveresponse for theAskQuestiontool in ~zero seconds withis_error=false, because there is no IDE surface to render the question. The agent's underlying model interprets this as legitimate user consent and acts on it.This patch intercepts the synthetic result in
cli/src/cursor/utils/cursorEventConverter.tsand rewrites thetool_call/completed event to a structuredno_input_surfacefailure (status: 'failed', whichconvertAgentMessagealready maps tois_error: trueon the wire). Agents now see an explicit error and can fall back to plain-text prompting instead of acting on fabricated consent.Closes #784.
Why this framing - transitional, auto-deletes with #781
This is a transitional safety patch, not an architectural commitment. #781 (ACP migration, owner @swear01) replaces the stream-json launcher with an ACP launcher in which
cursor/ask_questionis a proper bidirectional protocol method - fabrication is structurally impossible in that mode. That migration is multi-PR and weeks of work.This PR's scope is deliberately tiny so the migration doesn't fight it:
cli/src/cursor/utils/cursorEventConverter.ts(plus its colocated test and one section indocs/guide/cursor.md).cursorRemoteLauncher.ts, ACP code, ACP types, the web normalizer, or the permission UI - all Migrate Cursor integration fromagent -pstream-json mode to Cursor ACP mode #781's surface area.git rmof the lines.Refs #781 for the long-term resolution.
What changes
cli/src/cursor/utils/cursorEventConverter.ts:tool_call/completed event, the converter serializes the rawtool_callpayload and checks for the literalQuestions skipped by the user, continue with the information you already havemarker. If present, the event is rewritten:status: 'failed'(downstream becomesis_error: true)output: { kind: 'no_input_surface', message: 'cursor-agent fabricated a skip response in headless mode. The operator did not respond. Re-prompt in plain text and wait for a real user message before proceeding.' }Map<callId, startedAt>records when each tool call'sstartedevent arrives. If thecompletedevent arrives within 500 ms with a trivial result for a tool namedAskQuestion,askQuestion,ask_question, or the converter'sunknownfallback, the same rewrite is applied. This catches the case where cursor-agent changes the synthetic-string text in a future release.The Map is bounded at 1024 entries (oldest evicted on overflow) so it cannot leak memory if
completedevents are ever missed. A small test-only__resetCursorEventConverterStateForTestshook isolates state between Vitest cases.The
kind: 'no_input_surface'string is intentionally stable so downstream agents and docs can reference it as a contract.Tests
cli/src/cursor/utils/cursorEventConverter.test.tsextended with 8 new cases (plus 1 added smoke test for the existing read_file passthrough):tool_call/completed result containing the synthetic-skip string to ano_input_surfacefailure.tool_callpayload.AskQuestioncompletion with a trivial result, even without the synthetic string.name=unknown.read_file/write_fileresults even with empty payloads.AskQuestioncompletion that took longer than the synthetic threshold (550 ms async wait).Local results:
bun run typecheckexits 0,bun run testincli/passes 843/843 tests (14 in this file, all green).Docs
docs/guide/cursor.mdgains a "Headless safety: AskQuestion behavior" section explaining the synthetic response, the HAPI intercept, the defense-in-depth latency check, theno_input_surfacecontract, and the expected lifecycle (auto-removal when #781 lands).Adjacent patterns flagged for future work, not in this PR
Per #784's stopgap framing comment, similar fabrication may exist for other Cursor tools that have no headless surface (
cursor/create_plan,cursor/update_todos, etc.). This PR is intentionally laser-focused on the documentedAskQuestionbug. Future patches can cite this one as the pattern if they materialize - or, more likely, #781 lands first and the whole question becomes moot.Constraints honored
upstream/mainonly (no fork-internal commits).cli/src/cursor/utils/cursorEventConverter.ts, its test, anddocs/guide/cursor.md.cursorRemoteLauncher.ts, ACP code, web UI, or permission types.Test plan for reviewers
bun install && bun run typecheckfrom repo rootcd cli && bun run test(orbun run testfrom root)AskQuestion. Confirm thetool_resultHAPI emits to the web/SSE layer carriesis_error: truewith theno_input_surfacepayload instead of the silent synthetic skip.