fix(cursor): intercept fabricated Questions skipped AskQuestion result in headless mode (#784) by heavygee · Pull Request #801 · tiann/hapi

heavygee · 2026-06-04T16:13:47Z

Summary

When cursor-agent runs under --print --output-format stream-json (HAPI's current Cursor remote launcher), the CLI returns a synthetic Questions skipped by the user, continue with the information you already have response for the AskQuestion tool in ~zero seconds with is_error=false, because there is no IDE surface to render the question. The agent's underlying model interprets this as legitimate user consent and acts on it.

This patch intercepts the synthetic result in cli/src/cursor/utils/cursorEventConverter.ts and rewrites the tool_call/completed event to a structured no_input_surface failure (status: 'failed', which convertAgentMessage already maps to is_error: true on the wire). Agents now see an explicit error and can fall back to plain-text prompting instead of acting on fabricated consent.

Closes #784.

Why this framing - transitional, auto-deletes with #781

This is a transitional safety patch, not an architectural commitment. #781 (ACP migration, owner @swear01) replaces the stream-json launcher with an ACP launcher in which cursor/ask_question is a proper bidirectional protocol method - fabrication is structurally impossible in that mode. That migration is multi-PR and weeks of work.

This PR's scope is deliberately tiny so the migration doesn't fight it:

Touches only cli/src/cursor/utils/cursorEventConverter.ts (plus its colocated test and one section in docs/guide/cursor.md).
Does not touch cursorRemoteLauncher.ts, ACP code, ACP types, the web normalizer, or the permission UI - all Migrate Cursor integration from agent -p stream-json mode to Cursor ACP mode #781's surface area.
Trivially removable: when the ACP launcher replaces the stream-json launcher, the intercept (and its test) goes away with a one-PR git rm of the lines.

Refs #781 for the long-term resolution.

What changes

cli/src/cursor/utils/cursorEventConverter.ts:

String-match intercept. Before forwarding a tool_call/completed event, the converter serializes the raw tool_call payload and checks for the literal Questions skipped by the user, continue with the information you already have marker. If present, the event is rewritten:
- status: 'failed' (downstream becomes is_error: true)
- output: { kind: 'no_input_surface', message: 'cursor-agent fabricated a skip response in headless mode. The operator did not respond. Re-prompt in plain text and wait for a real user message before proceeding.' }
Timing-signature defense-in-depth. A bounded Map<callId, startedAt> records when each tool call's started event arrives. If the completed event arrives within 500 ms with a trivial result for a tool named AskQuestion, askQuestion, ask_question, or the converter's unknown fallback, the same rewrite is applied. This catches the case where cursor-agent changes the synthetic-string text in a future release.

The Map is bounded at 1024 entries (oldest evicted on overflow) so it cannot leak memory if completed events are ever missed. A small test-only __resetCursorEventConverterStateForTests hook isolates state between Vitest cases.

The kind: 'no_input_surface' string is intentionally stable so downstream agents and docs can reference it as a contract.

Tests

cli/src/cursor/utils/cursorEventConverter.test.ts extended with 8 new cases (plus 1 added smoke test for the existing read_file passthrough):

Rewrites a tool_call/completed result containing the synthetic-skip string to a no_input_surface failure.
Matches the synthetic-skip string even when nested deep inside the tool_call payload.
Rewrites a sub-500 ms AskQuestion completion with a trivial result, even without the synthetic string.
Rewrites a sub-500 ms completion when the converter falls back to name=unknown.
Does not rewrite a normal function-tool completion with a real result.
Does not rewrite read_file / write_file results even with empty payloads.
Does not rewrite an AskQuestion completion that took longer than the synthetic threshold (550 ms async wait).
Passes through a normal read_file tool result unchanged.

Local results: bun run typecheck exits 0, bun run test in cli/ passes 843/843 tests (14 in this file, all green).

Docs

docs/guide/cursor.md gains a "Headless safety: AskQuestion behavior" section explaining the synthetic response, the HAPI intercept, the defense-in-depth latency check, the no_input_surface contract, and the expected lifecycle (auto-removal when #781 lands).

Adjacent patterns flagged for future work, not in this PR

Per #784's stopgap framing comment, similar fabrication may exist for other Cursor tools that have no headless surface (cursor/create_plan, cursor/update_todos, etc.). This PR is intentionally laser-focused on the documented AskQuestion bug. Future patches can cite this one as the pattern if they materialize - or, more likely, #781 lands first and the whole question becomes moot.

Constraints honored

Branch is based on upstream/main only (no fork-internal commits).
No changes outside cli/src/cursor/utils/cursorEventConverter.ts, its test, and docs/guide/cursor.md.
No changes to cursorRemoteLauncher.ts, ACP code, web UI, or permission types.
Neutral, factual commit/PR voice - no fork-private framing.

Test plan for reviewers

bun install && bun run typecheck from repo root
cd cli && bun run test (or bun run test from root)
Optional manual: spawn a Cursor remote session, prompt the agent to call AskQuestion. Confirm the tool_result HAPI emits to the web/SSE layer carries is_error: true with the no_input_surface payload instead of the silent synthetic skip.

…ult in headless mode (tiann#784) When cursor-agent runs under `--print --output-format stream-json` (HAPI's current Cursor remote launcher), the CLI returns a synthetic `Questions skipped by the user, continue with the information you already have` response for the `AskQuestion` tool in ~zero seconds with no error flag, because there is no IDE surface to render the question. The underlying model can interpret this as legitimate user consent and act on it. This patch intercepts the synthetic result in `cli/src/cursor/utils/cursorEventConverter.ts` and rewrites the `tool_call`/completed event to a structured `no_input_surface` failure (`status: 'failed'`, which downstream becomes `is_error: true`). Detection has two strategies: 1. String match - any `tool_call`/completed payload whose serialized form contains the synthetic-skip marker is rewritten. This is robust to wherever cursor-agent stuffs the marker inside the `tool_call` object. 2. Timing + name heuristic (defense in depth) - any completion that arrives within 500 ms of its 'started' event with a trivial result, for a tool call named `AskQuestion`, `askQuestion`, `ask_question`, or the converter's `unknown` fallback, is also rewritten. This catches the case where cursor-agent changes the synthetic-string text in a future release. The converter tracks per-call timestamps in a bounded `Map` (`<= 1024` entries, oldest evicted on overflow) and clears entries when the corresponding 'completed' event arrives. A small test-only reset hook isolates state between Vitest cases. This is a transitional safety patch. It auto-deletes when tiann#781's ACP launcher replaces the stream-json launcher and `cursor/ask_question` becomes a proper bidirectional ACP method where fabrication is structurally impossible. Scope is intentionally tiny: only `cli/src/cursor/utils/cursorEventConverter.ts`, its colocated Vitest file, and a section in `docs/guide/cursor.md`. No changes to `cursorRemoteLauncher.ts`, ACP code, web normalizer, or permission UI. Refs: tiann#781 (long-term resolution via ACP migration) Closes: tiann#784

github-actions

Findings

[Major] Synthetic marker check fails unrelated tool results — containsSyntheticSkipMarker(opts.toolCall) scans the whole completed tool payload before checking the tool type/result. Because this PR also adds the literal marker to docs/guide/cursor.md:53, a normal Cursor read_file completion for that file would include the marker in readToolCall.result.content, hit cli/src/cursor/utils/cursorEventConverter.ts:177, and be emitted as status: 'failed' with no_input_surface. That corrupts unrelated tool results.
Suggested fix:

function containsSyntheticSkipMarker(value: unknown): boolean {
    if (typeof value === 'string') return value.includes(SYNTHETIC_SKIP_MARKER);
    if (Array.isArray(value)) return value.some(containsSyntheticSkipMarker);
    if (value && typeof value === 'object') {
        return Object.values(value as Record<string, unknown>).some(containsSyntheticSkipMarker);
    }
    return false;
}

if (ASK_QUESTION_TOOL_NAMES.has(opts.name) && containsSyntheticSkipMarker(opts.result)) {
    return true;
}

Summary

Review mode: initial
Found one correctness regression in the intercept scope. Residual risk: the synthetic-string path needs a test proving non-AskQuestion tool outputs containing the marker are passed through.

Testing

Not run (automation)

HAPI Bot

…ann#801 review) Address regression flagged by the HAPI auto-review bot on tiann#801: `containsSyntheticSkipMarker` previously stringified the entire `tool_call` payload and matched the literal marker substring. Because this PR also adds that exact marker to `docs/guide/cursor.md` (to document the intercept), a Cursor `read_file` of that documentation page would surface the marker inside `readToolCall.result.content` and be rewritten as a `no_input_surface` failure, corrupting an unrelated, legitimate result. The intercept is now gated on the tool name resolving to an AskQuestion-shaped call (`AskQuestion`, `askQuestion`, `ask_question`, or the converter's `unknown` fallback for unnamed function-shaped tools). `read_file` / `write_file` tool calls - which have explicit `read_file` and `write_file` names from `extractToolName` - no longer fall under the intercept, regardless of what their payload contains. The marker check itself now walks values recursively (string / array / object), guarded by a `WeakSet` against cycles, instead of relying on `JSON.stringify`. Slightly tidier; behaviour is otherwise unchanged for the AskQuestion path. Regression tests added: - `read_file` result whose `content` contains the marker -> passes through with `status: 'completed'` and no `no_input_surface`. - `write_file` whose serialized `args` contain the marker -> same. - A non-AskQuestion function tool (`MyCustomTool`) whose result quotes the marker -> same. All 846 cli tests pass (17 in this file). `bun run typecheck` exits 0.

github-actions

Findings

[Major] Marker match still scans AskQuestion arguments — shouldRewriteAsNoInputSurface now gates by tool name, but containsSyntheticSkipMarker(opts.toolCall) still searches the entire function payload. Cursor completed function payloads include the original arguments alongside the result, so a legitimate AskQuestion whose prompt quotes the documented synthetic string will be rewritten as no_input_surface even if the user actually answered. Scope marker detection to the extracted result, and make function result extraction return function.result. Evidence cli/src/cursor/utils/cursorEventConverter.ts:207.
Suggested fix:

function extractToolResult(toolCall: Record<string, unknown>): unknown {
    if (toolCall.readToolCall && typeof toolCall.readToolCall === 'object') {
        const r = toolCall.readToolCall as Record<string, unknown>;
        return r.result ?? r;
    }
    if (toolCall.writeToolCall && typeof toolCall.writeToolCall === 'object') {
        const w = toolCall.writeToolCall as Record<string, unknown>;
        return w.result ?? w;
    }
    if (toolCall.function && typeof toolCall.function === 'object') {
        const fn = toolCall.function as Record<string, unknown>;
        return fn.result ?? {};
    }
    return {};
}

if (containsSyntheticSkipMarker(opts.result)) {
    return true;
}

Summary

Review mode: follow-up after new commits
The prior read/write/non-AskQuestion false positive is covered by the new gating and regression tests, but the marker path still false-positives when the AskQuestion prompt itself quotes the marker. Residual risk: focused tests were not executed in this environment.

Testing

Not run: bun test cli/src/cursor/utils/cursorEventConverter.test.ts failed because bun is not installed.

HAPI Bot

…PR tiann#801 review-2) Address second Major finding from the HAPI auto-review bot on tiann#801: After the previous fix gated the intercept on the tool name, the marker check still recursed into the entire `tool_call` object - which includes `function.arguments`, the agent's own prompt text. A legitimate AskQuestion whose prompt quotes the synthetic-skip marker (e.g. an agent debugging this exact bug, or any prompt that pastes the marker verbatim) would have been rewritten as `no_input_surface` even when the operator actually answered. Changes: 1. `extractToolResult` now extracts the cursor-side response from function-shaped tool calls. Previously it returned `{}` for anything that wasn't `readToolCall` or `writeToolCall`. It now returns `function.result` when present, otherwise every field of `function` except `name` and `arguments`. This excludes the agent's input from what downstream sees as the tool result, and as a side effect surfaces the actual cursor response for function-shaped tools (which was previously lost - see the tiann#784 incident note about HAPI storing `output: {}` for AskQuestion in the message DB). 2. `shouldRewriteAsNoInputSurface` now searches only the extracted `result`, not the whole `tool_call`. The bot's exact recommendation. 3. Test added: an AskQuestion whose `arguments` quote the marker but whose `result` is a real user answer, with elapsed time past the 500 ms threshold so the timing heuristic does not apply. Asserts the tool_result passes through with `status: 'completed'` and the operator's actual answer. All 847 cli tests pass (18 in `cursorEventConverter.test.ts`). `bun run typecheck` exits 0. The widened `extractToolResult` scope is necessary for the marker check to actually find the synthetic string (it lives inside `function.result` or a sibling field), and is the bot's explicit recommendation. It also removes the long-standing data-loss bug where AskQuestion responses were surfaced to the message DB as opaque `{}` - regardless of fabrication.

github-actions

Findings
None.

Summary

Review mode: follow-up after new commits
No high-confidence issues found in the latest diff. The follow-up scopes the synthetic-marker check to the extracted Cursor function result, avoiding the prior function.arguments false positive. Residual risk: this environment cannot execute Bun-based validation.

Testing

Not run: bun test cli/src/cursor/utils/cursorEventConverter.test.ts and bun typecheck failed because bun is not installed in this environment.

HAPI Bot

github-actions Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread cli/src/cursor/utils/cursorEventConverter.ts Outdated

github-actions Bot reviewed Jun 4, 2026

View reviewed changes

Comment thread cli/src/cursor/utils/cursorEventConverter.ts Outdated

github-actions Bot reviewed Jun 4, 2026

View reviewed changes

heavygee mentioned this pull request Jun 4, 2026

Migrate Cursor integration from agent -p stream-json mode to Cursor ACP mode #781

Open

tiann merged commit dc0d21e into tiann:main Jun 5, 2026
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(cursor): intercept fabricated Questions skipped AskQuestion result in headless mode (#784)#801

fix(cursor): intercept fabricated Questions skipped AskQuestion result in headless mode (#784)#801
tiann merged 3 commits into
tiann:mainfrom
heavygee:fix/cursor-headless-askquestion-safety-784

heavygee commented Jun 4, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

heavygee commented Jun 4, 2026

Summary

Why this framing - transitional, auto-deletes with #781

What changes

Tests

Docs

Adjacent patterns flagged for future work, not in this PR

Constraints honored

Test plan for reviewers

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants