Skip to content

fix(cursor): intercept fabricated Questions skipped AskQuestion result in headless mode (#784)#801

Merged
tiann merged 3 commits into
tiann:mainfrom
heavygee:fix/cursor-headless-askquestion-safety-784
Jun 5, 2026
Merged

fix(cursor): intercept fabricated Questions skipped AskQuestion result in headless mode (#784)#801
tiann merged 3 commits into
tiann:mainfrom
heavygee:fix/cursor-headless-askquestion-safety-784

Conversation

@heavygee
Copy link
Copy Markdown
Collaborator

@heavygee heavygee commented Jun 4, 2026

Summary

When cursor-agent runs under --print --output-format stream-json (HAPI's current Cursor remote launcher), the CLI returns a synthetic Questions skipped by the user, continue with the information you already have response for the AskQuestion tool in ~zero seconds with is_error=false, because there is no IDE surface to render the question. The agent's underlying model interprets this as legitimate user consent and acts on it.

This patch intercepts the synthetic result in cli/src/cursor/utils/cursorEventConverter.ts and rewrites the tool_call/completed event to a structured no_input_surface failure (status: 'failed', which convertAgentMessage already maps to is_error: true on the wire). Agents now see an explicit error and can fall back to plain-text prompting instead of acting on fabricated consent.

Closes #784.

Why this framing - transitional, auto-deletes with #781

This is a transitional safety patch, not an architectural commitment. #781 (ACP migration, owner @swear01) replaces the stream-json launcher with an ACP launcher in which cursor/ask_question is a proper bidirectional protocol method - fabrication is structurally impossible in that mode. That migration is multi-PR and weeks of work.

This PR's scope is deliberately tiny so the migration doesn't fight it:

  • Touches only cli/src/cursor/utils/cursorEventConverter.ts (plus its colocated test and one section in docs/guide/cursor.md).
  • Does not touch cursorRemoteLauncher.ts, ACP code, ACP types, the web normalizer, or the permission UI - all Migrate Cursor integration from agent -p stream-json mode to Cursor ACP mode #781's surface area.
  • Trivially removable: when the ACP launcher replaces the stream-json launcher, the intercept (and its test) goes away with a one-PR git rm of the lines.

Refs #781 for the long-term resolution.

What changes

cli/src/cursor/utils/cursorEventConverter.ts:

  1. String-match intercept. Before forwarding a tool_call/completed event, the converter serializes the raw tool_call payload and checks for the literal Questions skipped by the user, continue with the information you already have marker. If present, the event is rewritten:
    • status: 'failed' (downstream becomes is_error: true)
    • output: { kind: 'no_input_surface', message: 'cursor-agent fabricated a skip response in headless mode. The operator did not respond. Re-prompt in plain text and wait for a real user message before proceeding.' }
  2. Timing-signature defense-in-depth. A bounded Map<callId, startedAt> records when each tool call's started event arrives. If the completed event arrives within 500 ms with a trivial result for a tool named AskQuestion, askQuestion, ask_question, or the converter's unknown fallback, the same rewrite is applied. This catches the case where cursor-agent changes the synthetic-string text in a future release.

The Map is bounded at 1024 entries (oldest evicted on overflow) so it cannot leak memory if completed events are ever missed. A small test-only __resetCursorEventConverterStateForTests hook isolates state between Vitest cases.

The kind: 'no_input_surface' string is intentionally stable so downstream agents and docs can reference it as a contract.

Tests

cli/src/cursor/utils/cursorEventConverter.test.ts extended with 8 new cases (plus 1 added smoke test for the existing read_file passthrough):

  • Rewrites a tool_call/completed result containing the synthetic-skip string to a no_input_surface failure.
  • Matches the synthetic-skip string even when nested deep inside the tool_call payload.
  • Rewrites a sub-500 ms AskQuestion completion with a trivial result, even without the synthetic string.
  • Rewrites a sub-500 ms completion when the converter falls back to name=unknown.
  • Does not rewrite a normal function-tool completion with a real result.
  • Does not rewrite read_file / write_file results even with empty payloads.
  • Does not rewrite an AskQuestion completion that took longer than the synthetic threshold (550 ms async wait).
  • Passes through a normal read_file tool result unchanged.

Local results: bun run typecheck exits 0, bun run test in cli/ passes 843/843 tests (14 in this file, all green).

Docs

docs/guide/cursor.md gains a "Headless safety: AskQuestion behavior" section explaining the synthetic response, the HAPI intercept, the defense-in-depth latency check, the no_input_surface contract, and the expected lifecycle (auto-removal when #781 lands).

Adjacent patterns flagged for future work, not in this PR

Per #784's stopgap framing comment, similar fabrication may exist for other Cursor tools that have no headless surface (cursor/create_plan, cursor/update_todos, etc.). This PR is intentionally laser-focused on the documented AskQuestion bug. Future patches can cite this one as the pattern if they materialize - or, more likely, #781 lands first and the whole question becomes moot.

Constraints honored

  • Branch is based on upstream/main only (no fork-internal commits).
  • No changes outside cli/src/cursor/utils/cursorEventConverter.ts, its test, and docs/guide/cursor.md.
  • No changes to cursorRemoteLauncher.ts, ACP code, web UI, or permission types.
  • Neutral, factual commit/PR voice - no fork-private framing.

Test plan for reviewers

  • bun install && bun run typecheck from repo root
  • cd cli && bun run test (or bun run test from root)
  • Optional manual: spawn a Cursor remote session, prompt the agent to call AskQuestion. Confirm the tool_result HAPI emits to the web/SSE layer carries is_error: true with the no_input_surface payload instead of the silent synthetic skip.

…ult in headless mode (tiann#784)

When cursor-agent runs under `--print --output-format stream-json` (HAPI's
current Cursor remote launcher), the CLI returns a synthetic
`Questions skipped by the user, continue with the information you already have`
response for the `AskQuestion` tool in ~zero seconds with no error flag,
because there is no IDE surface to render the question. The underlying
model can interpret this as legitimate user consent and act on it.

This patch intercepts the synthetic result in
`cli/src/cursor/utils/cursorEventConverter.ts` and rewrites the
`tool_call`/completed event to a structured `no_input_surface` failure
(`status: 'failed'`, which downstream becomes `is_error: true`).

Detection has two strategies:

1. String match - any `tool_call`/completed payload whose serialized form
   contains the synthetic-skip marker is rewritten. This is robust to
   wherever cursor-agent stuffs the marker inside the `tool_call` object.
2. Timing + name heuristic (defense in depth) - any completion that arrives
   within 500 ms of its 'started' event with a trivial result, for a tool
   call named `AskQuestion`, `askQuestion`, `ask_question`, or the
   converter's `unknown` fallback, is also rewritten. This catches the case
   where cursor-agent changes the synthetic-string text in a future release.

The converter tracks per-call timestamps in a bounded `Map` (`<= 1024`
entries, oldest evicted on overflow) and clears entries when the
corresponding 'completed' event arrives. A small test-only reset hook
isolates state between Vitest cases.

This is a transitional safety patch. It auto-deletes when tiann#781's ACP
launcher replaces the stream-json launcher and `cursor/ask_question`
becomes a proper bidirectional ACP method where fabrication is
structurally impossible.

Scope is intentionally tiny: only `cli/src/cursor/utils/cursorEventConverter.ts`,
its colocated Vitest file, and a section in `docs/guide/cursor.md`. No
changes to `cursorRemoteLauncher.ts`, ACP code, web normalizer, or
permission UI.

Refs: tiann#781 (long-term resolution via ACP migration)
Closes: tiann#784
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Synthetic marker check fails unrelated tool results — containsSyntheticSkipMarker(opts.toolCall) scans the whole completed tool payload before checking the tool type/result. Because this PR also adds the literal marker to docs/guide/cursor.md:53, a normal Cursor read_file completion for that file would include the marker in readToolCall.result.content, hit cli/src/cursor/utils/cursorEventConverter.ts:177, and be emitted as status: 'failed' with no_input_surface. That corrupts unrelated tool results.
    Suggested fix:
    function containsSyntheticSkipMarker(value: unknown): boolean {
        if (typeof value === 'string') return value.includes(SYNTHETIC_SKIP_MARKER);
        if (Array.isArray(value)) return value.some(containsSyntheticSkipMarker);
        if (value && typeof value === 'object') {
            return Object.values(value as Record<string, unknown>).some(containsSyntheticSkipMarker);
        }
        return false;
    }
    
    if (ASK_QUESTION_TOOL_NAMES.has(opts.name) && containsSyntheticSkipMarker(opts.result)) {
        return true;
    }

Summary

  • Review mode: initial
  • Found one correctness regression in the intercept scope. Residual risk: the synthetic-string path needs a test proving non-AskQuestion tool outputs containing the marker are passed through.

Testing

  • Not run (automation)

HAPI Bot

Comment thread cli/src/cursor/utils/cursorEventConverter.ts Outdated
…ann#801 review)

Address regression flagged by the HAPI auto-review bot on tiann#801:

`containsSyntheticSkipMarker` previously stringified the entire `tool_call`
payload and matched the literal marker substring. Because this PR also adds
that exact marker to `docs/guide/cursor.md` (to document the intercept), a
Cursor `read_file` of that documentation page would surface the marker
inside `readToolCall.result.content` and be rewritten as a
`no_input_surface` failure, corrupting an unrelated, legitimate result.

The intercept is now gated on the tool name resolving to an
AskQuestion-shaped call (`AskQuestion`, `askQuestion`, `ask_question`, or
the converter's `unknown` fallback for unnamed function-shaped tools).
`read_file` / `write_file` tool calls - which have explicit `read_file`
and `write_file` names from `extractToolName` - no longer fall under the
intercept, regardless of what their payload contains.

The marker check itself now walks values recursively (string / array /
object), guarded by a `WeakSet` against cycles, instead of relying on
`JSON.stringify`. Slightly tidier; behaviour is otherwise unchanged for
the AskQuestion path.

Regression tests added:

- `read_file` result whose `content` contains the marker -> passes
  through with `status: 'completed'` and no `no_input_surface`.
- `write_file` whose serialized `args` contain the marker -> same.
- A non-AskQuestion function tool (`MyCustomTool`) whose result quotes
  the marker -> same.

All 846 cli tests pass (17 in this file). `bun run typecheck` exits 0.
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  • [Major] Marker match still scans AskQuestion arguments — shouldRewriteAsNoInputSurface now gates by tool name, but containsSyntheticSkipMarker(opts.toolCall) still searches the entire function payload. Cursor completed function payloads include the original arguments alongside the result, so a legitimate AskQuestion whose prompt quotes the documented synthetic string will be rewritten as no_input_surface even if the user actually answered. Scope marker detection to the extracted result, and make function result extraction return function.result. Evidence cli/src/cursor/utils/cursorEventConverter.ts:207.
    Suggested fix:
    function extractToolResult(toolCall: Record<string, unknown>): unknown {
        if (toolCall.readToolCall && typeof toolCall.readToolCall === 'object') {
            const r = toolCall.readToolCall as Record<string, unknown>;
            return r.result ?? r;
        }
        if (toolCall.writeToolCall && typeof toolCall.writeToolCall === 'object') {
            const w = toolCall.writeToolCall as Record<string, unknown>;
            return w.result ?? w;
        }
        if (toolCall.function && typeof toolCall.function === 'object') {
            const fn = toolCall.function as Record<string, unknown>;
            return fn.result ?? {};
        }
        return {};
    }
    
    if (containsSyntheticSkipMarker(opts.result)) {
        return true;
    }

Summary

  • Review mode: follow-up after new commits
  • The prior read/write/non-AskQuestion false positive is covered by the new gating and regression tests, but the marker path still false-positives when the AskQuestion prompt itself quotes the marker. Residual risk: focused tests were not executed in this environment.

Testing

  • Not run: bun test cli/src/cursor/utils/cursorEventConverter.test.ts failed because bun is not installed.

HAPI Bot

Comment thread cli/src/cursor/utils/cursorEventConverter.ts Outdated
…PR tiann#801 review-2)

Address second Major finding from the HAPI auto-review bot on tiann#801:

After the previous fix gated the intercept on the tool name, the marker
check still recursed into the entire `tool_call` object - which includes
`function.arguments`, the agent's own prompt text. A legitimate
AskQuestion whose prompt quotes the synthetic-skip marker (e.g. an agent
debugging this exact bug, or any prompt that pastes the marker verbatim)
would have been rewritten as `no_input_surface` even when the operator
actually answered.

Changes:

1. `extractToolResult` now extracts the cursor-side response from
   function-shaped tool calls. Previously it returned `{}` for anything
   that wasn't `readToolCall` or `writeToolCall`. It now returns
   `function.result` when present, otherwise every field of `function`
   except `name` and `arguments`. This excludes the agent's input from
   what downstream sees as the tool result, and as a side effect surfaces
   the actual cursor response for function-shaped tools (which was
   previously lost - see the tiann#784 incident note about HAPI storing
   `output: {}` for AskQuestion in the message DB).

2. `shouldRewriteAsNoInputSurface` now searches only the extracted
   `result`, not the whole `tool_call`. The bot's exact recommendation.

3. Test added: an AskQuestion whose `arguments` quote the marker but
   whose `result` is a real user answer, with elapsed time past the
   500 ms threshold so the timing heuristic does not apply. Asserts the
   tool_result passes through with `status: 'completed'` and the
   operator's actual answer.

All 847 cli tests pass (18 in `cursorEventConverter.test.ts`).
`bun run typecheck` exits 0.

The widened `extractToolResult` scope is necessary for the marker check
to actually find the synthetic string (it lives inside `function.result`
or a sibling field), and is the bot's explicit recommendation. It also
removes the long-standing data-loss bug where AskQuestion responses were
surfaced to the message DB as opaque `{}` - regardless of fabrication.
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings
None.

Summary

  • Review mode: follow-up after new commits
  • No high-confidence issues found in the latest diff. The follow-up scopes the synthetic-marker check to the extracted Cursor function result, avoiding the prior function.arguments false positive. Residual risk: this environment cannot execute Bun-based validation.

Testing

  • Not run: bun test cli/src/cursor/utils/cursorEventConverter.test.ts and bun typecheck failed because bun is not installed in this environment.

HAPI Bot

@tiann tiann merged commit dc0d21e into tiann:main Jun 5, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants