Plan context file preloading for agent.create #319

@shiny-code-bot

Description

Intent

Add an explicit way for Every Code parent agents to launch subagents with selected file contents preloaded into the subagent's initial prompt. This is separate from the existing files field, which should remain a list of paths the subagent may inspect.

The motivating case is large-context agent work: a running GPT-5.5 Every Code session should be able to launch a code-gpt-5.4 subagent and intentionally inline 500k+ tokens of curated context without forcing the parent model to emit those tokens in a tool call. In Every Code, GPT-5.4 currently defaults to the expensive 1M-token context path via context_mode = "auto"; setting context_mode = "disabled" keeps the standard context window.

Finish Line

agent.create supports explicit context_files with budgeted initial-prompt preloading, while files remains a list of lightweight path hints.

Current Status

  • GitHub plan created from chat/agent validation on 2026-06-02.
  • Observed behavior: agent.create(files=[...]) passes paths as 'Files to consider', but does not inline file contents into the subagent prompt.
  • Smoke result: a code-gpt-5.4 nested agent launched with a synthetic ~650k-word file in files completed with only ~23k tokens used, confirming the file was not stuffed into the initial prompt.
  • Design direction validated with code-gpt-5.4, Claude, and Antigravity agents: add explicit context_files plus context_budget_tokens rather than overloading files; keep guardrails host-side so parent-agent token cost stays minimal.
  • Important implementation note from agent review: path scoping/canonicalization must be explicit so context_files cannot become arbitrary host file reads. Prefer resolving against the effective child workspace/worktree and rejecting escapes.
  • Next action: implement PR 1 in code-rs/core/src/agent_tool.rs with schema fields, prompt assembly helper, budget validation, path-safety tests, and docs.
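
The path-scoping note above could be implemented roughly as follows. This is a minimal sketch, not the code in agent_tool.rs: `resolve_in_workspace` is a hypothetical helper name, and it assumes both the workspace root and the target file already exist on disk so that `canonicalize` can resolve symlinks and `..` segments before the containment check.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolve `candidate` against `root` and reject any
/// path that ends up outside the root once symlinks and `..` are resolved.
fn resolve_in_workspace(root: &Path, candidate: &str) -> io::Result<PathBuf> {
    let root = root.canonicalize()?;
    // `join` replaces `root` entirely when `candidate` is absolute, so
    // absolute paths go through the same containment check below.
    let resolved = root.join(candidate).canonicalize()?;
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("{} escapes the workspace", resolved.display()),
        ))
    }
}
```

Under this sketch a relative path such as `../secrets.txt` and an absolute path outside the root are both rejected. Whether absolute paths outside the repo should ever be allowed is still listed under Open Questions.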

Proposed API

{
  "files": ["src/"],
  "context_files": ["/tmp/large-context-bundle.txt"],
  "context_budget_tokens": 700000,
  "models": ["code-gpt-5.4"]
}
  • files: paths the subagent should consider or inspect; no contents are inlined.
  • context_files: paths whose contents the Every Code runtime snapshots and inlines into the subagent's initial prompt.
  • context_budget_tokens: explicit launch budget for inlined context. When it is omitted, a conservative default applies, so a large context fails fast unless the caller supplies a budget big enough to cover it.
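
To make the field semantics concrete, here is a minimal sketch of how the parameters and the budget rule might look on the Rust side. The struct name, the method names, and the 100k default are assumptions for illustration only; the actual default is still an open question in this plan, and the real params type in agent_tool.rs may differ.

```rust
/// Assumed default; the plan leaves the real value as an open question.
const DEFAULT_CONTEXT_BUDGET_TOKENS: u64 = 100_000;

/// Hypothetical mirror of the proposed `agent.create` fields.
#[derive(Debug, Default)]
struct AgentCreateParams {
    /// Path hints only; contents are never inlined.
    files: Option<Vec<String>>,
    /// Paths whose contents are snapshotted into the initial prompt.
    context_files: Option<Vec<String>>,
    /// Explicit launch budget for inlined context.
    context_budget_tokens: Option<u64>,
    models: Option<Vec<String>>,
}

impl AgentCreateParams {
    /// The caller's explicit budget, or the conservative default when the
    /// field is omitted.
    fn effective_budget(&self) -> u64 {
        self.context_budget_tokens
            .unwrap_or(DEFAULT_CONTEXT_BUDGET_TOKENS)
    }

    /// Compare an already-computed token estimate against the budget.
    fn check_budget(&self, estimated_tokens: u64) -> Result<(), String> {
        let budget = self.effective_budget();
        if estimated_tokens <= budget {
            Ok(())
        } else {
            Err(format!(
                "estimated {estimated_tokens} tokens of inlined context exceeds \
                 the budget of {budget}; pass a larger context_budget_tokens"
            ))
        }
    }
}
```

With these assumptions, a ~650k-token bundle is rejected when context_budget_tokens is omitted and accepted when the caller passes 700000, which matches the fail-fast behavior described above.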

Implementation Plan

  1. Update code-rs/core/src/agent_tool.rs schema and params.

    • Add context_files: Option<Vec<String>>.
    • Add context_budget_tokens: Option<u64>.
    • Keep files unchanged, but clarify its schema description as path hints rather than content inclusion.
  2. Add a context snapshot builder.

    • Read context_files from the parent workspace at create/launch time.
    • Reject directories for v1.
    • Reject obvious binary files.
    • Estimate tokens locally.
    • Enforce the caller budget.
    • Return a structured prompt block plus compact metadata.
  3. Insert preloaded context into subagent prompt assembly.

    • Include file metadata: path, bytes, estimated tokens, truncated/rejected state.
    • Wrap each file in clear delimiters.
    • Tell the subagent the listed files were preloaded and should not be re-read unless fresh contents are needed.
  4. Add launch summaries and errors.

    • Return compact metadata from agent.create: inlined file count, estimated token count, budget, truncation/rejection status.
    • Report batch fanout cost: estimated_tokens * number_of_agents.
    • Keep detailed manifests in progress/log output rather than dumping large metadata into the parent model context.
  5. Add tests.

    • files still produces only a 'Files to consider' line.
    • context_files inlines file contents and metadata.
    • Oversized context without an explicit budget fails fast.
    • Explicit high budget allows large context.
    • Binary files are rejected.
    • Batch fanout summary is computed.
    • Add an opt-in/manual smoke for 500k+ tokens so CI does not burn model budget.
  6. Update docs and prompt guidance.

    • code-rs/core/prompt_coder.md
    • docs/agents.md
    • docs/config.md
    • Any generated tool/schema docs if applicable.
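
Steps 2 and 3 of the plan above (snapshot builder and prompt assembly) can be sketched as below. Everything here is hypothetical: the type and function names, the four-bytes-per-token estimate, the delimiter strings, and the choice to take file contents as in-memory byte slices rather than reading from disk. It is meant to show the shape of the logic, not the final implementation.

```rust
/// Compact metadata reported for one preloaded file.
#[derive(Debug, PartialEq)]
struct FileMeta {
    path: String,
    bytes: usize,
    estimated_tokens: u64,
}

/// Structured prompt block plus compact metadata.
#[derive(Debug)]
struct ContextSnapshot {
    prompt_block: String,
    files: Vec<FileMeta>,
    total_estimated_tokens: u64,
}

/// Rough local estimate: about four bytes per token, rounded up.
/// This is an assumption, not a real tokenizer.
fn estimate_tokens(bytes: usize) -> u64 {
    ((bytes + 3) / 4) as u64
}

/// Build the preloaded-context block, failing fast on binary input or when
/// the running estimate exceeds `budget_tokens`.
fn build_snapshot(
    inputs: &[(&str, &[u8])],
    budget_tokens: u64,
) -> Result<ContextSnapshot, String> {
    let mut prompt_block = String::from(
        "The files below were preloaded. Do not re-read them unless fresh contents are needed.\n",
    );
    let mut files = Vec::new();
    let mut total: u64 = 0;
    for &(path, data) in inputs {
        // Treat invalid UTF-8 or embedded NUL bytes as "obviously binary".
        let text = match std::str::from_utf8(data) {
            Ok(t) if !t.contains('\0') => t,
            _ => return Err(format!("{path}: rejected, looks binary")),
        };
        let tokens = estimate_tokens(data.len());
        total += tokens;
        if total > budget_tokens {
            return Err(format!(
                "estimated {total} tokens exceeds budget of {budget_tokens}"
            ));
        }
        // Wrap each file in clear delimiters that carry its metadata.
        prompt_block.push_str(&format!(
            "<<<BEGIN FILE {path} ({} bytes, ~{tokens} tokens)>>>\n",
            data.len()
        ));
        prompt_block.push_str(text);
        if !text.ends_with('\n') {
            prompt_block.push('\n');
        }
        prompt_block.push_str(&format!("<<<END FILE {path}>>>\n"));
        files.push(FileMeta {
            path: path.to_string(),
            bytes: data.len(),
            estimated_tokens: tokens,
        });
    }
    Ok(ContextSnapshot { prompt_block, files, total_estimated_tokens: total })
}
```

Keeping the builder free of filesystem access makes the budget and binary-rejection tests from step 5 cheap to write; a thin caller would read each scoped path and pass the bytes in.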

Guardrails

  • Host-side token estimation and budget checks, not parent-model reasoning.
  • Conservative default budget when context_files is present without context_budget_tokens.
  • Explicit high budget required for 500k+ context.
  • No recursive directory expansion in v1.
  • No automatic summarization in v1.
  • No interactive confirmation prompts inside the LLM loop.
  • Preserve current files semantics to avoid surprising cost changes.
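
As an illustration of host-side budget checks and the batch fanout figure from the implementation plan, the sketch below computes a compact launch summary. The struct, its field names, and the assumption that the budget applies per agent (with fanout reported separately as estimated_tokens * number_of_agents) are not confirmed by the plan.

```rust
/// Hypothetical compact summary returned to the parent agent.
#[derive(Debug, PartialEq)]
struct LaunchSummary {
    inlined_files: usize,
    estimated_tokens: u64,
    budget_tokens: u64,
    agents: u64,
    /// estimated_tokens * agents: each agent gets its own copy of the context.
    fanout_tokens: u64,
}

/// Check the per-agent estimate against the budget, then compute fanout cost.
fn summarize_launch(
    per_file_tokens: &[u64],
    budget_tokens: u64,
    agents: u64,
) -> Result<LaunchSummary, String> {
    let estimated_tokens: u64 = per_file_tokens.iter().sum();
    if estimated_tokens > budget_tokens {
        return Err(format!(
            "estimated {estimated_tokens} tokens exceeds budget of {budget_tokens}"
        ));
    }
    let fanout_tokens = estimated_tokens
        .checked_mul(agents)
        .ok_or_else(|| "fanout token estimate overflowed".to_string())?;
    Ok(LaunchSummary {
        inlined_files: per_file_tokens.len(),
        estimated_tokens,
        budget_tokens,
        agents,
        fanout_tokens,
    })
}
```

Because the summary is a handful of integers, it stays cheap to return to the parent model, while the detailed per-file manifest can go to progress or log output.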

Acceptance Criteria

  • A parent Every Code agent can call agent.create with context_files and launch a code-gpt-5.4 subagent whose first turn includes the file contents.
  • A 500k+ token synthetic context can be stuffed into a code-gpt-5.4 subagent when an explicit high budget is provided.
  • The same payload fails fast with a clear message when the explicit budget is absent or too small.
  • Existing files calls remain lightweight and do not inline content.
  • ./build-fast.sh passes cleanly.

Open Questions

  • Should the default for context_budget_tokens be 100k, 150k, or model-relative?
  • Should context_files allow absolute paths outside the repo when the parent session has access?
  • Should launch metadata be exposed through agent.status/agent.result in addition to create output?
  • Should context_files be allowed for write-mode agents whose worktree is created after launch, or should contents always snapshot from the parent checkout?

Labels: plan (durable planning issue), plan:active (current active plan)
