[codex] Fix dynamic handler test env cleanup #5

Closed
coe0718 wants to merge 2 commits into ghostwright:main from coe0718:codex/fix-dynamic-handlers-lint

coe0718 (Contributor) commented Mar 31, 2026

Summary

Fixes the pre-existing Biome lint failure in dynamic-handlers.test.ts without changing the test semantics.

What Changed

  • replaces delete process.env.* cleanup with a small restoreEnvVar() helper
  • uses Reflect.deleteProperty() so env vars are actually removed instead of being set to the string "undefined"
  • keeps the original env value when one existed before the test
  • applies Biome import ordering in the file

Why

Biome flags the delete operator in this test file, but its suggested replacement (process.env.KEY = undefined) is behaviorally unsafe for Node/Bun because env vars become the string "undefined" instead of being removed.

This keeps the lint fix small and behavior-preserving.
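A minimal sketch of what such a helper could look like. The PR's actual restoreEnvVar() is not shown in this thread, so the signature, the Env type, and the demoEnv object below are assumptions for illustration only; the env object is passed in explicitly so the example can run against a plain object.

```typescript
// Hypothetical sketch: the real restoreEnvVar() in the PR may have a different signature.
type Env = Record<string, string | undefined>;

function restoreEnvVar(env: Env, key: string, previous: string | undefined): void {
  if (previous === undefined) {
    // The variable did not exist before the test: remove the key entirely.
    // Reflect.deleteProperty avoids the `delete` operator that Biome flags.
    Reflect.deleteProperty(env, key);
  } else {
    // The variable existed before the test: put the original value back.
    env[key] = previous;
  }
}

// Demo against a plain object standing in for process.env.
const demoEnv: Env = { EXISTING: "original" };

// Snapshot values before the "test" mutates them.
const savedExisting = demoEnv["EXISTING"];
const savedMissing = demoEnv["MISSING"];

demoEnv["EXISTING"] = "test-value";
demoEnv["MISSING"] = "test-value";

// Cleanup: EXISTING goes back to its old value, MISSING is removed.
restoreEnvVar(demoEnv, "EXISTING", savedExisting);
restoreEnvVar(demoEnv, "MISSING", savedMissing);
```

In a test file the same helper would presumably be called with process.env as the first argument from an afterEach hook.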

Validation

  • bash -lc 'export PATH="$HOME/.bun/bin:$PATH" && /home/coemedia/.bun/bin/bun test src/mcp/__tests__/dynamic-handlers.test.ts'
  • bun run lint
  • ./node_modules/.bin/biome check src/mcp/__tests__/dynamic-handlers.test.ts

mcheemaa (Member) commented

Hey @coe0718 - thanks for catching this! Really appreciate you taking the time to put together a fix.

We ended up fixing the same lint issue in v0.18.0 (commit 2bca825, merged via #6), but looking at both approaches more carefully, your Reflect.deleteProperty() pattern is actually more correct than our process.env.X = undefined. Assigning undefined to a process.env key can coerce it to the string "undefined" instead of removing the key, whereas your approach removes it properly.

We're going to close this PR since the changes overlap and would conflict, but we'll adopt the Reflect.deleteProperty pattern in a follow-up. Good catch on the semantics.

Your memory PRs (#2, #3, #4) are excellent work - we're reviewing those now. Thanks again!

mcheemaa closed this Mar 31, 2026

electronicBlacksmith referenced this pull request in electronicBlacksmith/phantom Apr 5, 2026
…y output

Four latent liveness/stability bugs on the MCP dynamic-tool execution path
would silently hang agent turns or crash the container. None surfaced visible
errors, which made them the worst kind of bug: the agent just stopped.

1. Pipe-buffer deadlock: executeShellHandler and executeScriptHandler drained
   stdout then stderr sequentially. Any handler writing >64KB to stderr before
   closing stdout (curl -v, git clone, npm install, verbose loggers) blocked
   on its next stderr write while phantom waited for stdout EOF forever.
   Fix: Promise.all over both streams via a new readStreamWithCap helper.

2. No subprocess timeout: Bun.spawn ran with no kill path. A hung handler
   froze the agent turn indefinitely with no recovery. Fix: drainProcessWithLimits
   schedules SIGTERM at HANDLER_TIMEOUT_MS (default 60s, env-overridable via
   PHANTOM_DYNAMIC_HANDLER_TIMEOUT_MS) and escalates to SIGKILL after a 2s
   grace. Timeouts report partial stderr so the agent has actionable signal.

3. No stdout/stderr size cap: new Response(stream).text() slurped unbounded
   output, risking OOM of the 2GB container. Fix: readStreamWithCap enforces
   a 1MB cap by default (PHANTOM_DYNAMIC_HANDLER_MAX_OUTPUT_BYTES), appends a
   clear truncation notice, and continues draining-to-void so the child never
   blocks on a full pipe buffer.

4. DynamicToolRegistry.registerAllOnServer had no per-tool guard. One tool
   with a bad inputSchema would throw during the loop and silently skip every
   subsequent tool on every agent query (MCP factory pattern recreates servers
   per query). Fix: per-tool try/catch, warn with tool name, continue. Broken
   tools are not auto-unregistered; the operator decides.
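Items 1 and 3 above can be sketched roughly as follows. This is not the code from the referenced commit: the body of readStreamWithCap, the drainBoth wrapper, and the truncation notice wording are assumptions. It only shows the two ideas described: read both pipes concurrently, and keep reading past the cap while discarding the excess.

```typescript
// Hypothetical sketch; the real readStreamWithCap helper may differ.
const TRUNCATION_NOTICE = "\n[output truncated]"; // assumed wording

async function readStreamWithCap(
  stream: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<string> {
  const reader = stream.getReader();
  const kept: Uint8Array[] = [];
  let keptBytes = 0;
  let truncated = false;

  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    if (!value || value.byteLength === 0) continue;

    const room = maxBytes - keptBytes;
    if (room > 0) {
      // Keep only as much of this chunk as still fits under the cap.
      const slice = value.byteLength > room ? value.subarray(0, room) : value;
      kept.push(slice);
      keptBytes += slice.byteLength;
    }
    // Bytes past the cap are still read, then dropped, so the writer
    // never blocks on a full pipe buffer.
    if (value.byteLength > room) truncated = true;
  }

  // Join the kept chunks and decode once. A cut in the middle of a
  // multi-byte UTF-8 sequence decodes to a replacement character.
  const joined = new Uint8Array(keptBytes);
  let offset = 0;
  for (const chunk of kept) {
    joined.set(chunk, offset);
    offset += chunk.byteLength;
  }
  const text = new TextDecoder().decode(joined);
  return truncated ? text + TRUNCATION_NOTICE : text;
}

// Drain stdout and stderr at the same time instead of one after the other.
async function drainBoth(
  stdout: ReadableStream<Uint8Array>,
  stderr: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<{ stdout: string; stderr: string }> {
  const [out, err] = await Promise.all([
    readStreamWithCap(stdout, maxBytes),
    readStreamWithCap(stderr, maxBytes),
  ]);
  return { stdout: out, stderr: err };
}
```

With the sequential version, a child that fills the stderr pipe before closing stdout stalls both sides; awaiting both reads under Promise.all removes that ordering dependency.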

buildSafeEnv and the --env-file= pattern in executeScriptHandler are
unchanged, preserving the subprocess environment isolation boundary from
SECURITY.md. Tests spawn real subprocesses and include a 200KB-stderr
regression test that would hang under the old sequential-drain code.
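The per-tool guard from item 4 above might look something like the sketch below. The ToolDef and ToolServer interfaces and the warn callback are hypothetical stand-ins; the real DynamicToolRegistry and MCP server types are not shown in this thread.

```typescript
// Hypothetical shapes for illustration only.
interface ToolDef {
  name: string;
  inputSchema: unknown;
}

interface ToolServer {
  registerTool(tool: ToolDef): void;
}

// Register every tool, isolating failures so one bad tool cannot
// prevent the tools after it from being registered.
function registerAllOnServer(
  tools: ToolDef[],
  server: ToolServer,
  warn: (msg: string) => void = console.warn,
): string[] {
  const failed: string[] = [];
  for (const tool of tools) {
    try {
      server.registerTool(tool);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      warn(`failed to register dynamic tool "${tool.name}": ${reason}`);
      // The tool is only reported, not unregistered; the operator decides.
      failed.push(tool.name);
    }
  }
  return failed;
}

// Demo: a fake server that rejects non-object schemas.
const registered: string[] = [];
const fakeServer: ToolServer = {
  registerTool(tool) {
    if (typeof tool.inputSchema !== "object" || tool.inputSchema === null) {
      throw new Error("inputSchema must be an object");
    }
    registered.push(tool.name);
  },
};

const warnings: string[] = [];
const failedTools = registerAllOnServer(
  [
    { name: "alpha", inputSchema: {} },
    { name: "beta", inputSchema: null },
    { name: "gamma", inputSchema: {} },
  ],
  fakeServer,
  (msg) => warnings.push(msg),
);
```

Returning the list of failed names is a design choice of this sketch, not something the commit message describes.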

Env-var cleanup in the new tests uses Reflect.deleteProperty(process.env, ...)
rather than `delete` (Biome noDelete) or `= undefined` (coerces to the string
"undefined" on process.env and does not actually unset the key). This matches
the pattern acknowledged as correct by the maintainer in #5.

imonlinux added a commit to imonlinux/phantom that referenced this pull request Apr 24, 2026
Complete security review and implementation of fixes for Nextcloud Talk
integration based on comprehensive security audit findings.

HIGH PRIORITY fixes (security-critical):
- ghostwright#1: Implement replay attack protection with LRU cache (5-minute TTL)
- ghostwright#2: Add 64KB request size limit before body buffering
- ghostwright#4: Replace Date.now() with crypto.randomUUID() for unique IDs
- ghostwright#7: Fix JSON unwrap logic for ActivityStreams Note objects
- ghostwright#11: Replace 'Error:' text sniffing with runtime error events

Logic and security fixes:
- ghostwright#3: Fix msgId/msg name collision in error handling
- ghostwright#5: Improve parseConversationId to handle colons in tokens
- ghostwright#6: Reject webhooks without target.id instead of silent fallback
- ghostwright#8: Normalize emoji to avoid variation selector validation issues
- ghostwright#9: Handle 404/409 reaction responses as success conditions
- ghostwright#10: Make setReaction return boolean for proper error handling
- ghostwright#12: Improve bot loop guard with actorId checking

Best practices and polish:
- ghostwright#13: Make port configurable instead of hardcoded 3200
- ghostwright#14: Move webhookPath default normalization to constructor
- ghostwright#15: Fix health check path precedence (check webhook first)
- ghostwright#16: Add exponential backoff retry for 5xx/429 responses
- ghostwright#17: Add URL validation and encoding for talkServer config
- ghostwright#18: Document HMAC signing asymmetry (inbound vs outbound)
- ghostwright#20: Import randomUUID explicitly from node:crypto
- ghostwright#21: Add reactions: true to channel capabilities
- ghostwright#22: Namespace environment variables with NEXTCLOUD_ prefix
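As an illustration of the retry item in the list above (exponential backoff for 5xx/429 responses), here is a minimal sketch. The function names, the base and maximum delays, and the use of full jitter are all assumptions; the commit message does not state the actual constants or jitter strategy.

```typescript
// Hypothetical helpers; names and constants are not taken from the commit.

// 429 and any 5xx are treated as retryable; other 4xx are not.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff with "full jitter": a uniformly random delay in
// [0, min(maxMs, baseMs * 2^attempt)). `random` is injectable for tests.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  maxMs: number = 30_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

A caller would typically loop up to a fixed number of attempts, stop on the first non-retryable status, and sleep for backoffDelayMs(attempt) between tries.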

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

mcheemaa added a commit that referenced this pull request Apr 25, 2026
… tests)

Tighten secret_name schema regex to match the metadata fetcher's
defense-in-depth check so invalid names fail at parse time rather than
crashing at boot inside the fetcher (finding 1). Split loader.test.ts
into loader.test.ts and loader-metadata.test.ts so neither file blows
the 300-line cap by much, with duplicated writeYaml/cleanup helpers and
distinct TEST_DIR paths to avoid races (finding 2). Document the
single as-unknown-as cast at the loader.ts header so future maintainers
know it is the only typed/generic boundary in the file (finding 3).
Add two error-path regression tests to metadata-fetcher.test.ts for
the 304-without-cache server-bug branch and the wrapped network-error
branch (finding 5).

imonlinux added a commit to imonlinux/phantom that referenced this pull request Apr 26, 2026
…k adapter

Implements full test suite for nextcloud.ts addressing all critical areas
identified in the nextcloud-talk-review document. 943 lines of tests
covering security, functionality, and edge cases.

Test coverage by category:

1. Signature verification (Fix ghostwright#1, ghostwright#18) - Security Critical
   - Valid HMAC signature acceptance
   - Invalid HMAC signature rejection
   - Replay attack protection via nonce cache
   - Nonce cache size limits (1000 entries, FIFO eviction)
   - Nonce expiration and periodic pruning (5-minute TTL)
   - Asymmetric signing (inbound: random+body, outbound: random+content)

2. Request size limits (Fix ghostwright#2) - Security Critical
   - Content-Length validation before buffering
   - Double-check after reading (missing Content-Length)
   - 64 KB limit enforcement (Nextcloud caps at 32k chars)

3. JSON unwrapping (Fix ghostwright#7) - Functionality Critical
   - ActivityStreams Note objects unwrap correctly
   - Plain text passes through unchanged
   - Literal JSON-like text not corrupted (only Note type unwraps)
   - Invalid JSON fallback to plain text

4. parseConversationId (Fix ghostwright#5) - Correctness Critical
   - Valid conversationId format parsing
   - Missing prefix returns null
   - Tokens containing colons handled correctly (indexOf+slice)
   - Thread-scoped ID to room token extraction

5. Bot loop guard (Fix ghostwright#12) - Multi-Bot Safety
   - Application actor filtering (actorType === "Application")
   - Self-filtering (actorId === config.botId)
   - Person messages processed normally
   - Multi-bot room scenarios

6. Retry and backoff (Fix ghostwright#16) - Resilience
   - 429 rate limiting with Retry-After header
   - 5xx server errors with exponential backoff + jitter
   - Network error retry logic
   - Non-retryable 4xx handling

7. Reaction error handling (Fix ghostwright#9)
   - 404 on remove treated as success
   - 409 on add treated as success
   - 5xx retry for reaction operations

8. URL validation and encoding (Fix ghostwright#17)
   - talkServer scheme removal (http://, https://)
   - Trailing slash removal
   - URL-encoding of roomToken and messageId

9. Target validation (Fix ghostwright#6)
   - Missing target.id rejection (no silent fallback)

10. Emoji normalization (Fix ghostwright#8)
    - Variation selector removal (U+26A0 vs U+26A0 U+FE0F)

11. Unique message IDs (Fix ghostwright#4)
    - crypto.randomUUID() vs Date.now()
    - Uniqueness across concurrent calls

12. Config normalization (Fix ghostwright#13, ghostwright#14)
    - webhookPath default in constructor
    - Configurable port
    - Session window configuration

13. Health check (Fix ghostwright#15)
    - Path precedence (webhook before health)

14. Message ID extraction
    - Numeric and string ID handling
    - Missing ID handling

15. Time-window session coalescing
    - Recent session continuation
    - New session creation
    - Parent message ID handling

16. Capabilities declaration (Fix ghostwright#21)
    - reactions: true declared
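The nonce cache behaviour listed under category 1 (1000 entries, FIFO eviction, 5-minute TTL) could be sketched as below. The NonceCache class, its method names, and the injectable clock are hypothetical; the adapter's real implementation is not shown here.

```typescript
// Hypothetical sketch of a replay-protection cache; not the adapter's code.
class NonceCache {
  // Map preserves insertion order, which gives FIFO eviction and,
  // with a constant TTL, also orders entries by expiry time.
  private readonly seen = new Map<string, number>(); // nonce -> expiry (ms)
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(
    ttlMs: number = 5 * 60 * 1000,
    maxEntries: number = 1000,
    now: () => number = Date.now,
  ) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  // Returns true for a fresh nonce (and records it), false for a replay.
  checkAndRemember(nonce: string): boolean {
    this.prune();
    if (this.seen.has(nonce)) return false;
    if (this.seen.size >= this.maxEntries) {
      // Evict the oldest entry to keep memory bounded.
      const oldest = this.seen.keys().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    this.seen.set(nonce, this.now() + this.ttlMs);
    return true;
  }

  get size(): number {
    return this.seen.size;
  }

  // Drop expired entries from the front of the map.
  private prune(): void {
    const t = this.now();
    for (const [nonce, expiresAt] of this.seen) {
      if (expiresAt > t) break; // everything after this is newer
      this.seen.delete(nonce);
    }
  }
}
```

One trade-off of FIFO eviction is that a nonce pushed out by the size limit can be replayed before its TTL elapses, which is presumably why the size limit is tested separately from expiry.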

All tests use bun:test with mocked dependencies and follow existing
patterns from webhook.test.ts, slack.test.ts, and email.test.ts.

Related: nextcloud-talk-review.md Issue ghostwright#19