[codex] Fix dynamic handler test env cleanup by coe0718 · Pull Request #5 · ghostwright/phantom

coe0718 · 2026-03-31T01:12:44Z

Summary

Fixes the pre-existing Biome lint failure in dynamic-handlers.test.ts without changing the test semantics.

What Changed

replaces delete process.env.* cleanup with a small restoreEnvVar() helper
uses Reflect.deleteProperty() so env vars are actually removed instead of being set to the string "undefined"
keeps the original env value when one existed before the test
applies Biome import ordering in the file
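
The PR names a restoreEnvVar() helper but its body is not shown here. The following is a minimal sketch of what such a helper could look like, assuming it takes the variable name and the value captured before the test. The signature and the PHANTOM_EXAMPLE_VAR name are illustrative assumptions, not taken from the PR.

```typescript
// Hypothetical reconstruction: the PR mentions restoreEnvVar() but this
// signature and body are assumptions made for illustration.
function restoreEnvVar(key: string, original: string | undefined): void {
  if (original === undefined) {
    // Remove the key outright. Assigning undefined instead may store the
    // string "undefined", which is the behavior the PR is avoiding.
    Reflect.deleteProperty(process.env, key);
  } else {
    // The variable existed before the test, so put the old value back.
    process.env[key] = original;
  }
}

// Usage: snapshot before the test mutates the variable, restore afterwards.
const EXAMPLE_KEY = "PHANTOM_EXAMPLE_VAR"; // illustrative name only
const savedValue = process.env[EXAMPLE_KEY];
process.env[EXAMPLE_KEY] = "temporary";
restoreEnvVar(EXAMPLE_KEY, savedValue);
```

Taking the snapshot before the mutation is what lets the helper tell "was unset" apart from "was set to some value", which a bare delete cannot do.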

Why

Biome flags the delete operator in this test file, but its suggested replacement (process.env.KEY = undefined) is behaviorally unsafe for Node/Bun because env vars become the string "undefined" instead of being removed.

This keeps the lint fix small and behavior-preserving.

Validation

bash -lc 'export PATH="$HOME/.bun/bin:$PATH" && /home/coemedia/.bun/bin/bun test src/mcp/__tests__/dynamic-handlers.test.ts'
bun run lint
./node_modules/.bin/biome check src/mcp/__tests__/dynamic-handlers.test.ts

mcheemaa · 2026-03-31T04:39:52Z

Hey @coe0718 - thanks for catching this! Really appreciate you taking the time to put together a fix.

We ended up fixing the same lint issue in v0.18.0 (commit 2bca825, merged via #6), but looking at both approaches more carefully, your Reflect.deleteProperty() pattern is actually more correct than our process.env.X = undefined. Setting undefined on process.env can coerce to the string "undefined" instead of actually removing the key. Your approach properly removes it.

We're going to close this PR since the changes overlap and would conflict, but we'll adopt the Reflect.deleteProperty pattern in a follow-up. Good catch on the semantics.

Your memory PRs (#2, #3, #4) are excellent work - we're reviewing those now. Thanks again!

…y output

Four latent liveness/stability bugs on the MCP dynamic-tool execution path would silently hang agent turns or crash the container. None surfaced visible errors, which made them the worst kind of bug: the agent just stopped.

1. Pipe-buffer deadlock: executeShellHandler and executeScriptHandler drained stdout then stderr sequentially. Any handler writing >64KB to stderr before closing stdout (curl -v, git clone, npm install, verbose loggers) blocked on its next stderr write while phantom waited for stdout EOF forever. Fix: Promise.all over both streams via a new readStreamWithCap helper.

2. No subprocess timeout: Bun.spawn ran with no kill path. A hung handler froze the agent turn indefinitely with no recovery. Fix: drainProcessWithLimits schedules SIGTERM at HANDLER_TIMEOUT_MS (default 60s, env-overridable via PHANTOM_DYNAMIC_HANDLER_TIMEOUT_MS) and escalates to SIGKILL after a 2s grace. Timeouts report partial stderr so the agent has actionable signal.

3. No stdout/stderr size cap: new Response(stream).text() slurped unbounded output, risking OOM of the 2GB container. Fix: readStreamWithCap enforces a 1MB cap by default (PHANTOM_DYNAMIC_HANDLER_MAX_OUTPUT_BYTES), appends a clear truncation notice, and continues draining-to-void so the child never blocks on a full pipe buffer.

4. DynamicToolRegistry.registerAllOnServer had no per-tool guard. One tool with a bad inputSchema would throw during the loop and silently skip every subsequent tool on every agent query (MCP factory pattern recreates servers per query). Fix: per-tool try/catch, warn with tool name, continue. Broken tools are not auto-unregistered; the operator decides.

buildSafeEnv and the --env-file= pattern in executeScriptHandler are unchanged, preserving the subprocess environment isolation boundary from SECURITY.md.

Tests spawn real subprocesses and include a 200KB-stderr regression test that would hang under the old sequential-drain code.

Env-var cleanup in the new tests uses Reflect.deleteProperty(process.env, ...) rather than `delete` (Biome noDelete) or `= undefined` (coerces to the string "undefined" on process.env and does not actually unset the key). This matches the pattern acknowledged as correct by the maintainer in #5.
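
The concurrent, capped drain described in that commit message can be sketched roughly as follows. This is not the code from #36: the readStreamWithCap name is borrowed from the message, but its signature, its return shape, and the drainBoth and streamOf helpers are assumptions made for illustration. The truncation notice mentioned in the message is left out.

```typescript
// Hypothetical sketch; signatures and return shapes are assumptions.
async function readStreamWithCap(
  stream: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<{ text: string; truncated: boolean }> {
  const reader = stream.getReader();
  const chunks: Uint8Array[] = [];
  let kept = 0;
  let truncated = false;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value === undefined) continue;
    const room = maxBytes - kept;
    if (room > 0) {
      const slice = value.subarray(0, room);
      chunks.push(slice);
      kept += slice.byteLength;
    }
    if (value.byteLength > room) truncated = true;
    // Keep reading past the cap and discard the bytes, so the writer
    // never blocks on a full pipe buffer.
  }
  const merged = new Uint8Array(kept);
  let offset = 0;
  for (const chunk of chunks) {
    merged.set(chunk, offset);
    offset += chunk.byteLength;
  }
  // A cut in the middle of a multi-byte character decodes to U+FFFD.
  return { text: new TextDecoder().decode(merged), truncated };
}

// Drain both pipes at the same time rather than stdout first, then stderr.
async function drainBoth(
  stdout: ReadableStream<Uint8Array>,
  stderr: ReadableStream<Uint8Array>,
  maxBytes: number,
) {
  const [out, err] = await Promise.all([
    readStreamWithCap(stdout, maxBytes),
    readStreamWithCap(stderr, maxBytes),
  ]);
  return { out, err };
}

// Demo helper: wraps a string in a single-chunk in-memory stream.
function streamOf(text: string): ReadableStream<Uint8Array> {
  const bytes = new TextEncoder().encode(text);
  return new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(bytes);
      controller.close();
    },
  });
}
```

Because both reads are started before either is awaited, a child that fills its stderr pipe while stdout is still open keeps making progress, which is the deadlock the first item describes.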

Complete security review and implementation of fixes for Nextcloud Talk integration based on comprehensive security audit findings.

HIGH PRIORITY fixes (security-critical):
- ghostwright#1: Implement replay attack protection with LRU cache (5-minute TTL)
- ghostwright#2: Add 64KB request size limit before body buffering
- ghostwright#4: Replace Date.now() with crypto.randomUUID() for unique IDs
- ghostwright#7: Fix JSON unwrap logic for ActivityStreams Note objects
- ghostwright#11: Replace 'Error:' text sniffing with runtime error events

Logic and security fixes:
- ghostwright#3: Fix msgId/msg name collision in error handling
- ghostwright#5: Improve parseConversationId to handle colons in tokens
- ghostwright#6: Reject webhooks without target.id instead of silent fallback
- ghostwright#8: Normalize emoji to avoid variation selector validation issues
- ghostwright#9: Handle 404/409 reaction responses as success conditions
- ghostwright#10: Make setReaction return boolean for proper error handling
- ghostwright#12: Improve bot loop guard with actorId checking

Best practices and polish:
- ghostwright#13: Make port configurable instead of hardcoded 3200
- ghostwright#14: Move webhookPath default normalization to constructor
- ghostwright#15: Fix health check path precedence (check webhook first)
- ghostwright#16: Add exponential backoff retry for 5xx/429 responses
- ghostwright#17: Add URL validation and encoding for talkServer config
- ghostwright#18: Document HMAC signing asymmetry (inbound vs outbound)
- ghostwright#20: Import randomUUID explicitly from node:crypto
- ghostwright#21: Add reactions: true to channel capabilities
- ghostwright#22: Namespace environment variables with NEXTCLOUD_ prefix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
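
As an illustration of the retry item in that list (exponential backoff for 5xx/429 responses), here is a small sketch of how a retry delay could be computed. The function names, the 500 ms base, the 30 s ceiling, and the choice of full jitter are all assumptions; the commit message does not give the adapter's actual constants or strategy.

```typescript
// Hypothetical helpers; names and constants are illustrative assumptions.
function isRetryableStatus(status: number): boolean {
  // 429 (rate limited) and any 5xx are retried; other 4xx are not.
  return status === 429 || (status >= 500 && status <= 599);
}

function retryDelayMs(
  attempt: number, // 0-based retry attempt
  retryAfterHeader: string | null,
  random: () => number = Math.random,
  baseMs: number = 500,
  maxMs: number = 30_000,
): number {
  // Honor a numeric Retry-After (in seconds) when the server sends one.
  // An HTTP-date value is not parsed here and falls through to backoff.
  if (retryAfterHeader !== null && retryAfterHeader.trim() !== "") {
    const seconds = Number(retryAfterHeader);
    if (Number.isFinite(seconds) && seconds >= 0) {
      return Math.min(seconds * 1000, maxMs);
    }
  }
  // Exponential growth, capped at maxMs.
  const ceiling = Math.min(baseMs * 2 ** attempt, maxMs);
  // Full jitter: a uniformly random delay in [0, ceiling).
  return Math.floor(random() * ceiling);
}
```

Passing the random source as a parameter keeps the delay deterministic under test, for example retryDelayMs(2, null, () => 0.5) with the defaults above.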

… tests) Tighten secret_name schema regex to match the metadata fetcher's defense-in-depth check so invalid names fail at parse time rather than crashing at boot inside the fetcher (finding 1). Split loader.test.ts into loader.test.ts and loader-metadata.test.ts so neither file blows the 300-line cap by much, with duplicated writeYaml/cleanup helpers and distinct TEST_DIR paths to avoid races (finding 2). Document the single as-unknown-as cast at the loader.ts header so future maintainers know it is the only typed/generic boundary in the file (finding 3). Add two error-path regression tests to metadata-fetcher.test.ts for the 304-without-cache server-bug branch and the wrapped network-error branch (finding 5).

…k adapter

Implements full test suite for nextcloud.ts addressing all critical areas identified in the nextcloud-talk-review document. 943 lines of tests covering security, functionality, and edge cases.

Test coverage by category:

1. Signature verification (Fix ghostwright#1, ghostwright#18) - Security Critical
   - Valid HMAC signature acceptance
   - Invalid HMAC signature rejection
   - Replay attack protection via nonce cache
   - Nonce cache size limits (1000 entries, FIFO eviction)
   - Nonce expiration and periodic pruning (5-minute TTL)
   - Asymmetric signing (inbound: random+body, outbound: random+content)

2. Request size limits (Fix ghostwright#2) - Security Critical
   - Content-Length validation before buffering
   - Double-check after reading (missing Content-Length)
   - 64 KB limit enforcement (Nextcloud caps at 32k chars)

3. JSON unwrapping (Fix ghostwright#7) - Functionality Critical
   - ActivityStreams Note objects unwrap correctly
   - Plain text passes through unchanged
   - Literal JSON-like text not corrupted (only Note type unwraps)
   - Invalid JSON fallback to plain text

4. parseConversationId (Fix ghostwright#5) - Correctness Critical
   - Valid conversationId format parsing
   - Missing prefix returns null
   - Tokens containing colons handled correctly (indexOf+slice)
   - Thread-scoped ID to room token extraction

5. Bot loop guard (Fix ghostwright#12) - Multi-Bot Safety
   - Application actor filtering (actorType === "Application")
   - Self-filtering (actorId === config.botId)
   - Person messages processed normally
   - Multi-bot room scenarios

6. Retry and backoff (Fix ghostwright#16) - Resilience
   - 429 rate limiting with Retry-After header
   - 5xx server errors with exponential backoff + jitter
   - Network error retry logic
   - Non-retryable 4xx handling

7. Reaction error handling (Fix ghostwright#9)
   - 404 on remove treated as success
   - 409 on add treated as success
   - 5xx retry for reaction operations

8. URL validation and encoding (Fix ghostwright#17)
   - talkServer scheme removal (http://, https://)
   - Trailing slash removal
   - URL-encoding of roomToken and messageId

9. Target validation (Fix ghostwright#6)
   - Missing target.id rejection (no silent fallback)

10. Emoji normalization (Fix ghostwright#8)
   - Variation selector removal (U+26A0 vs U+26A0 U+FE0F)

11. Unique message IDs (Fix ghostwright#4)
   - crypto.randomUUID() vs Date.now()
   - Uniqueness across concurrent calls

12. Config normalization (Fix ghostwright#13, ghostwright#14)
   - webhookPath default in constructor
   - Configurable port
   - Session window configuration

13. Health check (Fix ghostwright#15)
   - Path precedence (webhook before health)

14. Message ID extraction
   - Numeric and string ID handling
   - Missing ID handling

15. Time-window session coalescing
   - Recent session continuation
   - New session creation
   - Parent message ID handling

16. Capabilities declaration (Fix ghostwright#21)
   - reactions: true declared

All tests use bun:test with mocked dependencies and follow existing patterns from webhook.test.ts, slack.test.ts, and email.test.ts.

Related: nextcloud-talk-review.md
Issue ghostwright#19
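
The nonce cache behavior those tests describe (1000 entries, FIFO eviction, 5-minute TTL with pruning) could look roughly like the sketch below. The NonceCache class and its method names are hypothetical and not taken from nextcloud.ts. Only the limits come from the commit message.

```typescript
// Hypothetical sketch; class and method names are assumptions.
class NonceCache {
  // nonce -> expiry timestamp in milliseconds
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;

  constructor(ttlMs: number = 5 * 60 * 1000, maxEntries: number = 1000) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
  }

  /** Returns true for a fresh nonce (and records it), false for a replay. */
  checkAndStore(nonce: string, now: number = Date.now()): boolean {
    this.prune(now);
    if (this.seen.has(nonce)) return false;
    if (this.seen.size >= this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.seen.keys().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    this.seen.set(nonce, now + this.ttlMs);
    return true;
  }

  /** Drops every entry whose TTL has elapsed. */
  prune(now: number = Date.now()): void {
    for (const [nonce, expiresAt] of this.seen) {
      if (expiresAt <= now) this.seen.delete(nonce);
    }
  }

  get size(): number {
    return this.seen.size;
  }
}

// Usage: reject a webhook whose random value was already seen.
const nonces = new NonceCache();
const firstSeen = nonces.checkAndStore("example-random-value"); // true
const replayed = nonces.checkAndStore("example-random-value"); // false
```

One trade-off of FIFO eviction is that a nonce pushed out by the size limit would be accepted again within its TTL, so the entry limit needs to be comfortably above the expected request volume per TTL window.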

coe0718 added 2 commits March 30, 2026 21:11

Fix dynamic handler test env cleanup

1ceabb9

Stabilize crypto tampering tests

dc005a9

mcheemaa closed this Mar 31, 2026

electronicBlacksmith mentioned this pull request Apr 5, 2026

fix: harden dynamic tool handlers against deadlock, hangs, and runaway output #36

Closed

5 tasks

mcheemaa mentioned this pull request Apr 25, 2026

channels: add Slack HTTP receiver for distributed-app mode #93

Merged

9 tasks

coe0718 wants to merge 2 commits into ghostwright:main from coe0718:codex/fix-dynamic-handlers-lint


2 participants