Fix 500 error on operation logs download due to JSON serialization #3366

Open
Bhanunamikaze wants to merge 2 commits into apache:master from Bhanunamikaze:fix-special-chars-500
Bhanunamikaze commented Apr 22, 2026

Description of Issue

When operations execute agent abilities that output untrusted or malformed data, such as random binary dumps, unescaped null characters (\x00), or unpaired UTF-16 surrogate code points (U+D800 to U+DFFF), the Caldera server can silently store these outputs in the raw result logs.

When a user attempts to download the operation's Event Logs or Report via the web UI or API (/api/v2/operations/<id>/event-logs), these stored characters cause serialization to fail inside web.json_response(), which relies on Python's json.dumps. The request fails with a 500 Internal Server Error (UnicodeEncodeError), so the logs cannot be downloaded at all for any operation that hit this case.
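
The failure mode can be reproduced with the standard library alone. The sketch below is an assumption about the mechanism, since the exact call path inside web.json_response() is not shown in this PR: an unpaired surrogate passes through json.dumps when ensure_ascii=False, but the resulting text cannot be encoded to UTF-8 for the response body. The variable names are illustrative only.

```python
import json

# A string holding an unpaired UTF-16 surrogate, as malformed agent output might.
bad_output = "result: \ud800"

# With ensure_ascii=False the surrogate is passed through unescaped...
raw = json.dumps({"output": bad_output}, ensure_ascii=False)

# ...and encoding that text to UTF-8 for the wire fails.
try:
    raw.encode("utf-8")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# With ensure_ascii=True (the json.dumps default) the surrogate is written
# as the escape sequence \ud800, so the payload is plain ASCII and encodes cleanly.
safe = json.dumps({"output": bad_output}, ensure_ascii=True)
safe_bytes = safe.encode("utf-8")
```

If this is indeed the path that fails, it would explain why forcing ensure_ascii=True at the API layer is enough to avoid the UnicodeEncodeError on its own.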

Proposed Fix

This PR implements a defense-in-depth sanitization approach to harden the serialization pipeline, prioritizing availability and readability of logs over verbatim rendering of corrupted agent output:

  1. Base-level decoders: Updates BaseWorld.decode_bytes to strip surrogate characters, so malformed output is cleaned at the first layer.
  2. JSON sanitization: Introduces _sanitize_for_json in c_operation.py. If a Link or ability output string is not serializable, it falls back to encoding the string as pure ASCII, escaping non-printable bytes.
  3. Per-link isolation: Wraps the report-parsing iterators and the _convert_link_to_event_log generators in try/except blocks. If a single agent link returns a malformed payload that breaks json.loads or the custom dict mapping, that link is isolated and falls back to raw data, and the rest of the report is still generated instead of the whole request failing.
  4. API safety net: Updates the final response in operation_api.py to enforce ensure_ascii=True, so any payload missed by the earlier layers is escaped before the response is sent.
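
Layers 2 to 4 can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: sanitize_for_json is a hypothetical stand-in for _sanitize_for_json (whose body is not shown here), and build_event_log, convert, and the sample links are invented for the example.

```python
import json


def sanitize_for_json(value):
    """Return a copy of `value` whose strings can be encoded as UTF-8.

    Hypothetical stand-in for the PR's _sanitize_for_json.
    """
    if isinstance(value, str):
        try:
            value.encode("utf-8")  # raises on unpaired surrogates
            return value
        except UnicodeEncodeError:
            # Fall back to pure ASCII; unencodable characters become escapes.
            return value.encode("ascii", errors="backslashreplace").decode("ascii")
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="backslashreplace")
    if isinstance(value, dict):
        return {sanitize_for_json(k): sanitize_for_json(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize_for_json(v) for v in value]
    return value


def build_event_log(links, convert):
    """Convert each link on its own so one bad link cannot abort the report."""
    events = []
    for link in links:
        try:
            events.append(sanitize_for_json(convert(link)))
        except Exception as exc:  # broad on purpose: isolate the failing link
            events.append({
                "error": f"link could not be converted: {type(exc).__name__}",
                "raw": sanitize_for_json(repr(link)),
            })
    return events


def convert(link):
    # Hypothetical converter: treats each link as a JSON document.
    return {"output": json.loads(link)}


links = ['"ok"', "not json", '"bad \\ud800"']
events = build_event_log(links, convert)

# Final safety net: ensure_ascii=True keeps the response body plain ASCII.
payload = json.dumps(events, ensure_ascii=True).encode("utf-8")
```

In this sketch the second link fails to parse and is recorded as an error entry, while the third parses into a string containing an unpaired surrogate and is escaped by the sanitizer. Note that the ASCII fallback also escapes any legitimate non-ASCII characters in the same string, which trades fidelity for availability.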

@Bhanunamikaze Bhanunamikaze force-pushed the fix-special-chars-500 branch from a967a4e to 7466e68 Compare April 22, 2026 14:30
@sonarqubecloud

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud
