
cc/google/fhir: limit JSON parser recursion depth to prevent stack overflow#522

Open
dapickle wants to merge 1 commit into google:master from dapickle:fix/json-parser-recursion-depth-limit

Conversation

@dapickle

Problem

The mutual recursion in json_parser.cc has no depth guard:

MergeValue → MergeMessage → MergeField → ParseFieldValue → MergeValue

A FHIR JSON payload with deeply nested extension objects (depth ≥ ~500) exhausts the default 8 MiB Linux thread stack, crashing the process with SIGSEGV. The crash is process-terminating and not catchable via absl::Status error handling. A single ~170 KB HTTP request containing a syntactically valid FHIR R4 Patient resource is sufficient to kill any C++ server process that calls MergeJsonFhirStringIntoProto() on user-supplied data.

The Parser class private section contains only primitive_handler_ and default_timezone_ — no depth counter, no stack guard.

The nlohmann SAX parser (Phase 1, json_sax_handler.cc:321) does not protect against this. nlohmann v3.7.3 uses an iterative sax_parse_internal with a heap-allocated std::vector<bool> states, so the FhirJson tree is built safely at any depth. Phase 2 (MergeValue) then recurses on it without limit.

Fix

Add static constexpr int kMaxNestingDepth = 500 to the Parser class and thread a depth counter through all four mutually-recursive functions (MergeValue, MergeMessage, MergeField, ParseFieldValue). The counter increments only at ParseFieldValue → MergeValue (once per FHIR object nesting level). When the limit is exceeded, MergeValue returns an absl::InvalidArgumentError via the ScopedErrorReporter — same error path used throughout the parser.

The Go implementation (go/jsonformat/unmarshaller.go) has an equivalent MaxNestingDepth / checkCurrentDepth mechanism. This brings the C++ library to parity.

kMaxNestingDepth = 500 is orders of magnitude above any real-world FHIR nesting depth, while keeping worst-case stack usage under 1 MiB (500 levels × 4 frames × ~400 bytes ≈ 800 KB).
