cc/google/fhir: limit JSON parser recursion depth to prevent stack overflow#522
Open
dapickle wants to merge 1 commit intogoogle:masterfrom
Open
cc/google/fhir: limit JSON parser recursion depth to prevent stack overflow#522dapickle wants to merge 1 commit intogoogle:masterfrom
dapickle wants to merge 1 commit intogoogle:masterfrom
Conversation
…erflow The mutual recursion in json_parser.cc has no depth guard: MergeValue → MergeMessage → MergeField → ParseFieldValue → MergeValue A FHIR JSON payload with deeply nested extension objects (e.g. depth=500+) exhausts the default 8 MiB Linux thread stack, crashing the process with SIGSEGV. The crash is process-terminating and not catchable via absl::Status. The nlohmann SAX parser (Phase 1) does not protect against this: v3.7.3 uses an iterative sax_parse_internal with a heap-allocated std::vector<bool> states, so arbitrarily deep JSON is parsed safely into a FhirJson tree. Phase 2 (MergeValue) then recurses on that tree with no guard. Fix: add kMaxNestingDepth = 500 and thread a depth counter through all four mutually-recursive functions. The counter increments only at ParseFieldValue → MergeValue (once per FHIR object nesting level). MergeValue returns an InvalidArgumentError via the ScopedErrorReporter when the limit is exceeded. The Go implementation (go/jsonformat/unmarshaller.go) has an equivalent MaxNestingDepth / checkCurrentDepth mechanism; this brings the C++ library to parity.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The mutual recursion in
json_parser.cchas no depth guard:A FHIR JSON payload with deeply nested
extensionobjects (depth ≥ ~500) exhausts the default 8 MiB Linux thread stack, crashing the process with SIGSEGV. The crash is process-terminating and not catchable viaabsl::Statuserror handling. A single ~170 KB HTTP request containing a syntactically valid FHIR R4 Patient resource is sufficient to kill any C++ server process that callsMergeJsonFhirStringIntoProto()on user-supplied data.The
Parserclass private section contains onlyprimitive_handler_anddefault_timezone_— no depth counter, no stack guard.The nlohmann SAX parser (Phase 1,
json_sax_handler.cc:321) does not protect against this. nlohmann v3.7.3 uses an iterativesax_parse_internalwith a heap-allocatedstd::vector<bool> states, so theFhirJsontree is built safely at any depth. Phase 2 (MergeValue) then recurses on it without limit.Fix
Add
static constexpr int kMaxNestingDepth = 500to theParserclass and thread adepthcounter through all four mutually-recursive functions (MergeValue,MergeMessage,MergeField,ParseFieldValue). The counter increments only atParseFieldValue → MergeValue(once per FHIR object nesting level). When the limit is exceeded,MergeValuereturns anabsl::InvalidArgumentErrorvia theScopedErrorReporter— same error path used throughout the parser.The Go implementation (
go/jsonformat/unmarshaller.go) has an equivalentMaxNestingDepth/checkCurrentDepthmechanism. This brings the C++ library to parity.kMaxNestingDepth = 500is orders of magnitude above any real-world FHIR nesting depth while keeping worst-case stack usage well under 1 MB (500 levels × 4 frames × ~400 bytes).