Add context_lengths option to QairtGenAIBuilder #2505
Draft
qti-kromero wants to merge 21 commits into
Introduce a genie_overrides PassConfigParam that deep-merges user-supplied fields into the GenAIConfig before LLMContainer.export() bakes them into the Genie DLC. This allows callers to override any GenAIConfig field (engine config, positional encoding, etc.) without modifying QairtGenAIBuilder or QairtPipelinePass. Nested dicts are merged recursively so only the specified keys are changed; all other values set by the upstream builder pass are preserved.
…ensions customization
Adds a backend_extensions_override PassConfigParam that is deep-merged into the LLMContainer's existing _backend_extensions_config before the Genie DLC is produced. Uses the raw JSON key names (hyphens) as they appear in backend_extensions.json. Nested dicts are merged recursively so only specified keys are overridden; all other defaults set by the builder are preserved. If the container has no existing backend extensions config, the override becomes the entire config. Includes 4 tests covering default config presence, merge into existing config, merge from empty, and no-op when override is None. Also drops one fragile identity assertion from the existing genie_overrides test.
…ion compatibility
…apsulation
Aligns with qairt-dev's rename of HTPExecutionConfig → HTPEngineConfig wrapped
in EngineConfig (AISW-184594). The param now accepts a nested dict:
{"n_threads": 0, "htp": {"cpu_mask": "0xe0", "poll": false, ...}}
Top-level keys map to EngineConfig, "htp" sub-dict maps to HTPEngineConfig.
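The nested shape described above can be sketched as follows. This is a minimal illustration only: `EngineConfigSketch`, `HTPEngineConfigSketch`, and `build_engine_config` are hypothetical stand-ins, not the qairt-dev classes, and the only fields shown are the ones named in the example dict.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class HTPEngineConfigSketch:
    """Hypothetical stand-in for HTPEngineConfig (fields taken from the example dict)."""
    cpu_mask: Optional[str] = None
    poll: Optional[bool] = None


@dataclass
class EngineConfigSketch:
    """Hypothetical stand-in for EngineConfig, which wraps the HTP-specific config."""
    n_threads: Optional[int] = None
    htp: HTPEngineConfigSketch = field(default_factory=HTPEngineConfigSketch)


def build_engine_config(overrides: Dict[str, Any]) -> EngineConfigSketch:
    """Route top-level keys to the outer config and the "htp" sub-dict to the inner one."""
    top_level = {k: v for k, v in overrides.items() if k != "htp"}
    htp_fields = overrides.get("htp") or {}
    return EngineConfigSketch(htp=HTPEngineConfigSketch(**htp_fields), **top_level)


cfg = build_engine_config({"n_threads": 0, "htp": {"cpu_mask": "0xe0", "poll": False}})
```

Because the stand-ins are plain dataclasses, an unknown key raises a `TypeError` at construction time; whether the real classes reject or ignore unknown keys is not stated in this PR.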
…key deletion
- Lists of dicts are merged element-wise by index, allowing recipe overrides to target nested list entries (e.g. context[0], devices[0]) without replacing the entire list
- A None override value deletes the corresponding key from the result, enabling removal of builder-generated keys (e.g. "graphs": null)
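The merge rules described in these commits (recursive dict merge, index-wise merge for lists of dicts, None as a delete marker) could look roughly like the sketch below. It is not the PR's `_deep_merge`; the function name `deep_merge_sketch` is hypothetical, and the handling of lists of unequal length (extra entries from either side are kept) and of empty override lists (they replace the base list) are assumptions.

```python
from copy import deepcopy
from typing import Any


def _is_list_of_dicts(value: Any) -> bool:
    # Only non-empty lists made up entirely of dicts are merged by index.
    return isinstance(value, list) and bool(value) and all(isinstance(v, dict) for v in value)


def deep_merge_sketch(base: Any, override: Any) -> Any:
    """Return a new value with `override` merged into `base`; inputs are not mutated."""
    if isinstance(base, dict) and isinstance(override, dict):
        merged = deepcopy(base)
        for key, value in override.items():
            if value is None:
                merged.pop(key, None)  # None deletes the key from the result
            elif key in merged:
                merged[key] = deep_merge_sketch(merged[key], value)
            else:
                merged[key] = deepcopy(value)
        return merged
    if _is_list_of_dicts(base) and _is_list_of_dicts(override):
        merged_list = []
        for i in range(max(len(base), len(override))):
            if i >= len(override):
                merged_list.append(deepcopy(base[i]))      # assumption: keep the base tail
            elif i >= len(base):
                merged_list.append(deepcopy(override[i]))  # assumption: append extras
            else:
                merged_list.append(deep_merge_sketch(base[i], override[i]))
        return merged_list
    # Scalars, mismatched types, and other lists: the override replaces the base.
    return deepcopy(override)


base = {
    "engine": {"n_threads": 4, "backend": "htp"},
    "context": [{"size": 4096, "graphs": ["a"]}],
    "keep": 1,
}
override = {"engine": {"n_threads": 0}, "context": [{"graphs": None}]}
result = deep_merge_sketch(base, override)
```

Here `result` keeps `engine.backend` and `keep`, changes `engine.n_threads` to 0, and drops `graphs` from `context[0]` without replacing the rest of that list entry.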
…ain bug
- encapsulation.py: add # pylint: disable=protected-access on _gen_ai_config and _backend_extensions_config accesses
- test_encapsulation.py: rename htp_execution_overrides → engine_config_overrides throughout; fix docstring capitalization; fix pre-existing test_encapsulation_genie_overrides_applied failure caused by mock._gen_ai_config being replaced after model_validate(); capture the original mock before running the pass
Introduces a context_lengths parameter that allows users to specify an explicit list of context lengths (CLs) to compile, bypassing the fixed CL set produced by multi_graph. The two options are mutually exclusive. Like multi_graph, context_lengths is HTP-only. Sets arn_cl_options.context_length directly on the builder when provided, falling through to the existing multi_graph path otherwise.
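The selection logic in this commit message can be sketched as a small validation helper. This is an illustration under stated assumptions, not the pass implementation: `resolve_context_lengths` is a hypothetical name, `multi_graph` is assumed to be a boolean flag, the backend is assumed to be passed as a string, and the non-empty/positive check is an extra guard not mentioned in the PR.

```python
from typing import List, Optional


def resolve_context_lengths(
    backend: str,
    multi_graph: bool = False,
    context_lengths: Optional[List[int]] = None,
) -> Optional[List[int]]:
    """Return the explicit CL list to compile, or None to fall through to the multi_graph path."""
    if context_lengths is None:
        return None  # existing multi_graph / default behaviour applies
    if multi_graph:
        raise ValueError("context_lengths and multi_graph are mutually exclusive")
    if backend.upper() != "HTP":
        raise ValueError("context_lengths is only supported on the HTP backend")
    # Assumption: an explicit list must be non-empty and contain positive lengths.
    if not context_lengths or any(cl <= 0 for cl in context_lengths):
        raise ValueError("context_lengths must be a non-empty list of positive integers")
    return list(context_lengths)


explicit = resolve_context_lengths("HTP", context_lengths=[1024, 4096])
fallback = resolve_context_lengths("HTP", multi_graph=True)
```

In the pass itself, a non-None result would be the value assigned to `arn_cl_options.context_length` on the builder, as the commit message describes.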
Move _make_minimal_onnx and a mock_container fixture into conftest.py, replace inline mock_save_func/save patches and dead mock_helper.make_* stubs across all tests, parametrize the _default_config and missing-required-file assertions, drop the redundant valid-version test, and add coverage for EPContext node output, QAIRT_LOG_LEVEL env wiring, and sequence_lengths plumbing into create_genai_config.
…genai-custom-cl-list
Brings in QairtEncapsulation genie_overrides, backend_extensions_overrides, and engine_config_overrides with _deep_merge list/None support.
Describe your changes
Introduces a `context_lengths` parameter to `QairtGenAIBuilder` that allows users to specify an explicit list of context lengths (CLs) to compile for HTP backends.

This provides fine-grained control over which context-length binaries are produced, as an alternative to the fixed CL set generated by `multi_graph`. The two options are mutually exclusive.

Usage:

Sets `arn_cl_options.context_length` directly on the underlying GenAI builder when provided, bypassing the `multi_graph` path.

Checklist before requesting a review