Add context_lengths option to QairtGenAIBuilder #2505

Draft
qti-kromero wants to merge 21 commits into microsoft:main from CodeLinaro:dev/qti-kromero/qairt-genai-custom-cl-list

qti-kromero commented Jun 8, 2026

Contributor

Describe your changes

Introduces a context_lengths parameter to QairtGenAIBuilder that allows users to specify an explicit list of context lengths (CLs) to compile for HTP backends.

This provides fine-grained control over which context-length binaries are produced, as an alternative to the fixed CL set generated by multi_graph. The two options are mutually exclusive.

Usage:

"qgab": {
    "type": "QairtGenAIBuilder",
    "backend": "HTP",
    "context_lengths": [512, 1024, 2048, 3072, 4096, 6144, 8192, 10240, 13312, 16384]
}

When context_lengths is provided, the pass sets arn_cl_options.context_length directly on the underlying GenAI builder, bypassing the multi_graph path.
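
A minimal sketch of the option handling described above, not the code in this PR. The helper name `resolve_context_lengths` and the non-empty/positive-integer check are assumptions made for illustration; only the mutual exclusion with multi_graph and the HTP-only restriction come from the description.

```python
from typing import List, Optional, Sequence


def resolve_context_lengths(
    backend: str,
    multi_graph: bool = False,
    context_lengths: Optional[Sequence[int]] = None,
) -> Optional[List[int]]:
    """Hypothetical helper: return the explicit CL list, or None to keep the existing path."""
    if context_lengths is None:
        # Nothing requested: fall through to the multi_graph / default behaviour.
        return None
    if multi_graph:
        raise ValueError("context_lengths and multi_graph are mutually exclusive")
    if backend.upper() != "HTP":
        raise ValueError("context_lengths is only supported for the HTP backend")
    # Assumption: an empty list or non-positive entries are treated as user errors.
    if not context_lengths or any(not isinstance(cl, int) or cl <= 0 for cl in context_lengths):
        raise ValueError("context_lengths must be a non-empty list of positive integers")
    return list(context_lengths)
```

The returned list is what would then be assigned to arn_cl_options.context_length; a None result leaves the builder on its existing path.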

Checklist before requesting a review

  • I have added unit tests for the new parameter
  • All tests pass locally
  • I have updated relevant documentation/descriptions
  • I have run linting

Introduce a genie_overrides PassConfigParam that deep-merges user-supplied
fields into the GenAIConfig before LLMContainer.export() bakes them into the
Genie DLC. This allows callers to override any GenAIConfig field (engine config,
positional encoding, etc.) without modifying QairtGenAIBuilder or QairtPipelinePass.

Nested dicts are merged recursively so only the specified keys are changed;
all other values set by the upstream builder pass are preserved.
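
A minimal sketch of the recursive merge behaviour described in this commit, assuming plain nested dicts. The function name `deep_merge` and the field names in the usage note are illustrative and are not taken from GenAIConfig.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where override keys replace or extend base, recursing into nested dicts."""
    merged = copy.deepcopy(base)  # never mutate the builder-produced config
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key so untouched siblings survive.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys simply take the override value.
            merged[key] = copy.deepcopy(value)
    return merged
```

For example, merging `{"engine": {"n_threads": 0}}` into `{"engine": {"n_threads": 4, "mode": "fast"}}` changes only `n_threads` and keeps `mode`.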
…ensions customization

Adds a backend_extensions_override PassConfigParam that is deep-merged
into the LLMContainer's existing _backend_extensions_config before the
Genie DLC is produced. Uses the raw JSON key names (hyphens) as they
appear in backend_extensions.json. Nested dicts are merged recursively
so only specified keys are overridden; all other defaults set by the
builder are preserved. If the container has no existing backend
extensions config, the override becomes the entire config.

Includes 4 tests covering default config presence, merge into existing
config, merge from empty, and no-op when override is None. Also drops
one fragile identity assertion from the existing genie_overrides test.
…apsulation

Aligns with qairt-dev's rename of HTPExecutionConfig → HTPEngineConfig wrapped
in EngineConfig (AISW-184594). The param now accepts a nested dict:
  {"n_threads": 0, "htp": {"cpu_mask": "0xe0", "poll": false, ...}}
Top-level keys map to EngineConfig, "htp" sub-dict maps to HTPEngineConfig.
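
A minimal sketch of how the nested dict could be split into the two levels, assuming a hypothetical helper `split_engine_overrides`; how the real pass constructs EngineConfig and HTPEngineConfig is not shown in this PR text.

```python
from typing import Any, Dict, Tuple


def split_engine_overrides(overrides: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Hypothetical helper: separate EngineConfig-level keys from the nested "htp" sub-dict."""
    # Keys under "htp" are intended for HTPEngineConfig; a missing or null "htp" means no HTP overrides.
    htp_overrides = dict(overrides.get("htp") or {})
    # Every other top-level key is intended for EngineConfig.
    engine_overrides = {key: value for key, value in overrides.items() if key != "htp"}
    return engine_overrides, htp_overrides
```

With `{"n_threads": 0, "htp": {"cpu_mask": "0xe0", "poll": False}}` as input, the first result is `{"n_threads": 0}` and the second is `{"cpu_mask": "0xe0", "poll": False}`.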
…key deletion

- Lists of dicts are merged element-wise by index, allowing recipe overrides
  to target nested list entries (e.g. context[0], devices[0]) without
  replacing the entire list
- A None override value deletes the corresponding key from the result,
  enabling removal of builder-generated keys (e.g. "graphs": null)
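
The two rules above can be sketched as follows. This is an illustration rather than the PR's _deep_merge: the name `deep_merge_with_lists` is hypothetical, and the handling of lists of unequal length (extra entries from either side are kept) is an assumption.

```python
import copy
from typing import Any


def deep_merge_with_lists(base: Any, override: Any) -> Any:
    """Merge override into base; lists of dicts merge by index, None deletes a key."""
    if isinstance(base, dict) and isinstance(override, dict):
        # Start from the keys the override does not mention.
        merged = {k: copy.deepcopy(v) for k, v in base.items() if k not in override}
        for key, value in override.items():
            if value is None:
                continue  # a None override drops the key from the result
            if key in base:
                merged[key] = deep_merge_with_lists(base[key], value)
            else:
                merged[key] = copy.deepcopy(value)
        return merged
    if (
        isinstance(base, list)
        and isinstance(override, list)
        and override
        and all(isinstance(item, dict) for item in override)
    ):
        merged_list = []
        for index in range(max(len(base), len(override))):
            if index >= len(override):
                merged_list.append(copy.deepcopy(base[index]))  # untouched tail of base
            elif index >= len(base):
                merged_list.append(copy.deepcopy(override[index]))  # extra override entries
            else:
                merged_list.append(deep_merge_with_lists(base[index], override[index]))
        return merged_list
    # Scalars, mismatched types, and lists of non-dicts: the override wins outright.
    return copy.deepcopy(override)
```

With a base of `{"graphs": ["g1"], "context": [{"size": 4096, "n-vocab": 32000}]}` (field names illustrative), the override `{"graphs": None, "context": [{"size": 8192}]}` removes `graphs` and changes only `context[0]["size"]`.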
…ain bug

- encapsulation.py: add # pylint: disable=protected-access on _gen_ai_config
  and _backend_extensions_config accesses
- test_encapsulation.py: rename htp_execution_overrides → engine_config_overrides
  throughout; fix docstring capitalization; fix pre-existing
  test_encapsulation_genie_overrides_applied failure caused by
  mock._gen_ai_config being replaced after model_validate(); capture the
  original mock before running the pass

Introduces a context_lengths parameter that allows users to specify an
explicit list of context lengths (CLs) to compile, bypassing the fixed
CL set produced by multi_graph. The two options are mutually exclusive.
Like multi_graph, context_lengths is HTP-only.

Sets arn_cl_options.context_length directly on the builder when provided,
falling through to the existing multi_graph path otherwise.

Move _make_minimal_onnx and a mock_container fixture into conftest.py,
replace inline mock_save_func/save patches and dead mock_helper.make_*
stubs across all tests, parametrize the _default_config and
missing-required-file assertions, drop the redundant valid-version test,
and add coverage for EPContext node output, QAIRT_LOG_LEVEL env wiring,
and sequence_lengths plumbing into create_genai_config.
…genai-custom-cl-list

Brings in QairtEncapsulation genie_overrides, backend_extensions_overrides,
and engine_config_overrides with _deep_merge list/None support.

qti-kromero changed the title from "Dev/qti kromero/qairt genai custom cl list" to "Add context_lengths option to QairtGenAIBuilder" on Jun 8, 2026