Skip to content

Enable live catalog with region-aware fallback engine#789

Open
baijumeswani wants to merge 1 commit into
mainfrom
baijumeswani/catalog
Open

Enable live catalog with region-aware fallback engine#789
baijumeswani wants to merge 1 commit into
mainfrom
baijumeswani/catalog

Conversation

@baijumeswani

@baijumeswani baijumeswani commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Replaces the static, file-based model catalog with a live Azure Foundry catalog client, and makes both catalog queries and model downloads region-aware. The client now detects the caller's Azure region from cluster headers, queries that region's catalog endpoint, and transparently retries against nearby regions when an endpoint is unhealthy. Downloads resolve their SAS URI from the same region that served the model's catalog entry.

This ports three behaviors from the C# core into the C++ SDK:

  1. Fetch the live catalog instead of reading a baked-in static snapshot.
  2. Detect the region from azureml-served-by-cluster headers (instead of always using eastus).
  3. Region-aware catalog query and model download.

How it works

                        ┌──────────────────────────────────────────┐
                        │  AzureModelCatalog.FetchModels()         │
                        └───────────────────┬──────────────────────┘
                                            │
                                            ▼
                        ┌──────────────────────────────────────────┐
                        │  AzureCatalogClient                      │
                        │                                          │
                        │  1. Detect region                        │
                        │     POST probe ─► read                   │
                        │     "azureml-served-by-cluster" header   │
                        │     vienna-<region>-NN ─► region         │
                        │     (explicit config wins; else eastus)  │
                        └───────────────────┬──────────────────────┘
                                            │ for each device filter set
                                            ▼
                        ┌──────────────────────────────────────────┐
                        │  RegionFallback.Execute(startRegion)     │
                        │                                          │
                        │   chain = [start]                        │
                        │         + proximal[start]                │
                        │         + one random public region       │
                        └───────────────────┬──────────────────────┘
                                            │
                       ┌────────────────────┼─────────────────────┐
                       ▼                     ▼                     ▼
                  try region A ─►       try region B ─►       try region C
                   2xx? ✓ done          retryable(0/408/        ...exhausted
                   permanent(404)? ✓     429/5xx)? ─► next       ─► fail set
                   stop                                          (others continue)
                       │
                       │ page 1 pins the serving region;
                       │ later pages stay on it (tokens are region-specific)
                       ▼
            ┌──────────────────────────────────────────┐
            │  Models stamped with detected_region,    │
            │  committed atomically per filter set     │
            └───────────────────┬──────────────────────┘
                                │  (download)
                                ▼
            ┌──────────────────────────────────────────┐
            │  DownloadManager.DownloadModel(info)     │
            │                                          │
            │  region = explicit config                │
            │         → info.detected_region           │
            │         → eastus                         │
            └───────────────────┬──────────────────────┘
                                ▼
            ┌───────────────────────────────────────────┐
            │  ModelRegistryClient.ResolveModelContainer│
            │  (same RegionFallback chain) ─► SAS URI   │
            └───────────────────┬───────────────────────┘
                                ▼
                          Blob download
  • catalog_region: ""/"auto" → detect from headers; any other value → explicit override.
  • DisableRegionFallback: disables cross-region retries (single-region behavior).

Copilot AI review requested due to automatic review settings June 9, 2026 07:06
@vercel

vercel Bot commented Jun 9, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
foundry-local Ready Ready Preview, Comment Jun 9, 2026 7:06am

Request Review

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces the C++ SDK’s embedded/static model catalog with a live Azure Foundry catalog client, adds automatic region detection from azureml-served-by-cluster, and introduces a region-fallback engine used by both catalog queries and model-registry SAS resolution so downloads target the region that served the catalog entry.

Changes:

  • Removed the static snapshot catalog + snapshot generation tool, and always build/use the live Azure catalog client.
  • Added RegionFallback + status-aware HTTP response plumbing to support regional retries and sticky-region behavior.
  • Stamped models with detected_region and used it to choose the model registry region when resolving download SAS URIs (with configurable fallback disable).

Reviewed changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
sdk_v2/cpp/tools/catalog_snapshot/main.cc Removed the snapshot generation tool entrypoint (static catalog pipeline retired).
sdk_v2/cpp/tools/catalog_snapshot/CMakeLists.txt Removed CMake target for the snapshot tool.
sdk_v2/cpp/test/sdk_api/shared_test_env.h Test env now always points at the live Azure catalog URL (no static path).
sdk_v2/cpp/test/internal_api/static_catalog_test.cc Removed tests that exercised the embedded static catalog client.
sdk_v2/cpp/test/internal_api/region_fallback_test.cc Added unit tests for region fallback chain building, retryability, sticky behavior, and exhaustion.
sdk_v2/cpp/test/internal_api/model_info_test.cc Added JSON round-trip coverage for ModelInfo.detected_region.
sdk_v2/cpp/test/internal_api/download_test.cc Updated registry client tests for status-aware HTTP + fallback; added download region-resolution tests.
sdk_v2/cpp/test/internal_api/catalog_cache_test.cc Added cache save/load coverage for detected_region.
sdk_v2/cpp/test/internal_api/c_api_test.cc Relaxed catalog population assertion to avoid network-dependent expectations.
sdk_v2/cpp/test/internal_api/azure_catalog_test.cc Expanded live catalog client tests to cover region detection, URL rewriting, stamping, and fallback behavior.
sdk_v2/cpp/test/CMakeLists.txt Updated unit test sources: added region_fallback_test, removed static_catalog_test, always includes azure_catalog_test.
sdk_v2/cpp/src/util/region_fallback.h New public/internal utility interface for region-aware retry/fallback execution.
sdk_v2/cpp/src/util/region_fallback.cc New implementation with proximal-region mapping, retryable classification, sticky region tracking, and diagnostics.
sdk_v2/cpp/src/model_info.h Added detected_region field to model metadata.
sdk_v2/cpp/src/model_info.cc Implemented JSON serialization/deserialization for detected_region (omitted when empty).
sdk_v2/cpp/src/manager.cc Defaulted catalog region to auto and plumbed DisableRegionFallback into download + catalog components.
sdk_v2/cpp/src/http/http_client.h Introduced HttpResponse and non-throwing Http*WithResponse APIs (status + headers + body).
sdk_v2/cpp/src/http/http_client.cc Implemented response-aware GET/POST (including header capture + transport failure as status==0).
sdk_v2/cpp/src/download/model_registry_client.h Updated registry client API to accept per-call region and use response-aware HTTP + fallback.
sdk_v2/cpp/src/download/model_registry_client.cc Implemented region-fallback-driven registry resolution and parsing of response bodies.
sdk_v2/cpp/src/download/download_manager.h Added config-region tracking and disable_region_fallback plumbing.
sdk_v2/cpp/src/download/download_manager.cc Resolved registry region via explicit config → detected region → eastus; wired fallback-enabled registry client.
sdk_v2/cpp/src/catalog/static_catalog_client.cc Removed embedded snapshot-backed catalog implementation.
sdk_v2/cpp/src/catalog/live_catalog_client_stub.cc Removed stub that previously threw when live client sources were absent.
sdk_v2/cpp/src/catalog/catalog_snapshot_data.h Removed snapshot data header (embedded bytes no longer part of build).
sdk_v2/cpp/src/catalog/catalog_client.h Updated contract/docs: MakeCatalogClient now always constructs live Azure client and accepts region/fallback controls.
sdk_v2/cpp/src/catalog/catalog_client.cc Retained cached-model + BYO synthesis helper; removed static-vs-live dispatch logic.
sdk_v2/cpp/src/catalog/azure_model_catalog.h Default catalog URL is now always Azure; added region + fallback configuration fields.
sdk_v2/cpp/src/catalog/azure_model_catalog.cc Plumbed region/fallback into MakeCatalogClient and persisted fetched model infos (incl. region) to cache.
sdk_v2/cpp/src/catalog/azure_catalog_client.h New live Azure catalog client interface with region detection and fallback-aware fetching.
sdk_v2/cpp/src/catalog/azure_catalog_client.cc New live Azure catalog implementation: request shaping, region detection, URL rewriting, filter-set atomicity, and fallback.
sdk_v2/cpp/CMakeLists.txt Removed tools option + static snapshot wiring; always builds live catalog + region fallback sources.

Comment on lines +80 to +83
if (response.status == 0) {
FL_THROW(FOUNDRY_LOCAL_ERROR_INTERNAL,
"transport failure from model registry API for asset_id: " + asset_id);
}
Comment on lines +85 to +89
if (response.status < 200 || response.status >= 300) {
FL_THROW(FOUNDRY_LOCAL_ERROR_INTERNAL,
"HTTP " + std::to_string(response.status) + " from model registry API for asset_id: " +
asset_id);
}
Comment on lines +316 to +320
const std::string url = regional && pinned_region
? BuildRegionalUrl(url_prefix_, url_suffix_, *pinned_region)
: BuildRequestUrl(base_url_, regional, region_, url_prefix_, url_suffix_);
FL_THROW(FOUNDRY_LOCAL_ERROR_INTERNAL,
"HTTP " + std::to_string(response.status) + " from catalog " + url);
Comment on lines +124 to +126
// Default RNG: per-instance, seeded from a clock. Jitter need not be strong.
auto rng = std::make_shared<std::mt19937_64>(
static_cast<uint64_t>(std::random_device{}()));
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants