feat: add Carbide provider implementation #128

Draft
fabiendupont wants to merge 8 commits into NVIDIA:main from fabiendupont:feat/carbide-provider

@fabiendupont

Collaborator

Summary

Depends on #127

Carbide provider using carbidecli, covering 5 upstream templates:

| Template | Implementation |
| --- | --- |
| control-plane | SSH key + VPC lifecycle |
| network | VPC, subnet, prefix, NSG CRUD |
| image-registry | OperatingSystem + InstanceType CRUD |
| bm | Instance launch/describe/reboot/teardown |
| iam | Token validation, scope coverage, write access |
  • Full API from OpenAPI spec (24 resources)
  • CRUD library: CarbideResource class
  • Dynamic scope checking with pre-existing resource support
  • Test catalog entries for Carbide platforms

Test plan

  • All tests pass
  • Dry-run succeeds for all 5 configs
  • Live test on Carbide lab

🤖 Generated with Claude Code

fabiendupont and others added 8 commits March 10, 2026 12:03
Override configs can now merge into base check lists instead of replacing
them entirely, by including a `{__merge__: true}` marker. Matching checks
are deep-merged by key, new checks are appended, and `"__remove__"` drops
a check. Without the marker, lists are replaced as before (backward compat).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
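
A minimal sketch of the list-merge behavior described in the commit above, not the actual isvctl code. It assumes each check is a single-key mapping of check name to parameters and that a check is dropped by setting its value to `"__remove__"`; the function names and the check names (`ApiHealth`, `NodeCount`, `VpcExists`) are hypothetical.

```python
from copy import deepcopy

MERGE_MARKER = {"__merge__": True}
REMOVE = "__remove__"


def merge_params(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base (hypothetical helper)."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_params(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def merge_check_lists(base: list, override: list) -> list:
    """Merge override checks into base checks when the marker is present."""
    if MERGE_MARKER not in override:
        return deepcopy(override)  # no marker: the list is replaced, as before

    merged = deepcopy(base)
    for entry in override:
        if entry == MERGE_MARKER:
            continue
        (name, params), = entry.items()  # assumption: one check name per entry
        index = next((i for i, check in enumerate(merged) if name in check), None)
        if params == REMOVE:
            if index is not None:
                del merged[index]  # drop the inherited check
        elif index is None:
            merged.append(deepcopy(entry))  # new check: append
        else:
            existing = merged[index][name]
            if isinstance(existing, dict) and isinstance(params, dict):
                merged[index] = {name: merge_params(existing, params)}
            else:
                merged[index] = {name: deepcopy(params)}
    return merged


base_checks = [
    {"ApiHealth": {"timeout": 30, "retries": 3}},
    {"NodeCount": {"min": 1}},
]
override_checks = [
    {"__merge__": True},
    {"ApiHealth": {"timeout": 60}},
    {"NodeCount": "__remove__"},
    {"VpcExists": {"name": "demo"}},
]
merged_checks = merge_check_lists(base_checks, override_checks)
replaced_checks = merge_check_lists(base_checks, [{"VpcExists": {"name": "demo"}}])
```

With the marker, `ApiHealth` keeps its inherited `retries` and picks up the new `timeout`; without it, the override list simply replaces the base list.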
Keys set to "__remove__" are deleted during deep_merge, enabling
selective removal of inherited values in nested dicts (e.g., dropping
a single label from expected_labels without replacing the entire dict).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
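
To illustrate the nested-dict removal described in the commit above, here is a small standalone sketch. The `deep_merge` name comes from the commit message, but this body is an assumption about how it might behave, and the label values are made up.

```python
from copy import deepcopy

REMOVE = "__remove__"


def deep_merge(base: dict, override: dict) -> dict:
    """Return a merged copy of base; keys set to "__remove__" are deleted."""
    result = deepcopy(base)
    for key, value in override.items():
        if value == REMOVE:
            result.pop(key, None)  # selective removal of an inherited key
        elif isinstance(value, dict):
            current = result.get(key)
            # Recurse even when base has no dict here, so sentinels are never copied.
            result[key] = deep_merge(current if isinstance(current, dict) else {}, value)
        else:
            result[key] = deepcopy(value)
    return result


base = {"expected_labels": {"team": "infra", "env": "lab", "tier": "gold"}}
override = {"expected_labels": {"tier": "__remove__"}}
merged = deep_merge(base, override)
```

Only the `tier` label is dropped; the other inherited labels are left in place and `base` is not mutated.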
Templates now define WHAT to validate (checks with context variable
defaults), while provider configs define HOW to provision (commands
and stubs). This enables composable validation:

  isvctl test run -f templates/kaas.yaml -f aws/eks.yaml

Key changes:
- Remove commands block from all 7 templates
- Replace hardcoded values with {{ context.X | default('Y') }}
- Template stubs in stubs/ remain as copy-paste starting points
- Existing self-contained provider configs (aws/) keep working
- Update README with layered usage documentation

The merge engine combines tests from the template with commands
from the provider. Context variables flow through Jinja2 rendering
into validation parameters at runtime.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
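
The `{{ context.X | default('Y') }}` pattern from the commit above can be tried in isolation with the `jinja2` package. This is only a sketch of the rendering step, not of how isvctl wires it up; the `node_count` variable is a hypothetical example.

```python
from jinja2 import Environment

env = Environment()
# A template value with a fallback, in the style the commit describes.
template = env.from_string("{{ context.node_count | default('3') }}")

# A provider config that sets the context variable overrides the default...
with_override = template.render(context={"node_count": 5})
# ...while a missing variable falls back to the template's default.
with_default = template.render(context={})
```

Rendering always produces strings, so a consumer that needs an integer would still have to convert the result.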
Adds eks-layered.yaml that supplies only commands (Terraform stubs)
and context overrides, designed to pair with templates/kaas.yaml:

  isvctl test run \
    -f isvctl/configs/templates/kaas.yaml \
    -f isvctl/configs/aws/eks-layered.yaml

The existing self-contained eks.yaml is unchanged (backward compat).

Adds 6 integration tests verifying:
- Templates have no commands block
- Layered merge produces both commands and tests
- Context overrides flow through
- Standalone eks.yaml still works
- Layered and standalone have the same validation check names
- All 7 templates are validation-only

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Implements the control-plane template for the Carbide provider using
carbidecli. Maps template concepts to Carbide resources:
  - API health → tenant get, site list
  - Access keys → SSH key group + SSH key CRUD
  - Tenants → VPC CRUD

Includes:
  - carbide/control-plane.yaml: layered provider config
  - stubs/carbide/common/carbide.py: shared helper (run_carbide, state mgmt)
  - stubs/carbide/control-plane/: 10 stub scripts matching template steps

Usage:
  isvctl test run \
    -f isvctl/configs/templates/control-plane.yaml \
    -f isvctl/configs/carbide/control-plane.yaml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
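
A shared helper such as `run_carbide` could be a thin wrapper around `subprocess.run`. The sketch below is an assumption about its shape, not the code in `stubs/carbide/common/carbide.py`: the `binary` and `timeout` parameters are hypothetical, and it assumes the CLI prints JSON on stdout.

```python
import json
import subprocess
from typing import Any, Sequence


def run_carbide(args: Sequence[str], binary: str = "carbidecli", timeout: float = 60.0) -> Any:
    """Run the CLI with the given arguments and return its parsed JSON output."""
    completed = subprocess.run(
        [binary, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # inspect the return code ourselves for a clearer error
    )
    if completed.returncode != 0:
        raise RuntimeError(
            f"{binary} {' '.join(args)} failed "
            f"(exit {completed.returncode}): {completed.stderr.strip()}"
        )
    # Assumption: output is JSON; adjust the parsing if the real CLI differs.
    return json.loads(completed.stdout)
```

A stub would then call something like `run_carbide(["site", "list"])`, where the argument order is a guess based on the "site list" wording in the commit message. Keeping `binary` overridable makes the helper easy to exercise in tests without a Carbide endpoint.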
Implements 3 upstream templates for the Carbide provider:

Network (carbide/network.yaml + 8 stubs):
  VPC CRUD, subnet configuration, VPC isolation, NSG security rules,
  connectivity and traffic validation via carbidecli.

Image Registry (carbide/image-registry.yaml + 6 stubs):
  OperatingSystem CRUD, instance launch from OS image, install config
  lifecycle — validates Carbide's image management capabilities.

Bare Metal (carbide/bm.yaml + 7 stubs):
  Instance launch/describe/list/reboot/teardown via carbidecli.
  Reinstall is skipped (not supported). NIM deploy/teardown reuse
  the shared template stubs.

All providers use the layered approach:
  isvctl test run -f templates/<template>.yaml -f carbide/<provider>.yaml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
IAM provider (carbide/iam.yaml + 3 stubs):
  Validates API token, checks scope coverage for all templates,
  proves write access via temp SSH key lifecycle.

Full Carbide API surface (common/carbide.py):
  CARBIDE_API_RESOURCES maps all 24 resources from the OpenAPI spec
  (bare-metal-manager-rest) with their operations and scope names.

CRUD library (common/resources.py):
  Pre-configured CarbideResource instances for all resources:
  site, vpc, vpc-prefix, subnet, nsg, ipblock, allocation,
  instance, instance-type, machine, expected-machine, operating-system,
  infiniband-partition, nvlink-logical-partition, nvlink-interface,
  dpu-extension-service, sshkeygroup, sshkey, tenant, rack, tray,
  sku, audit.

Dynamic scope calculation:
  effective_scopes_for_template() reduces required scopes when
  pre-existing resources are set via CARBIDE_*_ID env vars.
  TEMPLATE_REQUIRED_SCOPES defines minimum scopes per template.
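
As an illustration of the scope reduction described above, the sketch below drops write scopes for resources supplied through `CARBIDE_*_ID` variables. The function and constant names follow the commit message, but the signature, the scope strings, and the `PREEXISTING_SCOPE_SAVINGS` table are hypothetical.

```python
import os
from typing import Mapping, Optional

# Hypothetical scope names, for illustration only.
TEMPLATE_REQUIRED_SCOPES = {
    "network": {"vpc:read", "vpc:write", "subnet:read", "subnet:write"},
}

# Hypothetical table: env var -> scopes no longer needed when it is set.
PREEXISTING_SCOPE_SAVINGS = {
    "CARBIDE_VPC_ID": {"vpc:write"},
    "CARBIDE_SUBNET_ID": {"subnet:write"},
}


def effective_scopes_for_template(
    template: str, env: Optional[Mapping[str, str]] = None
) -> set:
    """Return the scopes a template needs, given any pre-existing resources."""
    env = os.environ if env is None else env
    scopes = set(TEMPLATE_REQUIRED_SCOPES[template])
    for var, unneeded in PREEXISTING_SCOPE_SAVINGS.items():
        if env.get(var):  # resource already exists, so no need to create it
            scopes -= unneeded
    return scopes


full = effective_scopes_for_template("network", env={})
reduced = effective_scopes_for_template("network", env={"CARBIDE_VPC_ID": "vpc-123"})
```

Here `reduced` no longer contains `vpc:write`, so a token without that scope could still pass the coverage check for this template.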

Pre-existing resource support in create/teardown stubs:
  CARBIDE_VPC_ID, CARBIDE_VPC_PREFIX_ID, CARBIDE_SUBNET_ID,
  CARBIDE_SSH_KEY_GROUP_ID, CARBIDE_OS_ID, CARBIDE_INSTANCE_ID.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Register Carbide provider configs in PLATFORM_CONFIGS so test
coverage tracking knows which validations are used by the
Carbide provider (control-plane, network, image-registry, bm, iam).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
@copy-pr-bot

copy-pr-bot Bot commented Mar 10, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.
