chore(docs): provide clear guidance on DPU configuration for site-controller nodes#3048
spydaNVIDIA wants to merge 1 commit into
Summary by CodeRabbit
Walkthrough
This change updates the site controller DPU requirements across the quick-start and hardware prerequisite documentation. It now requires fully provisioned BlueField-3 DPUs, DPU mode only, and the listed firmware, optics, connectivity, and download details.

Changes
Site controller DPU requirements docs
🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 5 passed
🔍 Container Scan Summary: No Grype artifacts were found to aggregate.
🌿 Preview your docs: https://nvidia-preview-pull-request-3048.docs.buildwithfern.com/infra-controller
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/getting-started/prerequisites/hardware.md`:
- Line 31: The connectivity example in the hardware prerequisites doc uses curl
with -k, which bypasses TLS verification; update the example in the connectivity
check to use plain curl so it actually validates the trust path. Keep the
guidance aligned with the surrounding prerequisites content and adjust the
example text in the relevant markdown section so it remains realistic and safe
for operators.
📒 Files selected for processing (2)
- docs/getting-started/prerequisites/hardware.md
- docs/getting-started/quick-start.md
✅ Files skipped from review due to trivial changes (1)
- docs/getting-started/quick-start.md
polarweasel left a comment
Minor stuff, including arguing with Coderabbit 😁
Today, site-controller nodes must have Bluefield-3 DPUs. Ensure the following requirements are met:
- Verify the correct DPU power cable has been ordered from the server vendor.
- The Bluefield-3's operating mode is DPU mode (not NIC mode). Today, NIC mode is not supported.
Suggested change (before):
Today, site-controller nodes must have Bluefield-3 DPUs. Ensure the following requirements are met:
- Verify the correct DPU power cable has been ordered from the server vendor.
- The Bluefield-3's operating mode is DPU mode (not NIC mode). Today, NIC mode is not supported.
Suggested change (after):
Site-controller nodes must have Bluefield-3 DPUs. Ensure the following requirements are met:
- You have the correct DPU power cable from the server vendor.
- The Bluefield-3's operating mode is DPU mode. NIC mode is not supported.
- The Bluefield-3's operating mode is DPU mode (not NIC mode). Today, NIC mode is not supported.
- For BF3 DPUs, verify link speed and optics: BF3 runs at 200 Gb, so match ports to 200 Gb-capable optics, fiber, or DACs.
- A basic onboard NIC for management is sufficient--no extra ConnectX NICs are needed.
- Verify that the DPU can connect to the outside world (curl -k https://www.google.com)
Use curl -I to get a quick connection report (just the HTTP status code and some headers). Also... why not ping NVIDIA instead of Google?
Suggested change (before):
- Verify that the DPU can connect to the outside world (curl -k https://www.google.com)
Suggested change (after):
- Verify that the DPU can connect to the outside world (curl -I https://www.nvidia.com)
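To make the difference between the two review positions concrete, here is a minimal sketch of a connectivity check that sends a HEAD request (`-I`) and leaves TLS verification on (no `-k`), so a broken trust path shows up as a distinct failure instead of being silently bypassed. The helper names `classify_curl_exit` and `check_egress` are hypothetical and not part of this PR; the exit codes are the ones documented in the curl man page, and the 10-second timeout is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not from the PR under review.

# Map a curl exit status to a short diagnosis.
# Per the curl man page: 6 = could not resolve host, 7 = failed to connect,
# 28 = operation timed out, 60 = peer certificate could not be verified.
classify_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "dns-failure" ;;
    7)  echo "connect-failure" ;;
    28) echo "timeout" ;;
    60) echo "tls-verification-failure" ;;
    *)  echo "other-error" ;;
  esac
}

# Send a HEAD request without -k, so certificate problems are reported.
# The default URL follows the reviewer's suggestion; the timeout is assumed.
check_egress() {
  local url="${1:-https://www.nvidia.com}"
  local rc=0
  curl -sS -I --max-time 10 -o /dev/null "$url" || rc=$?
  echo "$url: $(classify_curl_exit "$rc")"
  return "$rc"
}
```

Running `check_egress` on the DPU's Arm OS would print one diagnosis line and return curl's exit status, which makes it usable in a provisioning script as well as interactively.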
- For BF3 DPUs, verify link speed and optics: BF3 runs at 200 Gb, so match ports to 200 Gb-capable optics, fiber, or DACs.
- A basic onboard NIC for management is sufficient--no extra ConnectX NICs are needed.
- Verify that the DPU can connect to the outside world (curl -k https://www.google.com)
- The DPUs are at the latest supported firmware version: DOCA 2.9.3 and HBN 2.4.3
Do you want to include these numbers here and have to maintain them, or include links to their release pages instead?
- Flash the DPU firmware to the latest supported version using the BlueField Firmware Bundle. Latest supported firmware versions:

  | DOCA | HBN |
  | ----- | ----- |
  | 2.9.3 | 2.4.3 |
Same question as previous file: if you include version numbers, you have to maintain them, vs linking to release pages and letting the reader figure out what they need to do. UNLESS...do we not support the very latest releases?
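If the answer is that only specific releases are supported, an operator still needs a way to compare what is installed against the supported version, wherever that version ends up being documented. Below is a minimal sketch of such a comparison. It assumes GNU `sort` with the `-V` (version sort) option is available, and `version_at_least` is a hypothetical helper name. How the installed DOCA or HBN version is read from the DPU is not shown in this PR, so both versions are passed in as arguments.

```shell
#!/usr/bin/env bash
# Sketch only: version_at_least is a hypothetical helper, not from the PR.

# version_at_least INSTALLED REQUIRED
# Succeeds when INSTALLED is the same as or newer than REQUIRED, using
# version ordering from GNU sort -V (so 2.10.0 sorts after 2.9.3).
version_at_least() {
  # If REQUIRED sorts first (or the two are equal), INSTALLED is new enough.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_at_least "$installed_doca" 2.9.3` succeeds for 2.9.3 or anything newer, where `installed_doca` is whatever value the operator obtained from the device. A "newer is fine" check like this only fits if later releases are in fact supported, which is the open question in this thread.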
  | ----- | ----- |
  | 2.9.3 | 2.4.3 |

- Configure the Bluefield-3 device in DPU mode (operating mode). We do not support having DPUs in NIC mode today.
Suggested change (before):
- Configure the Bluefield-3 device in DPU mode (operating mode). We do not support having DPUs in NIC mode today.
Suggested change (after):
- Configure the Bluefield-3 device in DPU mode (operating mode). We do not currently support NIC mode.
- Configure the Bluefield-3 device in DPU mode (operating mode). We do not support having DPUs in NIC mode today.
- Ensure the DPU ARM OS is booted and reachable via its management interface.
- Verify that the DPU can connect to the outside world (curl -k https://www.google.com)
Same as previous file...
Suggested change (before):
- Verify that the DPU can connect to the outside world (curl -k https://www.google.com)
Suggested change (after):
- Verify that the DPU can connect to the outside world (curl -I https://www.nvidia.com)
Refer to the NVIDIA DOCA documentation and the BlueField Firmware Bundle download archive for firmware flashing instructions and supported firmware versions:

[https://developer.nvidia.com/doca-2-9-2-lts-ovs-doca-download-archive?deployment_platform=BlueField&deployment_package=BF-FW-Bundle](https://developer.nvidia.com/doca-2-9-2-lts-ovs-doca-download-archive?deployment_platform=BlueField&deployment_package=BF-FW-Bundle)
[https://developer.nvidia.com/doca-2-9-3-download-archive?deployment_platform=BlueField&deployment_package=BF-FW-Bundle](https://developer.nvidia.com/doca-2-9-3-download-archive?deployment_platform=BlueField&deployment_package=BF-FW-Bundle)
Same question, really. :) Can we just link to the docs and the release page(s) directly, instead of having to increment here every time these get updated?
chore(docs): provide clear guidance on DPU configuration for site-controller nodes
Related issues
#2992