feat: update model registry and implement utility scripts for listing and updating benchmarks #1

Open
sarwar616816 wants to merge 3 commits into PromptEngineer48:main from sarwar616816:main

@sarwar616816 sarwar616816 commented May 13, 2026

Both list_models.py and update_models.py are helper tools designed to manage the list of models you are benchmarking. Here is how and when to use them:

1. list_models.py

Purpose: A diagnostic and discovery tool to see what models are available to your API key.

  • Use Cases:
    • Connectivity Check: Quickly verify that your NVIDIA_API_KEY is valid and the API is reachable.
    • Discovery: Check if a specific model (e.g., a new Llama or Mistral version) has been added to NVIDIA's catalog before it appears in your benchmark list.
    • Raw Inspection: View the exact id strings used by NVIDIA without modifying any files.
  • When to run:
    • When you are troubleshooting API issues.
    • When you want the raw list of every available model, including non-chat models such as embeddings, for reference.
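
As a rough illustration of what a listing script like this does, here is a minimal sketch using only the standard library. It is not the code from this PR. The endpoint URL, the OpenAI-style `{"data": [...]}` response shape, and the helper names `fetch_models` and `extract_model_ids` are assumptions, and the sample ids are made up.

```python
import json
import urllib.request

# Assumption: NVIDIA's OpenAI-compatible endpoint; check it against the URL
# that the PR's list_models.py actually uses.
MODELS_URL = "https://integrate.api.nvidia.com/v1/models"


def fetch_models(api_key: str, url: str = MODELS_URL, timeout: float = 30.0) -> dict:
    """GET the model catalog; urllib raises HTTPError on an HTTP error status."""
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def extract_model_ids(payload: dict) -> list[str]:
    """Return the sorted `id` strings from an OpenAI-style {"data": [...]} payload."""
    return sorted(entry["id"] for entry in payload.get("data", []) if "id" in entry)


# Offline demonstration with a made-up payload (the ids are illustrative only).
sample_payload = {
    "object": "list",
    "data": [
        {"id": "vendor-b/chat-model", "object": "model"},
        {"id": "vendor-a/embedding-model", "object": "model"},
    ],
}
for model_id in extract_model_ids(sample_payload):
    print(model_id)
```

For a live check, something like `extract_model_ids(fetch_models(os.environ["NVIDIA_API_KEY"]))` would apply, after `import os`. A raised `HTTPError` or `URLError` then points to a key or connectivity problem.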

2. update_models.py

Purpose: A maintenance tool that automatically synchronizes your local models.json with the latest available chat models.

  • Use Cases:
    • Automated Sync: Instead of typing model IDs and labels into models.json by hand, this script fetches them, filters out non-chat models (such as embeddings), generates readable labels from the IDs, and rewrites the file for you.
    • Project Maintenance: Keeps your benchmark suite up-to-date as NVIDIA adds or retires models from their platform.
  • When to run:
    • Before a fresh benchmark: Run this before running python benchmark.py --restart to ensure you are testing the most current set of models.
    • After NVIDIA updates: If you hear that a new model (like "Llama 4" or "Gemma 3") has been released on NVIDIA NIM, run this to pull it into your project automatically.
    • Consistency: Run this if your models.json may be out of date or if you want all model labels to follow one naming convention.
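
The filter, label, and overwrite steps described above could look roughly like the sketch below. It is not the PR's implementation. The `{"id", "label"}` entry layout for models.json, the substring-based chat filter, the label convention, and every function name here are assumptions for illustration.

```python
import json
from pathlib import Path

# Assumption: substrings that mark non-chat models; the PR's real filter may differ.
NON_CHAT_MARKERS = ("embed", "rerank")


def is_chat_model(model_id: str) -> bool:
    """Treat a model as chat-capable unless its id contains a non-chat marker."""
    lowered = model_id.lower()
    return not any(marker in lowered for marker in NON_CHAT_MARKERS)


def make_label(model_id: str) -> str:
    """Turn 'vendor/some-model-8b' into 'Some Model 8B' (hypothetical convention)."""
    name = model_id.split("/")[-1]
    words = name.replace("_", "-").split("-")
    return " ".join(w.upper() if w[:1].isdigit() else w.capitalize() for w in words if w)


def build_registry(model_ids: list[str]) -> list[dict]:
    """Deduplicate, sort, drop non-chat ids, and attach a label to each entry."""
    return [
        {"id": model_id, "label": make_label(model_id)}
        for model_id in sorted(set(model_ids))
        if is_chat_model(model_id)
    ]


def write_registry(entries: list[dict], path: Path) -> None:
    """Overwrite the file at `path` with the entries as indented JSON."""
    path.write_text(json.dumps(entries, indent=2) + "\n", encoding="utf-8")


# Offline demonstration with made-up ids; nothing is written to disk here.
sample_ids = ["vendor/some-model-8b", "vendor/text-embed-v1", "vendor/some-model-8b"]
print(json.dumps(build_registry(sample_ids), indent=2))
```

Because a script of this kind overwrites models.json, any hand-edited labels would be lost on each run. Committing or backing up the file first is a sensible precaution.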

Summary Table

| Tool | Primary Action | Target File | Recommended Frequency |
|---|---|---|---|
| list_models.py | Print to terminal | None | As needed for discovery/debugging |
| update_models.py | Overwrite local list | models.json | Every 1-2 weeks or before new benchmarks |

Note: Always run both scripts with the interpreter from your virtual environment. On Windows, for example:

.\.venv\Scripts\python.exe update_models.py
