fix: align mdata arginfo with runtime machine sections#1889
fix: align mdata arginfo with runtime machine sections#1889SchrodingersCattt wants to merge 2 commits into
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1889 +/- ##
==========================================
+ Coverage 49.80% 49.82% +0.02%
==========================================
Files 83 83
Lines 14986 14992 +6
==========================================
+ Hits 7464 7470 +6
Misses 7522 7522 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughAdds a private helper Changesmdata task config validation and tests
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related issues
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
dpgen/arginfo.py (1)
4-5: ⚡ Quick winUse a NumPy-style docstring for the new helper.
As per coding guidelines,
dpgen/**/*.py: "Use Numpy-style docstrings for functions and classes".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpgen/arginfo.py` around lines 4 - 5, The docstring for the `_is_mdata_task_config` function uses a simple Google-style format instead of the NumPy-style format required by the coding guidelines. Convert the docstring to NumPy-style format by adding properly structured sections including a Summary section, Parameters section (documenting the `value` parameter and its type), and Returns section (documenting the return type as bool and what the function returns).Source: Coding guidelines
tests/test_mdata_arginfo.py (1)
53-63: ⚡ Quick winAdd a regression test for empty-list task sections.
Given the new validator logic, please add a case like
data["train"] = []and assert rejection. This locks behavior to the runtime contract and prevents silent reintroduction of the empty-list gap.As per coding guidelines,
tests/**/*.py: "Add corresponding unit tests intests/directory for new features".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_mdata_arginfo.py` around lines 53 - 63, Add a new test method following the existing patterns of test_rejects_list_with_non_dict_item and test_rejects_scalar_section that verifies empty-list task sections are properly rejected. Create a test that initializes _mdata, sets data["train"] = [] (an empty list), and asserts that calling normalize(run_mdata_arginfo(), data, strict_check=False) raises the appropriate error exception. This regression test ensures the validator prevents empty task sections from being silently accepted.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@dpgen/arginfo.py`:
- Around line 8-10: The validation logic for list values currently accepts empty
lists because all(isinstance(item, dict) for item in value) returns True for
empty iterables, but this causes downstream crashes when convert_mdata() tries
to access index 0 on an empty list. Modify the condition in the
isinstance(value, list) branch to additionally check that the list is not empty
before validating that all items are dictionaries, so that empty lists are
rejected as invalid during validation.
---
Nitpick comments:
In `@dpgen/arginfo.py`:
- Around line 4-5: The docstring for the `_is_mdata_task_config` function uses a
simple Google-style format instead of the NumPy-style format required by the
coding guidelines. Convert the docstring to NumPy-style format by adding
properly structured sections including a Summary section, Parameters section
(documenting the `value` parameter and its type), and Returns section
(documenting the return type as bool and what the function returns).
In `@tests/test_mdata_arginfo.py`:
- Around line 53-63: Add a new test method following the existing patterns of
test_rejects_list_with_non_dict_item and test_rejects_scalar_section that
verifies empty-list task sections are properly rejected. Create a test that
initializes _mdata, sets data["train"] = [] (an empty list), and asserts that
calling normalize(run_mdata_arginfo(), data, strict_check=False) raises the
appropriate error exception. This regression test ensures the validator prevents
empty task sections from being silently accepted.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9164ecdc-16c8-463b-a2bc-8b7d7d6bfbe3
📒 Files selected for processing (2)
dpgen/arginfo.pytests/test_mdata_arginfo.py
njzjz
left a comment
There was a problem hiding this comment.
The list form has been deprecated and is just kept for compatibility. I don't think it is necessary to tell external tools that a deprecated usage is supported.
|
Thanks for the clarification. It sounds like the list form is intentionally kept only for backward compatibility, while |
Summary
general_mdata_arginfo()withconvert_mdata()runtime semantics for machine task sections.train,model_devi, andfpto be either a dict or a list of dicts.Related
Related to #1888.
Why
dpgen runaccepts list-form machine sections throughconvert_mdata():But
run_mdata_arginfo()previously exposed these sections as dict-only, causing external validators/tools to falsely reject machine files that runtime accepts.Validation
python -m pytest tests/test_mdata_arginfo.py tests/test_check_examples.py tests/tools/test_convert_mdata.py -qpython -m ruff check dpgen/arginfo.py tests/test_mdata_arginfo.pySummary by CodeRabbit
Bug Fixes
dictobjects or as non-empty lists ofdictobjects.Tests