feat(PF-4273): add eval for pf-unit-test-generator skill #115

Open
jpuzz0 wants to merge 1 commit into patternfly:main from jpuzz0:PF-4273-unit-test-generator-eval

Conversation

@jpuzz0 jpuzz0 commented Jun 15, 2026

Collaborator

Summary

First eval using agent-eval-harness. Tests the skill's unique value: detecting consumer vs library context and making the right mocking decisions.

  • 3 test cases (consumer, library, library orchestration exception)
  • 4 inline pass/fail checks, all passing at 100%

Run with: /eval-run --config eval/pf-unit-test-generator/eval.yaml

Scoring results

has_test_content:               pass_rate=100.0%
consumer_no_child_mocking:      pass_rate=100.0%
library_mocks_children:         pass_rate=100.0%
orchestration_no_child_mocking: pass_rate=100.0%
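
The four checks above could be approximated with simple pattern tests over the generated test file. The following is a minimal sketch, not the PR's actual judges: the regular expressions, the `checks` object, and the sample strings (including the `TooltipContent` child) are all assumptions made for illustration, and the real patterns in eval.yaml may differ.

```typescript
// Hypothetical sketch: the actual judge patterns live in eval.yaml and may differ.

// Matches jest.mock('./Child') or vi.mock("../Child"), i.e. a mock of a relative import.
const RELATIVE_MOCK = /\b(?:jest|vi)\.mock\(\s*['"]\.{1,2}\//;

const checks = {
  // The output should contain at least one describe/it/test call.
  hasTestContent: (src: string): boolean => /\b(?:describe|it|test)\s*\(/.test(src),
  // Consumer context: child components are rendered, not mocked.
  consumerNoChildMocking: (src: string): boolean => !RELATIVE_MOCK.test(src),
  // Library context, default rule: child components are mocked.
  libraryMocksChildren: (src: string): boolean => RELATIVE_MOCK.test(src),
  // Library context, orchestration exception: children stay unmocked.
  orchestrationNoChildMocking: (src: string): boolean => !RELATIVE_MOCK.test(src),
};

// Illustrative generated output for a consumer component (no child mocks).
const consumerTest = `
import { render, screen } from '@testing-library/react';
import { ServiceCard } from './ServiceCard';

test('renders the service name', () => {
  render(<ServiceCard name="api" />);
  expect(screen.getByText('api')).toBeVisible();
});
`;

// Illustrative generated output for a library component (child is mocked).
const libraryTest = `
jest.mock('./TooltipContent', () => ({ TooltipContent: () => null }));

test('shows content on hover', () => {});
`;

console.log(checks.consumerNoChildMocking(consumerTest)); // true
console.log(checks.libraryMocksChildren(libraryTest)); // true
```

A pattern check like this only detects whether a relative-path mock appears anywhere in the file; it says nothing about whether the tests themselves are meaningful.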

Summary by CodeRabbit

  • New Features

    • Added ServiceCard, Tooltip, and Wizard React components (with prop interfaces, status/last-updated rendering, tooltip show/hide interactions, and multi-step navigation with save/back behavior).
    • Added a new unit-test generator evaluation job with dataset-driven cases and automated judging rules.
  • Chores

    • Updated configuration to define new evaluation contexts/rules and to supply inputs/fixtures for added cases.
    • Updated .gitignore to ignore evaluation run output.

coderabbitai Bot commented Jun 15, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8a0f1163-7ebf-40f8-a1e3-2cde3cf2447a

📥 Commits

Reviewing files that changed from the base of the PR and between d437ddb and 72b0911.

📒 Files selected for processing (11)
  • .gitignore
  • eval/pf-unit-test-generator/cases/consumer-with-children/ServiceCard.tsx
  • eval/pf-unit-test-generator/cases/consumer-with-children/annotations.yaml
  • eval/pf-unit-test-generator/cases/consumer-with-children/input.yaml
  • eval/pf-unit-test-generator/cases/library-tooltip/Tooltip.tsx
  • eval/pf-unit-test-generator/cases/library-tooltip/annotations.yaml
  • eval/pf-unit-test-generator/cases/library-tooltip/input.yaml
  • eval/pf-unit-test-generator/cases/library-wizard-orchestration/Wizard.tsx
  • eval/pf-unit-test-generator/cases/library-wizard-orchestration/annotations.yaml
  • eval/pf-unit-test-generator/cases/library-wizard-orchestration/input.yaml
  • eval/pf-unit-test-generator/eval.yaml
✅ Files skipped from review due to trivial changes (4)
  • eval/pf-unit-test-generator/cases/library-wizard-orchestration/annotations.yaml
  • .gitignore
  • eval/pf-unit-test-generator/cases/consumer-with-children/annotations.yaml
  • eval/pf-unit-test-generator/cases/library-tooltip/annotations.yaml
🚧 Files skipped from review as they are similar to previous changes (7)
  • eval/pf-unit-test-generator/cases/consumer-with-children/input.yaml
  • eval/pf-unit-test-generator/cases/consumer-with-children/ServiceCard.tsx
  • eval/pf-unit-test-generator/cases/library-wizard-orchestration/Wizard.tsx
  • eval/pf-unit-test-generator/cases/library-tooltip/Tooltip.tsx
  • eval/pf-unit-test-generator/cases/library-wizard-orchestration/input.yaml
  • eval/pf-unit-test-generator/cases/library-tooltip/input.yaml
  • eval/pf-unit-test-generator/eval.yaml

📝 Walkthrough

Adds a pf-unit-test-generator evaluation suite consisting of three fixture React components (ServiceCard, Tooltip, Wizard) each paired with input.yaml and annotations.yaml metadata, plus a central eval.yaml defining a claude-code runner, four regex-based judges, and strict 1.0 pass-rate thresholds. The eval/runs/ directory is added to .gitignore.

Changes

pf-unit-test-generator Eval Suite

Layer / File(s) and Summary

Eval case fixtures and metadata
  • File(s): eval/pf-unit-test-generator/cases/consumer-with-children/ServiceCard.tsx, eval/pf-unit-test-generator/cases/consumer-with-children/annotations.yaml, eval/pf-unit-test-generator/cases/consumer-with-children/input.yaml, eval/pf-unit-test-generator/cases/library-tooltip/Tooltip.tsx, eval/pf-unit-test-generator/cases/library-tooltip/annotations.yaml, eval/pf-unit-test-generator/cases/library-tooltip/input.yaml, eval/pf-unit-test-generator/cases/library-wizard-orchestration/Wizard.tsx, eval/pf-unit-test-generator/cases/library-wizard-orchestration/annotations.yaml, eval/pf-unit-test-generator/cases/library-wizard-orchestration/input.yaml
  • Summary: Adds three fixture components: ServiceCard (consumer context) with StatusIcon and LastUpdated child imports, Tooltip (library/default context) with hover and focus state management, and Wizard (library/orchestration-exception context) with step navigation. Each includes input.yaml with test generation prompts and annotations.yaml with context and rule labels that gate judge execution.

eval.yaml: runner, judges, and thresholds
  • File(s): eval/pf-unit-test-generator/eval.yaml
  • Summary: Defines the pf-unit-test-generator-eval job using claude-code runner with claude-sonnet-4-6 for generation and judging. Configures four regex-based judges: has_test_content (all cases), consumer_no_child_mocking (consumer context), library_mocks_children (library/default rule), and orchestration_no_child_mocking (orchestration-exception rule). All judges set min_pass_rate: 1.0.

.gitignore update
  • File(s): .gitignore
  • Summary: Adds eval/runs/ under a new # Eval run output comment to exclude generated evaluation run artifacts from version control.
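
The summary above says annotations gate which judges run and that every judge requires a pass rate of 1.0. A rough sketch of that gating and threshold logic follows. The `Annotations`, `EvalCase`, and `Judge` shapes, the `scoreJudge` helper, and the sample outputs are hypothetical, since the harness's real schema is not shown in this PR. The handling of a judge with zero applicable cases is also an assumption.

```typescript
// Hypothetical shapes: the harness's real schema is not shown in this PR.
type Annotations = {
  context: "consumer" | "library";
  rule: "default" | "orchestration-exception";
};

interface EvalCase {
  id: string;
  annotations: Annotations;
  output: string; // generated test source
}

interface Judge {
  name: string;
  appliesTo: (a: Annotations) => boolean; // gate derived from annotations
  passes: (output: string) => boolean;
  minPassRate: number;
}

function scoreJudge(judge: Judge, cases: EvalCase[]) {
  const applicable = cases.filter((c) => judge.appliesTo(c.annotations));
  const passed = applicable.filter((c) => judge.passes(c.output)).length;
  // Assumption: a judge with no applicable cases is treated as vacuously passing.
  const passRate = applicable.length === 0 ? 1 : passed / applicable.length;
  return { name: judge.name, evaluated: applicable.length, passRate, ok: passRate >= judge.minPassRate };
}

const mocksChild = (output: string): boolean => /\bjest\.mock\(/.test(output);

const cases: EvalCase[] = [
  { id: "consumer-with-children", annotations: { context: "consumer", rule: "default" }, output: "render(<ServiceCard />)" },
  { id: "library-tooltip", annotations: { context: "library", rule: "default" }, output: "jest.mock('./Child')" },
  { id: "library-wizard-orchestration", annotations: { context: "library", rule: "orchestration-exception" }, output: "render(<Wizard />)" },
];

const libraryMocksChildren: Judge = {
  name: "library_mocks_children",
  appliesTo: (a) => a.context === "library" && a.rule === "default",
  passes: mocksChild,
  minPassRate: 1.0,
};

const orchestrationNoChildMocking: Judge = {
  name: "orchestration_no_child_mocking",
  appliesTo: (a) => a.rule === "orchestration-exception",
  passes: (output) => !mocksChild(output),
  minPassRate: 1.0,
};

console.log(scoreJudge(libraryMocksChildren, cases));
console.log(scoreJudge(orchestrationNoChildMocking, cases));
```

With three cases and per-context gating, each gated judge in this sketch evaluates a single case, so a 1.0 threshold means any one failure fails the judge.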

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

  • Write eval for pf-unit-test-generator #107: This PR directly implements the requirements described in the issue by adding eval.yaml with the three specified test cases (consumer-with-children, library-tooltip, library-wizard-orchestration), their fixture files, and the four automated judge checks.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title clearly and specifically summarizes the main change: adding an evaluation for the pf-unit-test-generator skill using the eval framework.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

Warning

Review ran into problems

🔥 Problems

Linked repositories: Your configuration references 1 linked repositories, but your current plan allows 0. Analyzed ``, skipped anthropics/claude-plugins-official.


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
eval/pf-unit-test-generator/cases/library-tooltip/Tooltip.tsx (1)

1-1: ⚡ Quick win

Use kebab-case for this fixture filename.

Tooltip.tsx violates the repository naming rule. Rename it to tooltip.tsx and update case references accordingly to keep fixture linkage intact.
As per coding guidelines, "Use kebab-case for directory and file names".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@eval/pf-unit-test-generator/cases/library-tooltip/Tooltip.tsx` at line 1,
Rename the file from Tooltip.tsx to tooltip.tsx to comply with the kebab-case
naming convention for fixture files. After renaming, search the codebase for any
references to the Tooltip.tsx filename (including import statements, path
references, and fixture linkage configurations) and update them to use the new
tooltip.tsx filename to maintain proper fixture linkage and consistency.

Source: Coding guidelines

eval/pf-unit-test-generator/cases/library-wizard-orchestration/Wizard.tsx (1)

1-1: ⚡ Quick win

Use kebab-case for this fixture filename.

Wizard.tsx violates the repository naming rule. Rename it to wizard.tsx and update case references so the fixture mapping remains correct.
As per coding guidelines, "Use kebab-case for directory and file names".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@eval/pf-unit-test-generator/cases/library-wizard-orchestration/Wizard.tsx` at
line 1, The file Wizard.tsx uses PascalCase instead of the required kebab-case
naming convention. Rename the file from Wizard.tsx to wizard.tsx and update all
imports and references throughout the codebase that point to this file to use
the new lowercase filename to maintain correct fixture mapping.

Source: Coding guidelines
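
The renames requested in the two nitpicks above follow a mechanical rule. A small sketch of that rule is shown below. The helper names `toKebabCaseFileName` and `isKebabCaseFileName` are hypothetical and are not part of the repository or of CodeRabbit.

```typescript
// Hypothetical helpers illustrating the kebab-case file naming rule.

// Converts a PascalCase or camelCase base name to kebab-case, keeping the extension.
function toKebabCaseFileName(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  const base = dot > 0 ? fileName.slice(0, dot) : fileName;
  const ext = dot > 0 ? fileName.slice(dot) : "";
  const kebab = base
    .replace(/([a-z0-9])([A-Z])/g, "$1-$2") // split "eC" in "ServiceCard"
    .replace(/([A-Z]+)([A-Z][a-z])/g, "$1-$2") // split acronym runs from the next word
    .toLowerCase();
  return kebab + ext;
}

// True when the name is lowercase words joined by hyphens, with optional extensions.
function isKebabCaseFileName(fileName: string): boolean {
  return /^[a-z0-9]+(?:-[a-z0-9]+)*(?:\.[a-z0-9]+)*$/.test(fileName);
}

console.log(toKebabCaseFileName("Tooltip.tsx")); // tooltip.tsx
console.log(toKebabCaseFileName("Wizard.tsx")); // wizard.tsx
console.log(toKebabCaseFileName("ServiceCard.tsx")); // service-card.tsx
console.log(isKebabCaseFileName("Tooltip.tsx")); // false
```

Under the same rule, ServiceCard.tsx would map to service-card.tsx, although the review above does not flag that file.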

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7760b41e-4fc9-4554-ac3b-ff1c094aa107

📥 Commits

Reviewing files that changed from the base of the PR and between bd8becd and d437ddb.


Add first eval using agent-eval-harness to verify the skill's
consumer vs library context detection (mocking decisions).
3 test cases, 4 inline check judges, all passing at 100%.
jpuzz0 force-pushed the PF-4273-unit-test-generator-eval branch from d437ddb to 72b0911 on June 16, 2026 16:45
jpuzz0 requested review from a team and dlabaj, and removed the request for a team, on June 16, 2026 17:32