Skip to content

test(core): add tests for untested core modules — artifacts, review, api_key_service (#654)#694

Merged
frankbria merged 3 commits into
mainfrom
test/654-untested-core-modules
Jun 17, 2026
Merged

test(core): add tests for untested core modules — artifacts, review, api_key_service (#654)#694
frankbria merged 3 commits into
mainfrom
test/654-untested-core-modules

Conversation

@frankbria

Copy link
Copy Markdown
Owner

Summary

Closes #654 (P6.8.1 — test coverage hardening).

Adds direct unit tests for three previously-untested core modules and expands coverage of a fourth. All new tests carry @pytest.mark.v2. No live API calls (MockProvider / real tmp git+SQLite only).

Module New test file Coverage
core/api_key_service.py tests/core/test_api_key_service.py (NEW) create / list / revoke / rotate / get — main paths + errors (invalid scopes, wrong owner, not-found, rotate-keeps-old-key-on-failure)
core/artifacts.py tests/core/test_artifacts.py (NEW) export_patch / create_commit / get_status / list_patches against a real git repo + workspace (events persisted); git-missing / not-a-repo / no-changes errors; pure parse helpers
core/review.py tests/core/test_review.py (NEW) status & severity thresholds; review_files findings/empty/skip/analyzer-error; review_task delegation; get_review_summary
core/prd_stress_test.py tests/core/test_prd_stress_test.py (EXPAND, +13 tests) resolve_ambiguities_into_prd, JSON-failure fallbacks, malformed-children filtering, sync exception propagation

Bug fixed (surfaced by the new tests)

artifacts.get_status() parsed git status --porcelain via result.stdout.strip(). The porcelain status field is two columns wide (" M file" for a worktree-only change), so strip() removed the leading space of the first line and shifted column parsing — misclassifying that file as staged and dropping its first filename character (hello.pyello.py). Switched to splitlines(). This path feeds cf status and the git_v2/proof scope surfaces.

Acceptance criteria

  • api_key_service.py, artifacts.py, review.py have direct tests covering main paths + error cases
  • New tests carry the @pytest.mark.v2 marker
  • (Fix section) prd_stress_test.py coverage expanded

Verification

  • New suite: 124 passed (test_api_key_service 21, test_artifacts 23, test_review 26, test_prd_stress_test 45 [32 existing + 13 new], test_git_review 9)
  • Full tests/core/ suite: 2303 passed, 0 failures (no regression from the bug fix)
  • ruff check clean on all changed files

Known limitations

…expand prd_stress_test (#654)

Adds direct unit tests for three untested core modules and expands coverage
of a fourth, all marked @pytest.mark.v2:

- test_api_key_service.py: create/list/revoke/rotate/get against a real tmp
  Database — main paths + error cases (invalid scopes, wrong owner, not-found,
  rotate-keeps-old-key-on-creation-failure).
- test_artifacts.py: export_patch/create_commit/get_status/list_patches with a
  real git repo + workspace (events persisted); git-missing / not-a-repo /
  no-changes error paths; pure parse-helper tests.
- test_review.py: status/severity thresholds, review_files findings/empty/skip/
  analyzer-error paths, review_task delegation, get_review_summary.
- test_prd_stress_test.py: resolve_ambiguities_into_prd, JSON-failure fallbacks,
  malformed-children filtering, sync exception propagation.

Also fixes a real bug surfaced by the new tests: artifacts.get_status() used
result.stdout.strip(), which stripped the leading status column from the first
porcelain line — misclassifying a worktree-only change as staged and dropping
its first filename character. Switched to splitlines().

Closes #654
@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Warning

Review limit reached

@frankbria, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 32 minutes and 11 seconds. Learn how PR review limits work.

Your organization has reached its usage spending cap. Adjust your spending cap in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9ae33e59-b650-487f-9578-0cea4c2b0c24

📥 Commits

Reviewing files that changed from the base of the PR and between 402c48a and 547533c.

📒 Files selected for processing (5)
  • codeframe/core/artifacts.py
  • tests/core/test_api_key_service.py
  • tests/core/test_artifacts.py
  • tests/core/test_prd_stress_test.py
  • tests/core/test_review.py
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch test/654-untested-core-modules

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude

claude Bot commented Jun 17, 2026

Copy link
Copy Markdown

Code Review -- PR 694

Overall: Approve with minor notes. A well-scoped coverage hardening PR that also surfaces a real bug. No previous Claude review to reconcile against.

BUG FIX in artifacts.py: The strip().split() to splitlines() change is correct. The comment explaining why is warranted -- this is the kind of silent column-shift nearly impossible to catch without a test. The 5-line comment is verbose per CLAUDE.md style but the subtlety justifies it.

test_api_key_service.py (238 lines, 21 tests):

  • Good: Cross-user isolation, hash-never-leaks, and the rollback safety test are the right things to cover.
  • Fragility (minor): USER_A=1 is hardcoded to the bootstrap admin ID. If Database.initialize() ever changes that ID, tests break non-obviously. Consider reading back inserted IDs from the fixture.
  • Correctness: The patch targets service.create_api_key. This only intercepts if rotate_api_key calls self.create_api_key rather than a module-level helper. Worth confirming.

test_artifacts.py (279 lines, 23 tests):

  • Good: Real git repo plus real Workspace. skip_if_no_git guard and the no-initial-commit fallback edge case are solid.
  • Fragility (minor): Workspace.new(Workspace) in TestListPatches and test_export_not_a_git_repo_raises bypasses init. Works today but silently misses new required attributes if Workspace grows. Prefer a factory helper or add a comment acknowledging the bypass.
  • Patching: monkeypatch.setattr(artifacts.shutil, 'which', ...) correctly patches on the artifacts module namespace. Right approach.

test_review.py (231 lines, 26 tests):

  • Good: autouse quiet_analyzers fixture elegantly silences all analyzers by default -- no subprocess leakage.
  • Good: Parametrized threshold tests for _determine_status and _severity_from_score cover boundaries cleanly.
  • Minor: _Finding carries only the 5 attrs review.py reads. A brief comment noting it mirrors the minimal analyzer interface would help -- if review.py ever reads a 6th attr, a missing-attribute error is the first signal.

test_prd_stress_test.py (13 new tests):

  • Good: _provider_returning factory makes tests readable.
  • Import style: Several new tests import from prd_stress_test inside the method body. If this mirrors the existing file pattern, fine; otherwise move to class/module scope for consistency with the other new test files in this PR.
  • test_truncated_rewrite_returns_original: Relies on the truncation heuristic being less-than-half-original. A brief inline comment naming that heuristic would help future readers.

Summary of actionable items (all minor):

  1. Read back inserted row IDs in the db fixture rather than assuming USER_A=1=bootstrap admin.
  2. Confirm rotate_api_key calls self.create_api_key (the patch target), not a module-level function.
  3. Add a comment to the Workspace.new bypass in TestListPatches.
  4. Move inline prd_stress_test imports to class/module scope in the new tests.

None of these block merge.

…, get_api_key contract

- review.py: leave the in-process ComplexityAnalyzer real in the autouse
  fixture and add a non-mocked smoke test that locks the finding attribute
  contract review.py depends on.
- artifacts.py: add the MM (staged-then-modified) porcelain case so the
  two-column status parser is exercised in both columns at once.
- api_key_service.py: pin get_api_key's intentional owner-agnostic behavior.
…ests

Addresses review note #3: consolidate the three Workspace.__new__ bypass sites
behind a documented factory helper that sets every attribute artifacts.py reads.
@frankbria

Copy link
Copy Markdown
Owner Author

Thanks for the review. Triage of the four minor notes:

  1. Read-back DB IDs (db fixture) — No change. The fixture uses INSERT OR REPLACE ... VALUES (?, ...) with explicit IDs, so USER_A/USER_B are deterministic regardless of what initialize() seeds — the test does not actually depend on the bootstrap admin id.
  2. Confirm rotate_api_key calls self.create_api_key — Confirmed: api_key_service.py:175 calls self.create_api_key(...), so patching service.create_api_key intercepts it (the passing test proves interception).
  3. Workspace.__new__ bypass — Done in 547533c: consolidated the three sites behind a documented _bare_workspace() factory that sets every attribute artifacts.py reads.
  4. Inline prd_stress_test imports — Kept. These mirror the existing 32 tests in that file (in-file consistency); moving only the new ones would make the file inconsistent with itself.

Also addressed earlier-round feedback: added a real (non-mocked) ComplexityAnalyzer smoke test, the MM staged-and-modified status case, and a test pinning get_api_key's intentional owner-agnostic contract.

@claude

claude Bot commented Jun 17, 2026

Copy link
Copy Markdown

Code Review — PR #694: test(core) coverage hardening (#654)

Overall: This is a well-executed test coverage PR. The structure is consistent across all four new/expanded test files, the real-infrastructure approach (real SQLite + real git) is the right call for these modules, and the artifacts.py bug fix is correct and clearly motivated. A few things worth addressing before merge.


Bug Fix — artifacts.py

The splitlines() change is correct. str.strip() on porcelain output removes the leading space of the first " M filename" line, shifting the two-column field parse and misclassifying it as staged. The added comment explains the why (a subtle off-by-one in column semantics), which is exactly the kind of non-obvious invariant worth documenting.

The dedicated test test_staged_and_then_modified_again directly validates the fix — good.


test_api_key_service.py

Hardcoded prefix length assumption (line 93):

assert result.prefix == result.key[:12]

This hardcodes 12 as the prefix length. If the key format or prefix length ever changes in ApiKeyService, this assertion will fail with a confusing index error rather than a clear contract violation. Either derive it from a named constant (KEY_PREFIX_LEN or similar if one exists) or at minimum assert result.key.startswith(result.prefix) — that's the actual invariant being tested.

User seeding pattern (lines 63–75): The raw INSERT OR REPLACE into users works, and the comment explaining it is helpful. One fragility: if Database.initialize() adds new non-nullable columns to users in future, this insert will start throwing without an obvious failure message. Worth noting in the fixture docstring so it's visible to whoever next touches the schema.

Rotate safety test (line 234): test_rotate_keeps_old_key_when_creation_fails is the most important test in this file — it pins the atomicity contract. Well done.


test_artifacts.py

_bare_workspace pattern: The docstring is honest about the fragility (attributes set by hand, bypasses __init__). That's the right call for pure-filesystem paths that don't need a DB, and the fail-loud guarantee if new attributes are added is sufficient.

test_staged_and_then_modified_again (line 498): Directly validates the bug fix with an MM porcelain entry. Consider adding an assertion that hello.py appears exactly once in each list — the current test confirms presence but not that the parser doesn't double-count:

assert status["staged"].count("hello.py") == 1
assert status["unstaged"].count("hello.py") == 1

test_prd_stress_test.py

test_truncated_rewrite_returns_original (line 716): The "shorter than half" guard is implicit — the test just uses a 200-char original with a 9-char rewrite. If the threshold in resolve_ambiguities_into_prd changes (e.g., to 1/3 or 2/3), this test might silently stop covering the guard. A comment stating the threshold (e.g., # guard: rewrite < 50% of original is treated as truncated) would make the intent durable.

test_no_resolved_answers_returns_original_without_calling_llm (line 685): Excellent test — verifies the early-exit path that avoids a gratuitous LLM call.


test_review.py

quiet_heavy_analyzers autouse fixture (line 821): Good pattern. Silencing bandit/OWASP by default while leaving the radon-based ComplexityAnalyzer real gives the right balance: deterministic by default, with integration coverage from test_real_complexity_analyzer_integration.

test_real_complexity_analyzer_integration (line 943): The generated function (15 if-branches + ternary) should reliably exceed any reasonable cyclomatic complexity threshold, so the assert result.findings assertion shouldn't be flaky. The comment pinning the attribute contract (category/severity/line_number/message/suggestion) is the right reason to keep this test real rather than mocked.

test_get_api_key_is_owner_agnostic (line 264): The docstring explaining that get_api_key() intentionally has no ownership check — and that this test pins that contract so a future change is deliberate — is exactly the right way to document a non-obvious security boundary.


Summary

Area Status
Bug fix (artifacts.py) ✅ Correct, well-tested
test_api_key_service.py ⚠️ Minor: harden prefix-length assertion
test_artifacts.py ✅ Minor: duplicate-count assertion in MM test
test_prd_stress_test.py ✅ Minor: document truncation threshold
test_review.py ✅ Clean
Marker compliance (@pytest.mark.v2) ✅ All new tests carry it
No live API calls

The prefix-length assertion on line 93 is worth fixing before merge (it's the easiest kind of latent breakage to introduce). Everything else is low-priority suggestions. No blocking issues.

@claude

claude Bot commented Jun 17, 2026

Copy link
Copy Markdown

Code Review - PR 694: test coverage hardening. Overall: solid coverage PR. Bug fix is correct, all four test files carry pytestmark v2, no live API calls, tests cover main paths plus realistic error/edge cases. artifacts.py fix: correct and necessary - splitlines() properly preserves the leading space of porcelain status lines that strip().split was eating, fixing misclassification of worktree-only changes as staged. Style note: the 5-line comment added conflicts with CLAUDE.md one-line-max rule; condense to a single line naming the porcelain two-column format. test_api_key_service.py: pass - real SQLite DB, per-user isolation, rotate-on-failure safety invariant. Minor: USER_A=1 couples to bootstrap admin seed ID detail. test_artifacts.py: pass with concern - Workspace.new(Workspace) without init in three stubs bypasses constructor invariants; safer to use a real constructor fixture. The MM porcelain regression test is exactly the right coverage. test_prd_stress_test.py: nit - local imports inside test methods should move to file/class top for consistency. _provider_returning helper is clean. test_review.py: pass - quiet_heavy_analyzers autouse fixture is a good pattern, score arithmetic constants lock the threshold contract, integration test pins the attribute contract against the real analyzer. Summary: comment condensation in artifacts.py is the only CLAUDE.md conflict; everything else is advisory.

@frankbria frankbria merged commit 1c5ebe5 into main Jun 17, 2026
10 checks passed
@frankbria frankbria deleted the test/654-untested-core-modules branch June 17, 2026 21:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[P6.8.1] Add tests for untested core modules: artifacts.py, review.py, api_key_service.py

1 participant