feat(cli): agentv history read command for per-test score timeline #1160

Part of #1155.

## Objective

Add a read-only `agentv history` subcommand that surfaces per-test score timelines from a sidecar baseline file by parsing that file's git history. This gives users "continuous improvement tracking" of eval scores over time without any new storage format or server.

## Why

AgentV's sidecar `<eval>.baseline.jsonl` files are already committed to repos; each time someone runs `agentv compare --update-baseline` (see #1158), the file gets a new commit. `git log -p` on that file is already the source of truth for how scores have evolved. `agentv history` is a thin CLI wrapper that renders that log as a per-test timeline.

## Design latitude

```bash
# Timeline for one test across all commits that touched the baseline file:
agentv history --baseline evals/my-eval.baseline.jsonl --test "handles empty input"

# Example output:
# commit       date        score  verdict
# abc123def    2026-04-23  0.82   pass
# def456abc    2026-04-22  0.79   borderline
# ...

# JSONL for piping into downstream tools:
agentv history --baseline evals/my-eval.baseline.jsonl --test "handles empty input" --format jsonl
```

- Uses `git log --follow -p <file>` to walk the baseline's history (`--follow` keeps tracking the file across renames).
- Parses each revision of the JSONL to extract the named test's score and verdict.
- `--since <date>` and `--until <date>` for range filtering (optional).
- No new storage, no server, no config.
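The bullets above could be sketched roughly as follows. This is a minimal illustration, not AgentV's implementation: the record field names `test_id`, `score`, and `verdict` are assumptions about the baseline JSONL schema, and every function name here is hypothetical. For simplicity it reads each revision in full with `git show <commit>:<path>` rather than parsing `-p` patches, so it assumes the command runs from the repo root and it skips commits where the file lived under a different name.

```python
import json
import subprocess
from typing import Iterable, Iterator, List, NamedTuple, Optional, Tuple


class TimelineRow(NamedTuple):
    commit: str
    date: str
    score: Optional[float]
    verdict: Optional[str]


def find_test(jsonl_text: str, test_name: str) -> Optional[dict]:
    """Return the first record whose (assumed) `test_id` field equals test_name."""
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a malformed line in an old revision
        if isinstance(record, dict) and record.get("test_id") == test_name:
            return record
    return None


def build_timeline(
    revisions: Iterable[Tuple[str, str, str]], test_name: str
) -> List[TimelineRow]:
    """Map (commit, date, file_text) tuples to rows, preserving input order."""
    rows = []
    for commit, date, text in revisions:
        record = find_test(text, test_name)
        if record is None:
            continue  # the test is absent from this revision
        rows.append(
            TimelineRow(commit, date, record.get("score"), record.get("verdict"))
        )
    return rows


def read_revisions(
    path: str, since: Optional[str] = None, until: Optional[str] = None
) -> Iterator[Tuple[str, str, str]]:
    """Yield (commit, date, file_text) per commit touching path, newest first."""
    cmd = ["git", "log", "--follow", "--format=%h%x09%ad", "--date=short"]
    if since:
        cmd.append(f"--since={since}")
    if until:
        cmd.append(f"--until={until}")
    cmd += ["--", path]
    log = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    for line in log.splitlines():
        commit, date = line.split("\t")
        # Path is resolved from the repo root; this fails (and is skipped)
        # for commits where the file was deleted or had another name.
        shown = subprocess.run(
            ["git", "show", f"{commit}:{path}"], capture_output=True, text=True
        )
        if shown.returncode == 0:
            yield commit, date, shown.stdout
```

Under those assumptions, `build_timeline(read_revisions("evals/my-eval.baseline.jsonl"), "handles empty input")` would return rows in the order `git log` emits them, which is newest first by default. Keeping the parsing separate from the git calls means the timeline logic can be unit-tested without a repository.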

## Acceptance signals

- Works on a repo where the baseline file has at least 3 commits touching it (can be set up with a small test fixture).
- Outputs a correctly ordered (newest first) timeline.
- `--format jsonl` pipes cleanly into `jq`.
- Errors helpfully if the baseline file isn't tracked in git, or if the named test doesn't appear in any revision.
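One way the output and error-handling signals above might look in code is sketched below. Everything here is illustrative: `HistoryError`, the function names, the error wording, and the row keys (`commit`, `date`, `score`, `verdict`) are assumptions, with column widths chosen only to match the example output earlier in this issue. The tracked-file check relies on `git ls-files --error-unmatch`, which exits non-zero when the path is not tracked.

```python
import json
import subprocess
from typing import Dict, List


class HistoryError(Exception):
    """Hypothetical error type for user-facing failures."""


def ensure_tracked(baseline: str) -> None:
    """Raise a helpful error if the baseline file is not tracked by git."""
    result = subprocess.run(
        ["git", "ls-files", "--error-unmatch", "--", baseline],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise HistoryError(f"{baseline} is not tracked in git; commit it first")


def require_rows(rows: List[Dict], test_name: str, baseline: str) -> List[Dict]:
    """Raise a helpful error if the named test never appears in the history."""
    if not rows:
        raise HistoryError(
            f'test "{test_name}" does not appear in any revision of {baseline}'
        )
    return rows


def format_jsonl(rows: List[Dict]) -> str:
    """One compact JSON object per line, so the output can be piped into jq."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


def format_table(rows: List[Dict]) -> str:
    """Fixed-width table; rows are assumed to arrive newest first."""
    lines = [f"{'commit':<13}{'date':<12}{'score':<7}verdict"]
    for row in rows:
        lines.append(
            f"{row['commit']:<13}{row['date']:<12}"
            f"{row['score']:<7.2f}{row['verdict']}"
        )
    return "\n".join(lines)
```

Raising a single exception type lets the CLI entry point catch it in one place, print the message to stderr, and exit non-zero, which keeps stdout clean for `--format jsonl` consumers.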

## Non-goals

- Not a dashboard. Terminal/JSONL output only.
- Not a cross-repo or cross-file aggregator — one baseline file per invocation.
- Not writing to or mutating git history.

## Depends on

Nothing new — works against AgentV's existing sidecar baseline convention. More useful once #1158 ships (because `--update-baseline` will produce regular commits to parse), but does not require it.
