Tools for downloading minute-level market data and backtesting miners against Synth Subnet scoring.
The library is organised into two pieces:
- app/lib/preparation/market_data.py — downloads and stores minute closes as daily parquet partitions.
- app/lib/backtester/backtest.py — scores a miner's predictions against real price paths and computes smoothed scores / reward weights using the same logic the live validator runs.
- Python ≥ 3.12
- `uv` for dependency management

Install dependencies once:

```bash
uv sync
```

Data is fetched from Pyth (spot equities, commodities, majors) or Hyperliquid
(perps) depending on the asset, and cached under
`market_data/pyth/{ASSET}/1m/date=YYYY-MM-DD.parquet`. Existing finalised
partitions are skipped unless `--force-refresh` is passed.
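The partition layout and skip rule can be sketched as follows. `partition_path` and `should_download` are hypothetical helpers for illustration, not the library's actual API:

```python
from datetime import date
from pathlib import Path

def partition_path(root: Path, asset: str, day: date) -> Path:
    # Mirrors the on-disk layout: market_data/pyth/{ASSET}/1m/date=YYYY-MM-DD.parquet
    return root / "pyth" / asset / "1m" / f"date={day.isoformat()}.parquet"

def should_download(path: Path, force_refresh: bool) -> bool:
    # Existing partitions are skipped unless --force-refresh is passed.
    return force_refresh or not path.exists()

p = partition_path(Path("market_data"), "BTC", date(2025, 1, 15))
print(p.as_posix())  # market_data/pyth/BTC/1m/date=2025-01-15.parquet
```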
```bash
uv run app/lib/preparation/market_data.py
```

This downloads 15 months of history ending yesterday. The asset list comes
from `synth.validator.price_data_provider.PriceDataProvider` and includes
majors such as BTC, ETH, SOL, plus tokenised equities/commodities
(XAU, SPYX, NVDAX, TSLAX, AAPLX, GOOGLX).

```bash
uv run app/lib/preparation/market_data.py --asset BTC
```

```bash
# Last 3 days for BTC only, including today
uv run app/lib/preparation/market_data.py --asset BTC --days 3

# Last 7 days across every asset
uv run app/lib/preparation/market_data.py --days 7
```

`--days N` anchors the window at today (inclusive). Today's partition is
marked `is_final=False` and is re-downloaded on every subsequent run.
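As a sketch of that windowing rule (hypothetical helper; assumes a `--days 3` run on 2025-01-17):

```python
from datetime import date, timedelta

def window(days: int, today: date) -> list[tuple[date, bool]]:
    # --days N anchors the window at today (inclusive);
    # only days strictly before today are final.
    return [(today - timedelta(days=i), today - timedelta(days=i) < today)
            for i in reversed(range(days))]

for day, is_final in window(3, date(2025, 1, 17)):
    print(day, is_final)
# 2025-01-15 True
# 2025-01-16 True
# 2025-01-17 False
```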
```bash
uv run app/lib/preparation/market_data.py --asset BTC --force-refresh
```

| Flag | Default | Description |
|---|---|---|
| `--asset` | all assets | Single asset symbol; omit to download every supported asset |
| `--days` | (15-month default window) | Download N days ending today (inclusive); overrides the default window |
| `--force-refresh` | off | Re-download partitions that already exist on disk |
```python
from app.lib.preparation.market_data import download_market_data, download_all_assets

# Single asset, default window
download_market_data("BTC")

# Recent N days, every asset
download_all_assets(days=30)
```

```
market_data/
└── pyth/
    └── BTC/
        └── 1m/
            ├── date=2025-01-15.parquet
            ├── date=2025-01-16.parquet
            └── ...
```

Each parquet contains `timestamp`, `close`, `source`, `ingested_at`, and
`is_final` columns. Rows are minute-aligned UTC; gaps are stored as NaN.
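A minimal illustration of that minute alignment (hypothetical, not the library's reader): observed closes are placed on a fixed UTC minute grid, and any missing minute becomes NaN.

```python
import math
from datetime import datetime, timedelta, timezone

# Closes actually observed (minute 1 and minute 3 are missing).
observed = {
    datetime(2025, 1, 15, 0, 0, tzinfo=timezone.utc): 42_000.0,
    datetime(2025, 1, 15, 0, 2, tzinfo=timezone.utc): 42_010.0,
}

start = datetime(2025, 1, 15, 0, 0, tzinfo=timezone.utc)
grid = [start + timedelta(minutes=i) for i in range(4)]  # minute-aligned UTC grid
closes = [observed.get(ts, math.nan) for ts in grid]     # gaps stored as NaN
print(closes)  # [42000.0, nan, 42010.0, nan]
```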
A backtest compares a miner's prediction files to real prices, computes CRPS per prompt, and then replays the validator's smoothed-score / reward-weight calculation to produce the miner's rank over time.
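For intuition, the CRPS of an ensemble of simulated prices against one realised price can be estimated with the standard empirical formula below. This is a generic estimator for illustration, not necessarily the validator's exact implementation:

```python
def crps_ensemble(samples: list[float], observed: float) -> float:
    # Empirical CRPS: E|X - y| - 0.5 * E|X - X'| over ensemble members X, X'.
    n = len(samples)
    term1 = sum(abs(x - observed) for x in samples) / n
    term2 = sum(abs(x - y) for x in samples for y in samples) / (n * n)
    return term1 - 0.5 * term2

# A zero-spread forecast at the realised price scores 0; worse forecasts score higher.
print(crps_ensemble([100.0, 100.0], 100.0))  # 0.0
print(crps_ensemble([90.0, 110.0], 100.0))   # 5.0
```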
The backtester reads prediction files from
`miner_outputs/{miner_name}/predictions/**/*.json`. Filenames must follow:

```
YYYY-MM-DD_HH:MM:SSZ_{ASSET}_{time_length}.json
```

For example: `2026-03-28_00:00:00Z_BTC_86400.json`. Two JSON layouts are
accepted (see `load_prediction`):

- Flat notebook format:
  `{"start_timestamp", "asset", "time_increment", "time_length", "paths", ...}`
- ArtifactManager format:
  `{"simulation_input": {...}, "prediction": [meta, meta, path, ...]}`

`paths` is a `num_simulations × (num_steps + 1)` array of simulated price paths
starting from the current price.
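Parsing that filename convention can be sketched like this. The regex and helper are illustrative only, not the library's parser:

```python
import re
from datetime import datetime, timezone

PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2})Z_([A-Z0-9]+)_(\d+)\.json$"
)

def parse_prediction_filename(name: str) -> tuple[datetime, str, int]:
    # YYYY-MM-DD_HH:MM:SSZ_{ASSET}_{time_length}.json
    m = PATTERN.match(name)
    if m is None:
        raise ValueError(f"bad prediction filename: {name}")
    start = datetime.strptime(m.group(1), "%Y-%m-%d_%H:%M:%S").replace(tzinfo=timezone.utc)
    return start, m.group(2), int(m.group(3))

start, asset, time_length = parse_prediction_filename("2026-03-28_00:00:00Z_BTC_86400.json")
print(asset, time_length)  # BTC 86400
```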
Targets are expressed as a pair: which frequency profile to run (`--profile`)
and which asset(s) to run within it (`--asset`). Defaults are
`--profile all --asset ALL`, i.e. every asset in every profile. If no
prediction files exist under `miner_outputs/{miner_name}/predictions/`, the
script auto-generates random-walk predictions into a temp directory so the
pipeline can be verified end-to-end.
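The auto-generated fallback can be pictured as simple random-walk paths of the right shape. This is an illustrative sketch with assumed parameters, not the script's actual generator:

```python
import random

def random_walk_paths(current_price: float, num_simulations: int,
                      num_steps: int, vol: float = 0.001) -> list[list[float]]:
    # Each path has num_steps + 1 points and starts at the current price,
    # matching the expected shape of the "paths" array.
    paths = []
    for _ in range(num_simulations):
        path = [current_price]
        for _ in range(num_steps):
            path.append(path[-1] * (1 + random.gauss(0.0, vol)))
        paths.append(path)
    return paths

paths = random_walk_paths(100_000.0, num_simulations=100, num_steps=288)
print(len(paths), len(paths[0]))  # 100 289
```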
```bash
# Default: both profiles × every asset, last 2 days
uv run app/lib/backtester/scripts/run_backtest.py --miner-name gbm_agent

# BTC across both profiles
uv run app/lib/backtester/scripts/run_backtest.py --miner-name gbm_agent --asset BTC

# Multiple assets in LOW_FREQUENCY (space-, comma-, or quoted lists all work)
uv run app/lib/backtester/scripts/run_backtest.py --miner-name gbm_agent --profile low --asset BTC ETH TSLAX
uv run app/lib/backtester/scripts/run_backtest.py --miner-name gbm_agent --profile low --asset BTC,ETH,TSLAX
uv run app/lib/backtester/scripts/run_backtest.py --miner-name gbm_agent --profile low --asset "BTC ETH TSLAX"

# Every asset in HIGH_FREQUENCY only
uv run app/lib/backtester/scripts/run_backtest.py --miner-name gbm_agent --profile high --asset ALL

# Longer window or custom predictions directory
uv run app/lib/backtester/scripts/run_backtest.py \
  --miner-name gbm_agent \
  --days 7 \
  --profile low --asset BTC \
  --predictions-dir /path/to/predictions
```

Assets not in a given profile's `asset_list` are filtered out for that
profile; if nothing matches anywhere the script exits with a clear error
listing the supported assets per profile.
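The flexible `--asset` parsing above can be sketched as a normaliser that accepts all three spellings interchangeably (illustrative helper; see `run_backtest.py` for the real argument handling):

```python
import re

def normalise_assets(values: list[str]) -> list[str]:
    # Accept comma-separated ("BTC,ETH"), space-inside-quotes ("BTC ETH"),
    # or multiple argv tokens (["BTC", "ETH"]) interchangeably.
    out: list[str] = []
    for value in values:
        out.extend(tok.upper() for tok in re.split(r"[,\s]+", value) if tok)
    return out

print(normalise_assets(["BTC,ETH,TSLAX"]))  # ['BTC', 'ETH', 'TSLAX']
print(normalise_assets(["BTC ETH TSLAX"]))  # ['BTC', 'ETH', 'TSLAX']
print(normalise_assets(["BTC", "ETH"]))     # ['BTC', 'ETH']
```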
CLI flags (see `run_backtest.py`):

| Flag | Default | Description |
|---|---|---|
| `--miner-name` | `btc_research` | Subdirectory under `miner_outputs/` for predictions and chart output |
| `--days` | `2` | Length of the backtest window (ending now) |
| `--profile` | `all` | Profile to backtest: `low`, `high`, or `all` |
| `--asset` | `ALL` | One or more asset symbols (space-, comma-, or space-inside-quotes separated), or `ALL` for every asset in the selected profile(s) |
| `--predictions-dir` | `miner_outputs/{miner_name}/predictions` | Where to read predictions from |
The runner uses three nested pools, so a full sweep finishes much faster than the asset count would suggest:

- Profiles (`low`, `high`) run in a `ThreadPoolExecutor` (one thread per profile).
- Inside each profile, assets run in their own `ThreadPoolExecutor` (up to 6 at a time, I/O-bound on the Synth API).
- Inside each asset, per-prompt CRPS scoring is dispatched to a shared `ProcessPoolExecutor` sized to `cpu_count() - 2`.
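The nesting can be sketched with thread pools alone (for brevity this sketch uses a stand-in scoring function and threads at every level; the real code hands per-prompt CRPS to a shared `ProcessPoolExecutor` at the innermost level):

```python
from concurrent.futures import ThreadPoolExecutor

PROFILES = {"low": ["BTC", "ETH"], "high": ["BTC", "SOL"]}

def score_asset(profile: str, asset: str) -> str:
    # Stand-in for the per-asset backtest (which would fan per-prompt
    # CRPS scoring out to a shared process pool).
    return f"{profile}:{asset}"

def run_profile(profile: str) -> list[str]:
    # Assets within a profile run concurrently, up to 6 at a time.
    with ThreadPoolExecutor(max_workers=6) as assets:
        return list(assets.map(lambda a: score_asset(profile, a), PROFILES[profile]))

# Profiles run concurrently, one thread each.
with ThreadPoolExecutor(max_workers=len(PROFILES)) as profiles:
    results = list(profiles.map(run_profile, PROFILES))
print(results)  # [['low:BTC', 'low:ETH'], ['high:BTC', 'high:SOL']]
```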
```python
from synth.validator.prompt_config import LOW_FREQUENCY
from app.lib.backtester.backtest import run_backtest, backtest

# Single asset
single = backtest(
    miner_name="gbm_agent",
    asset="BTC",
    time_length=86_400,   # 24h prompts (LOW_FREQUENCY) — use 3_600 for HIGH_FREQUENCY
    time_increment=300,   # 5-minute steps
    n_backtest_days=7,
)

# Whole profile (parallel across assets + emits per-profile TOTAL charts when ≥2 succeed)
results, combined = run_backtest(
    miner_name="gbm_agent",
    prompt_config=LOW_FREQUENCY,
    n_backtest_days=7,
)

print(single.summary)    # {num_prompts, mean_crps, final_smoothed_score, ...}
single.prompt_df         # per-prompt CRPS and scores (incl. every other miner)
single.smoothed_scores   # per-round smoothed score + reward_weight
```

Charts are written to `miner_outputs/{miner_name}/charts/`:
Per-asset:

- `rank_evolution_{asset}_{time_length}.png` — rank over time (1 = best)
- `crps_over_time_…png`, `crps_by_hour_…png`, `crps_by_day_…png`
- `crps_ratio_dist_…png` — distribution of your CRPS relative to the median
- `weekly_percentile_…png` — percentile rank per calendar week

Per profile (emitted when ≥2 assets in that profile produce results):

- `rank_evolution_TOTAL_{profile}.png` — combined rank across the profile's assets
- `estimated_earnings_{profile}.png` — per-round USD + cumulative earnings estimate

Grand total (emitted when both profiles produced data):

- `rank_evolution_GRAND_TOTAL.png`
- `estimated_earnings_GRAND_TOTAL.png`
The console prints per-asset rank, reward weight, smoothed score, prompt count, mean CRPS, and the paths of every saved chart.
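As a mental model for the per-round smoothed score, an exponential moving average over raw round scores behaves similarly. This is an assumption for intuition only; the actual smoothing and reward-weight computation live in the `synth` validator package:

```python
def smooth(scores: list[float], alpha: float = 0.2) -> list[float]:
    # Exponential moving average: each round blends the newest score
    # into the running value, so one bad round moves the rank slowly.
    # Illustrative only; alpha and the formula are assumptions.
    smoothed = []
    current = scores[0]
    for s in scores:
        current = alpha * s + (1 - alpha) * current
        smoothed.append(current)
    return smoothed

print(smooth([1.0, 0.0, 0.0]))
```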
Unit and integration tests live in `tests/lib/backtester/`.

```bash
uv run pytest tests/lib/backtester/
```

- The backtester pulls scored prompts and rewards history from
  https://api.synthdata.co. The Synth API rate-limits; requests are retried with exponential backoff. Long multi-asset runs take a while.
- `download_price_data` reads exclusively from local parquet partitions. Make sure the relevant `market_data/pyth/{ASSET}/1m/` directory is populated (see section 1) — the HIGH_FREQUENCY profile needs coverage up to today since its prompt window is only 1h.
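A quick way to sanity-check local coverage before a run could look like this (hypothetical helper built on the partition layout from section 1):

```python
from datetime import date, timedelta
from pathlib import Path

def missing_partitions(root: Path, asset: str, days: int, today: date) -> list[Path]:
    # List the daily partitions a backtest window needs but that are
    # absent on disk, newest first.
    needed = [today - timedelta(days=i) for i in range(days)]
    paths = [root / "pyth" / asset / "1m" / f"date={d.isoformat()}.parquet"
             for d in needed]
    return [p for p in paths if not p.exists()]

missing = missing_partitions(Path("market_data"), "BTC", days=2, today=date.today())
if missing:
    print("run market_data.py first; missing:", [p.name for p in missing])
```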