diff --git a/gists/timeseries/conformal/README.md b/gists/timeseries/conformal/README.md
new file mode 100644
index 0000000..8761992
--- /dev/null
+++ b/gists/timeseries/conformal/README.md
@@ -0,0 +1,120 @@
+# NNS under one honest forecasting protocol
+
+A single block walk-forward over one heteroskedastic data-generating process,
+emitting two coherent analyses from the *same* leak-free NNS forecast:
+
+1. **Point duel** — does NNS actually forecast better than a fair baseline?
+2. **Interval study** — given NNS's forecast, is its native prediction interval
+   better or worse than conformalizing the same residuals?
+
+This replaces the benchmark withdrawn under issue #57, which (a) leaked each
+evaluation chunk into `NNS.ARMA.optim`'s validation tail and (b) compared methods
+across mismatched protocols (online h=1 baselines vs. an NNS multi-step block).
+
+## The protocol (one rule for everyone)
+
+> At each origin *t*, given only data through *t*, every method forecasts an
+> `h`-step block (`h = implied_h = t·(1−0.9)/0.9`) with **no online updating** —
+> no method may peek at a realized value to predict the next one.
+
+This is *forecasting by design*, exposed to its own compounding error — the
+opposite of an online method that re-anchors on the truth every step. NNS's model
+(periods/method/bias) is selected on a strictly **historical** validation tail, so
+the scored block is never shown to the optimizer (the #57 leak is fixed).
+
+- **DGP:** non-linear, heteroskedastic AR(1), slow trend, two seasonal components
+  (periods 50, 200), piecewise volatility regimes (σ → 2.5, 0.55, 1.8). 10 seeds,
+  T = 3500, ~12 blocks/seed.
+
+## Point duel
+
+- **NNS block** — `NNS.ARMA.optim` native block forecast (leak-free selection).
+- **Ridge (recursive)** — ridge on `N_LAGS` lags, fit on all data ≤ *t*, projected
+  `h` steps recursively (its own predictions become the lags; no true intermediate
+  values — this removes the h=1 true-lag crutch the original baselines enjoyed).
+- **Persistence** — last value carried forward (floor).
+
+```
+           method   MAE  RMSE  median_AE
+        NNS block 1.514 2.056      1.122
+Ridge (recursive) 2.664 3.230      2.403
+      Persistence 2.493 3.241      1.991
+```
+
+## Interval study
+
+We deliberately **adapt** conformal into a forecast wrapper here, and say so. Online
+adaptive conformal (ACI/PID/NexCP) gets its coverage from a per-step feedback loop —
+a control-theoretic property, not forecast skill — so pitting it against a multi-step
+block forecaster on a 1-step task tests the wrong thing. Instead we hold the NNS point
+forecast **fixed** and vary only the band, to discern **coverage guarantees on a
+heteroskedastic process** — conditional (per-regime / worst-window) coverage, not just
+the marginal rate:
+
+- **NNS native PI** — `results ± pi_width`, NNS's own flat nonparametric rule.
+- **NNS + split-CP (flat)** — empirical (1−α) conformal quantile of NNS residuals.
+- **NNS + split-CP (per-lead)** — a quantile *per lead-time k* (the only band that
+  widens with horizon).
+- **NNS + Gaussian (flat)** — `z · std(residuals)`.
+- **Ridge / Persistence + split-CP (per-lead)** — same wrapper on the weaker point
+  models, to show interval quality follows point quality.
+
+```
+                                 method  marg_cov  worst_win_cov  cov_lowvol  cov_hivol  cond_cov_gap  width  interval_score
+                          NNS native PI     0.845          0.561       0.927      0.853         0.149  5.658           8.674
+                  NNS + split-CP (flat)     0.824          0.515       0.899      0.850         0.169  5.398           8.735
+                  NNS + Gaussian (flat)     0.821          0.516       0.894      0.842         0.170  5.332           8.773
+              NNS + split-CP (per-lead)     0.918          0.649       0.996      0.858         0.097  8.595          10.505
+Ridge (recursive) + split-CP (per-lead)     0.865          0.553       0.955      0.826         0.149 10.169          13.853
+      Persistence + split-CP (per-lead)     0.887          0.333       0.981      0.807         0.162 12.107          16.440
+```
+
+Point table sorted by RMSE, interval table by interval (Winkler) score; lower is
+better throughout.
+
+## Findings
+
+- **NNS wins the point duel decisively** — MAE 1.51 vs. recursive ridge 2.66
+  (~43% lower) and persistence 2.49, same ordering on RMSE/median, consistent
+  across all 10 seeds. Recursive ridge falls *below* persistence on MAE: strip the
+  h=1 true-lag crutch and a linear AR model's error compounds over the block until
+  naive carry-forward beats it. Structural forecasting by design vs. reactive by
+  compulsion, made concrete.
+- **NNS's native PI is the efficiency winner and ≈ a flat split-conformal band.**
+  Best interval score (8.674) at the tightest competitive width, and it tracks the
+  empirical split-conformal quantile on the *same* residuals (8.735) — confirming
+  the native band is, in effect, flat split conformal, slightly more conservative.
+- **Every flat band under-covers the volatile regime** (marg 0.82–0.85, worst-window
+  ~0.51–0.56, `cov_hivol` ~0.84–0.85). Calibrating a single width on historical
+  residuals cannot transport to a multi-step block that compounds error and crosses
+  σ-regimes — a textbook exchangeability failure under heteroskedasticity.
+- **Only the horizon-adaptive per-lead wrapper recovers the guarantee** — near-nominal
+  0.918 marginal, the smallest conditional gap (0.097) and best worst-window (0.649) —
+  at ~52% wider intervals and the worst Winkler score among the NNS bands. No free
+  lunch: coverage on a heteroskedastic process costs width.
+- **Interval quality follows point quality.** The *same* per-lead wrapper on the
+  weaker point models scores far worse (NNS 10.5 ≪ ridge 13.9 ≪ persistence 16.4)
+  with much wider bands — a better forecast makes a tighter, better-centred interval.
+
+## How to read it
+
+- **Marginal coverage near 0.90 is necessary but cheap.** On a heteroskedastic DGP
+  the honest test is the **conditional** gap between calm (`cov_lowvol`) and volatile
+  (`cov_hivol`) regimes and the **worst rolling window** — a flat band that hits 0.90
+  on average by over-covering calm and under-covering volatile has not solved the
+  problem. Every flat band here under-covers the volatile regime: that is the
+  exchangeability-violation penalty, common to all of them, not specific to any one.
+- **Width is only a fair tie-breaker at matched coverage.** Among methods with
+  different coverage, a narrower band may just be under-covering more. The interval
+  (Winkler) score combines the two (`width + (2/α)·exceedance`), which is why it is
+  the sort key.
+
+## Run it
+
+```bash
+pip install ovvo-nns numpy pandas scipy scikit-learn
+python run_conformal.py
+```
+
+Writes `results/point.csv`, `results/interval.csv` (aggregated) and the per-seed
+`*_all.csv`.
diff --git a/gists/timeseries/conformal/results/interval.csv b/gists/timeseries/conformal/results/interval.csv
new file mode 100644
index 0000000..206495f
--- /dev/null
+++ b/gists/timeseries/conformal/results/interval.csv
@@ -0,0 +1,7 @@
+method,marg_cov,worst_win_cov,cov_lowvol,cov_hivol,cond_cov_gap,width,interval_score
+NNS native PI,0.845217041800643,0.561,0.9271704180064309,0.852572347266881,0.14855305466237945,5.658247904936664,8.673563693193246
+NNS + split-CP (flat),0.8243569131832797,0.515,0.8993569131832798,0.84983922829582,0.1686495176848875,5.397754711147717,8.734633715776543
+NNS + Gaussian (flat),0.8206993569131832,0.516,0.8940514469453376,0.8421221864951768,0.17025723472668813,5.332100605902861,8.772635078222827
+NNS + split-CP (per-lead),0.917604501607717,0.649,0.9959807073954984,0.8577170418006432,0.09655948553054661,8.59459297384414,10.505483113927781
+Ridge (recursive) + split-CP (per-lead),0.865032154340836,0.553,0.9546623794212218,0.8263665594855306,0.14855305466237945,10.16897349633998,13.853289701795891
+Persistence + split-CP (per-lead),0.88725884244373,0.333,0.9810289389067524,0.807395498392283,0.1617684887459807,12.106951264773958,16.439867386859973
diff --git a/gists/timeseries/conformal/results/interval_all.csv b/gists/timeseries/conformal/results/interval_all.csv
new file mode 100644
index 0000000..cb5ba21
--- /dev/null
+++ b/gists/timeseries/conformal/results/interval_all.csv
@@ -0,0 +1,61 @@
+method,marg_cov,worst_win_cov,cov_lowvol,cov_hivol,cond_cov_gap,width,interval_score
+NNS native PI,0.8476688102893891,0.67,0.9131832797427653,0.8070739549839229,0.09453376205787789,5.3336289551117595,8.347311881079282
+NNS + split-CP (flat),0.8271704180064309,0.58,0.8954983922829582,0.8311897106109325,0.14919614147909965,5.227161385540743,8.534113534345261
+NNS + Gaussian (flat),0.8295819935691319,0.58,0.8858520900321544,0.8279742765273312,0.14276527331189715,5.220493265073913,8.535512774975702
+NNS + split-CP (per-lead),0.9147909967845659,0.65,0.9967845659163987,0.8183279742765274,0.09678456591639872,8.425865285679825,10.568242001830663
+Ridge (recursive) + split-CP (per-lead),0.8661575562700965,0.64,0.9598070739549839,0.7909967845659164,0.10900321543408364,10.581823596303991,14.592993775300465
+Persistence + split-CP (per-lead),0.8721864951768489,0.44,0.9710610932475884,0.707395498392283,0.19260450160771703,10.549102615950632,14.52448029198288
+NNS native PI,0.8287781350482315,0.52,0.9163987138263665,0.8617363344051447,0.2022508038585209,5.538396478733216,9.039315486588494
+NNS + split-CP (flat),0.815112540192926,0.51,0.9067524115755627,0.8440514469453376,0.23118971061093252,5.423595785308306,9.16356799285496
+NNS + Gaussian (flat),0.8163183279742765,0.52,0.9163987138263665,0.8408360128617364,0.23118971061093252,5.371605324250339,9.132929020698203
+NNS + split-CP (per-lead),0.9111736334405145,0.6,0.9967845659163987,0.8456591639871383,0.09678456591639872,8.679187273640938,10.817084198706139
+Ridge (recursive) + split-CP (per-lead),0.8356109324758842,0.55,0.932475884244373,0.7909967845659164,0.1909967845659164,9.768945508312589,14.299961658445255
+Persistence + split-CP (per-lead),0.8934887459807074,0.52,0.9967845659163987,0.7508038585209004,0.14919614147909965,11.420957332270556,14.557742149799997
+NNS native PI,0.8480707395498392,0.54,0.9453376205787781,0.8713826366559485,0.14115755627009652,5.498313881263709,8.568987367833104
+NNS + split-CP (flat),0.8348070739549839,0.47,0.9115755627009646,0.8858520900321544,0.16366559485530552,5.475440096762711,8.646211618108355
+NNS + Gaussian (flat),0.8340032154340836,0.47,0.905144694533762,0.8681672025723473,0.17009646302250803,5.3825939727945,8.623802541408669
+NNS + split-CP (per-lead),0.9127813504823151,0.6,0.9871382636655949,0.8810289389067524,0.09292604501607715,8.192814278346807,10.105807673835704
+Ridge (recursive) + split-CP (per-lead),0.8701768488745981,0.6,0.9517684887459807,0.8520900321543409,0.16045016077170415,9.911929457282502,13.517878973369866
+Persistence + split-CP (per-lead),0.8782154340836013,0.24,0.9919614147909968,0.7893890675241158,0.15562700964630227,13.918516762035877,18.90446027710822
+NNS native PI,0.8404340836012861,0.55,0.9469453376205788,0.8762057877813505,0.17009646302250803,5.417943415993259,8.509212166486481
+NNS + split-CP (flat),0.8183279742765274,0.51,0.8858520900321544,0.8569131832797428,0.15241157556270102,5.073486836478882,8.35866609307891
+NNS + Gaussian (flat),0.8207395498392283,0.54,0.8697749196141479,0.8569131832797428,0.13151125401929264,5.187680907668864,8.312439020985863
+NNS + split-CP (per-lead),0.9232315112540193,0.65,0.9935691318327974,0.8665594855305466,0.09356913183279736,8.47780821511154,10.420173113316652
+Ridge (recursive) + split-CP (per-lead),0.8629421221864951,0.56,0.954983922829582,0.8054662379421221,0.14115755627009652,10.224738158689307,13.735472897847638
+Persistence + split-CP (per-lead),0.9143890675241158,0.11,1.0,0.8118971061093248,0.09999999999999998,14.479797580928631,19.260873784808723
+NNS native PI,0.8472668810289389,0.51,0.9405144694533762,0.7733118971061094,0.12668810289389065,5.912994900178406,8.952518643444352
+NNS + split-CP (flat),0.8243569131832797,0.58,0.9067524115755627,0.7765273311897106,0.15401929260450165,5.307560736619756,8.610536381423858
+NNS + Gaussian (flat),0.8171221864951769,0.57,0.887459807073955,0.7733118971061094,0.1508038585209004,5.2117633612580585,8.673499632507543
+NNS + split-CP (per-lead),0.9212218649517685,0.67,0.9903536977491961,0.8327974276527331,0.0903536977491961,8.612047167435271,10.189042171041397
+Ridge (recursive) + split-CP (per-lead),0.8621382636655949,0.48,0.9421221864951769,0.8183279742765274,0.15401929260450165,9.940825035097003,13.418993354272594
+Persistence + split-CP (per-lead),0.9308681672025724,0.6,0.9453376205787781,0.8569131832797428,0.07588424437299035,12.099187370057201,13.957050019151211
+NNS native PI,0.860128617363344,0.58,0.9501607717041801,0.8906752411575563,0.15562700964630227,5.662363541166043,8.391587116904702
+NNS + split-CP (flat),0.8243569131832797,0.37,0.887459807073955,0.8745980707395499,0.14919614147909965,5.3533136605690315,8.60722523849902
+NNS + Gaussian (flat),0.8247588424437299,0.37,0.9019292604501608,0.8729903536977492,0.15241157556270102,5.31108027242601,8.649115111814341
+NNS + split-CP (per-lead),0.917604501607717,0.62,0.9967845659163987,0.8971061093247589,0.09678456591639872,8.460963622012667,10.275595768590357
+Ridge (recursive) + split-CP (per-lead),0.8677652733118971,0.5,0.9308681672025724,0.8504823151125402,0.14437299035369777,10.008320672419277,13.318984886956887
+Persistence + split-CP (per-lead),0.8995176848874598,0.09,0.9967845659163987,0.7877813504823151,0.1122186495176849,15.609723603863243,21.539983068871113
+NNS native PI,0.8432475884244373,0.62,0.9019292604501608,0.8408360128617364,0.10900321543408364,5.958905251551655,8.884737288789054
+NNS + split-CP (flat),0.802652733118971,0.55,0.9019292604501608,0.8344051446945338,0.18778135048231515,5.514395594522053,9.282531631715036
+NNS + Gaussian (flat),0.7930064308681672,0.54,0.887459807073955,0.8215434083601286,0.1958199356913184,5.443032862596285,9.404878484224922
+NNS + split-CP (per-lead),0.9228295819935691,0.68,1.0,0.8440514469453376,0.09999999999999998,9.20007850046642,10.997501962538344
+Ridge (recursive) + split-CP (per-lead),0.8782154340836013,0.59,0.9839228295819936,0.8456591639871383,0.14758842443729903,10.477958983260796,14.128418910843173
+Persistence + split-CP (per-lead),0.8380225080385852,0.24,0.9823151125401929,0.7379421221864951,0.24565916398713827,11.622172326163268,17.7186519202099
+NNS native PI,0.8404340836012861,0.42,0.9501607717041801,0.8665594855305466,0.19742765273311902,5.533785215198311,8.749410650539787
+NNS + split-CP (flat),0.8336012861736335,0.47,0.9212218649517685,0.8488745980707395,0.16045016077170415,5.394905745099468,8.708916464271185
+NNS + Gaussian (flat),0.8211414790996785,0.46,0.9115755627009646,0.8408360128617364,0.18135048231511253,5.203825372900099,8.80921517840864
+NNS + split-CP (per-lead),0.9143890675241158,0.65,1.0,0.864951768488746,0.09999999999999998,8.344194523078558,10.216779329031802
+Ridge (recursive) + split-CP (per-lead),0.8633440514469454,0.5,0.9630225080385852,0.837620578778135,0.17813504823151127,9.899806431265855,13.626610825171564
+Persistence + split-CP (per-lead),0.9184083601286174,0.72,0.9758842443729904,0.8585209003215434,0.07588424437299035,10.229330031021911,12.202300632363936
+NNS native PI,0.8484726688102894,0.58,0.905144694533762,0.8778135048231511,0.13151125401929264,6.028058412975543,8.799144860895987
+NNS + split-CP (flat),0.8311897106109325,0.52,0.887459807073955,0.8697749196141479,0.1620578778135049,5.725218656291451,8.758728314500287
+NNS + Gaussian (flat),0.8255627009646302,0.52,0.8858520900321544,0.8569131832797428,0.1620578778135049,5.649387814291777,8.818937346230603
+NNS + split-CP (per-lead),0.9172025723472669,0.66,0.9983922829581994,0.8520900321543409,0.09839228295819935,8.733533232757578,10.717532145849189
+Ridge (recursive) + split-CP (per-lead),0.8633440514469454,0.51,0.9565916398713826,0.8279742765273312,0.14276527331189715,10.30781624713549,14.048835006470576
+Persistence + split-CP (per-lead),0.8464630225080386,0.2,0.9565916398713826,0.8842443729903537,0.28906752411575565,9.805366045553008,15.29539421577076
+NNS native PI,0.8476688102893891,0.62,0.9019292604501608,0.860128617363344,0.1572347266881029,5.69808899719474,8.493411469371221
+NNS + split-CP (flat),0.8319935691318328,0.59,0.8890675241157556,0.8762057877813505,0.17652733118971065,5.482468614284766,8.675839888968568
+NNS + Gaussian (flat),0.8247588424437299,0.59,0.8890675241157556,0.8617363344051447,0.1845659163987139,5.339542905768758,8.766021670973783
+NNS + split-CP (per-lead),0.9208199356913184,0.71,1.0,0.8745980707395499,0.09999999999999998,8.819437639911786,10.747072774537559
+Ridge (recursive) + split-CP (per-lead),0.8806270096463023,0.6,0.9710610932475884,0.8440514469453376,0.11704180064308689,10.567570873632986,13.844746729280885
+Persistence + split-CP (per-lead),0.8810289389067524,0.17,0.9935691318327974,0.8890675241157556,0.22154340836012865,11.335358979895252,16.437737508532983
diff --git a/gists/timeseries/conformal/results/point.csv b/gists/timeseries/conformal/results/point.csv
new file mode 100644
index 0000000..3cfd679
--- /dev/null
+++ b/gists/timeseries/conformal/results/point.csv
@@ -0,0 +1,4 @@
+method,MAE,RMSE,median_AE
+NNS block,1.5136011272156582,2.0556925586936883,1.12208606811395
+Ridge (recursive),2.6644732816036893,3.2304601906237487,2.4028239883431057
+Persistence,2.493290248083914,3.24141533222641,1.9910934283341892
diff --git a/gists/timeseries/conformal/results/point_all.csv b/gists/timeseries/conformal/results/point_all.csv
new file mode 100644
index 0000000..7900c80
--- /dev/null
+++ b/gists/timeseries/conformal/results/point_all.csv
@@ -0,0 +1,31 @@
+method,MAE,RMSE,median_AE
+NNS block,1.4624613757188,2.0095523514648352,1.0714033811834391
+Ridge (recursive),2.7093079489173206,3.2849148002794246,2.4276541206659417
+Persistence,2.5207425403649664,3.1838137342389228,2.0779938586077042
+NNS block,1.5179195200934121,2.0875213420368293,1.105426032360385
+Ridge (recursive),2.8282510401794556,3.3952268823991485,2.5875729210928973
+Persistence,2.3736672434422657,3.040742657230395,1.8980380282638447
+NNS block,1.4996940127463412,2.00757652089875,1.115767753513877
+Ridge (recursive),2.5709301773652182,3.1339434936155377,2.335881443966338
+Persistence,2.9117732238980736,3.794140173915152,2.234972066371844
+NNS block,1.4593534764791098,2.001246583495766,1.0864650913903005
+Ridge (recursive),2.6298954565866794,3.190132803482008,2.3755657648263804
+Persistence,2.4740376467312735,3.304377116943708,1.9953112501835901
+NNS block,1.5271989128921286,2.049606520740077,1.1602056950481388
+Ridge (recursive),2.596901417871215,3.1481544415297216,2.3307565425306667
+Persistence,2.3415372011880113,2.992295950923396,1.9604867694701609
+NNS block,1.4920674612633236,2.008429290064173,1.1172318673345731
+Ridge (recursive),2.614711558422066,3.1746112295444617,2.308934798983637
+Persistence,2.6319063703191286,3.543647928791187,2.0564798773344357
+NNS block,1.6060225425685897,2.1818865891046224,1.1965950466152333
+Ridge (recursive),2.717725103031881,3.295389524149246,2.4449283484676214
+Persistence,2.6267172703935375,3.4629309634408636,1.9490510884493548
+NNS block,1.4824820449499643,2.0065783979058245,1.1035396755182818
+Ridge (recursive),2.6981442061600074,3.249449182036675,2.5191619066463744
+Persistence,2.16538422214265,2.7236495214185332,1.867662248183055
+NNS block,1.5806024483770957,2.126087726777259,1.1775229045548874
+Ridge (recursive),2.6103148900670363,3.1982342648198507,2.298518235730201
+Persistence,2.4716622184905397,3.181652504844425,1.9899147900891014
+NNS block,1.5082094770678165,2.0784402644487474,1.086703233620384
+Ridge (recursive),2.6685510174360134,3.2345452843814164,2.3992658005209986
+Persistence,2.4154745438686924,3.1869027705175195,1.8810243063888008
diff --git a/gists/timeseries/conformal/run_conformal.py b/gists/timeseries/conformal/run_conformal.py
new file mode 100644
index 0000000..9245bc0
--- /dev/null
+++ b/gists/timeseries/conformal/run_conformal.py
@@ -0,0 +1,324 @@
+"""
+run_conformal.py
+================================================================================
+Time-series block benchmark — NNS under one honest protocol (issue #57 redux).
+
+A single walk-forward emits BOTH analyses, from the same leak-free NNS block
+forecast (computed once per origin):
+
+  POINT DUEL — who forecasts best?
+    At each origin t, given only data through t, every method forecasts an
+    h-step block (h = implied_h = t*(1-0.9)/0.9) with NO online updating — no
+    method peeks at a realized value to predict the next one.
+      * NNS block        — NNS.ARMA.optim, model selected on a strictly
+                           historical validation tail (leak-free).
+      * Ridge (recursive)— ridge on N_LAGS lags, fit on all data <= t, projected
+                           h steps recursively (own predictions become lags).
+      * Persistence      — last value carried forward (floor).
+
+  INTERVAL STUDY — given the NNS forecast, how should the band be drawn?
+    The NNS point forecast is held FIXED and only the interval construction
+    varies, on identical residuals:
+      * NNS native PI            — results +/- pi_width (flat, NNS's own rule)
+      * NNS + split-CP (flat)    — empirical (1-a) quantile of NNS residuals
+      * NNS + split-CP (per-lead)— quantile per lead-time k (widens w/ horizon)
+      * NNS + Gaussian (flat)    — z * std(residuals)
+    Plus Ridge/Persistence + split-CP (per-lead) to show that interval quality
+    follows point quality.
+
+Framed as an adaptation to discern coverage guarantees on a heteroskedastic
+process: marginal coverage is cheap; the test is conditional (per-regime /
+worst-window) coverage and whether width adapts.
+
+Run:
+    pip install ovvo-nns numpy pandas scipy scikit-learn
+    python run_conformal.py
+Writes results/{point,interval}{,_all}.csv
+"""
+from __future__ import annotations
+
+import os
+import math
+import time
+import warnings
+from collections import defaultdict
+
+import numpy as np
+import pandas as pd
+from scipy import stats
+
+warnings.filterwarnings("ignore")
+
+try:
+    import nns as NNS
+except ImportError:
+    raise SystemExit("ovvo-nns is required. Run: pip install ovvo-nns")
+
+try:
+    from sklearn.linear_model import Ridge
+    from sklearn.pipeline import make_pipeline
+    from sklearn.preprocessing import StandardScaler
+    HAS_SKLEARN = True
+except ImportError:
+    HAS_SKLEARN = False
+
+# ── Constants ────────────────────────────────────────────────────────────────
+ALPHA       = 0.10
+TARGET_COV  = 1.0 - ALPHA
+N_LAGS      = 12
+CAL_END     = 1000
+WINDOW      = 100
+N_SEEDS     = 10
+TRAINING_FRAC = 0.90
+VAL_FRAC    = 0.10
+CP_MIN_POOL = 8
+LOCAL_WIN   = 2
+
+os.makedirs("results", exist_ok=True)
+_MSE = lambda predicted, actual: np.mean((predicted - actual) ** 2)
+POINT_METHODS = ["NNS block", "Ridge (recursive)", "Persistence"]
+
+# ── DGP ──────────────────────────────────────────────────────────────────────
+def make_timeseries(T: int = 3500, seed: int = 0, heavy_tail: bool = False) -> dict:
+    rng = np.random.default_rng(seed + 1)
+    tt    = np.arange(1, T + 1, dtype=float)
+    level = 0.002 * tt + 1.50 * np.sin(2*np.pi*tt/50) + 0.75 * np.sin(2*np.pi*tt/200)
+    sigma = np.ones(T)
+    sigma[(tt > 900)  & (tt <= 1400)] = 2.5
+    sigma[(tt > 1900) & (tt <= 2450)] = 0.55
+    sigma[(tt > 2800)]                = 1.8
+    eps = rng.standard_t(5, size=T) / math.sqrt(5/3) if heavy_tail else rng.standard_normal(T)
+    y   = np.empty(T)
+    y[0] = level[0] + sigma[0] * eps[0]
+    for i in range(1, T):
+        y[i] = level[i] + 0.55 * (y[i-1] - level[i-1]) + sigma[i] * eps[i]
+    return {"y": y, "level": level, "sigma": sigma}
+
+def lag_features(y: np.ndarray, n_lags: int = N_LAGS):
+    n  = len(y)
+    yy = y[n_lags:]
+    X  = np.column_stack([y[n_lags - k: n - k] for k in range(1, n_lags + 1)])
+    return X, yy
+
+# ── Point forecasters (each returns an h-step block from end of y_hist) ──────
+def forecast_ridge_recursive(y_hist: np.ndarray, h: int, n_lags: int = N_LAGS) -> np.ndarray:
+    X, yy = lag_features(y_hist, n_lags)
+    if HAS_SKLEARN:
+        model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
+        model.fit(X, yy)
+        predict = lambda lags: float(model.predict(lags.reshape(1, -1))[0])
+    else:
+        Xtr = np.column_stack([X, np.ones(len(X))])
+        coef, *_ = np.linalg.lstsq(Xtr, yy, rcond=None)
+        predict = lambda lags: float(np.append(lags, 1.0) @ coef)
+    hist = list(y_hist[-n_lags:]); out = []
+    for _ in range(h):
+        lags = np.array(hist[-1::-1][:n_lags])
+        yhat = predict(lags); out.append(yhat); hist.append(yhat)
+    return np.asarray(out)
+
+def forecast_persistence(y_hist: np.ndarray, h: int) -> np.ndarray:
+    return np.full(h, float(y_hist[-1]))
+
+def get_nns_seas_periods(series: np.ndarray, train_n: int) -> list[int]:
+    max_period = int(train_n / 3) - 1
+    try:
+        raw = NNS.nns_seas(series, plot=False).get("periods", np.array([]))
+        seen, valid = set(), []
+        for p in raw:
+            pi = int(p)
+            if 1 < pi < max_period and pi not in seen:
+                seen.add(pi); valid.append(pi)
+        return valid if valid else [2]
+    except Exception:
+        return [2]
+
+def forecast_nns_block(y_hist: np.ndarray, h: int):
+    """Leak-free NNS block: returns (point, native_lo, native_hi, cal_resid)."""
+    n = len(y_hist)
+    val_h = max(1, int(round(VAL_FRAC * n)))
+    periods = get_nns_seas_periods(y_hist, n)
+    fit = NNS.nns_arma_optim(
+        variable=y_hist, h=h, training_set=n - val_h, seasonal_factor=periods,
+        negative_values=True, obj_fn=_MSE, objective="min",
+        linear_approximation=True, pred_int=TARGET_COV, print_trace=False, plot=False,
+    )
+    point  = np.asarray(fit["results"], float)
+    nat_lo = np.asarray(fit["lower.pred.int"], float)
+    nat_hi = np.asarray(fit["upper.pred.int"], float)
+    cal_resid = np.asarray(fit["errors"], float) + float(fit["bias.shift"])
+    return point, nat_lo, nat_hi, cal_resid
+
+# ── Metrics ──────────────────────────────────────────────────────────────────
+def z_alpha(alpha: float = ALPHA) -> float:
+    return float(stats.norm.ppf(1 - alpha / 2))
+def coverage(lo, hi, y) -> float:
+    lo, hi, y = map(np.asarray, (lo, hi, y)); return float(np.mean((y >= lo) & (y <= hi)))
+def mean_width(lo, hi) -> float:
+    return float(np.mean(np.asarray(hi) - np.asarray(lo)))
+def rolling_coverage(lo, hi, y, window: int = WINDOW) -> np.ndarray:
+    lo, hi, y = map(np.asarray, (lo, hi, y)); n = len(y)
+    if n < window: return np.array([])
+    return np.array([coverage(lo[i:i+window], hi[i:i+window], y[i:i+window]) for i in range(n-window+1)])
+def worst_window_coverage(lo, hi, y, window: int = WINDOW) -> float:
+    rc = rolling_coverage(lo, hi, y, window); return float(np.min(rc)) if len(rc) > 0 else float("nan")
+def interval_score(lo, hi, y, alpha: float = ALPHA) -> float:
+    lo, hi, y = map(np.asarray, (lo, hi, y))
+    return float(np.mean((hi - lo) + (2/alpha)*np.maximum(lo - y, 0) + (2/alpha)*np.maximum(y - hi, 0)))
+def coverage_by_stratum(lo, hi, y, sigma, k: int = 4) -> list[float]:
+    lo, hi, y, sigma = map(np.asarray, (lo, hi, y, sigma))
+    ranks = stats.rankdata(sigma, method="ordinal")
+    grp   = np.ceil(ranks / len(ranks) * k).astype(int).clip(1, k)
+    return [coverage(lo[grp == j], hi[grp == j], y[grp == j]) for j in range(1, k+1)]
+def _emp_q(scores, level=TARGET_COV):
+    s = np.sort(np.abs(np.asarray(scores, float))); n = len(s)
+    if n == 0: return None
+    k = int(np.ceil((n + 1) * level)) - 1
+    return float("inf") if k >= n else float(s[k])
+
+# ── One seed: single walk-forward, both analyses ─────────────────────────────
+def run_once(seed: int = 0, heavy_tail: bool = False, verbose: bool = True) -> dict:
+    d = make_timeseries(T=3500, seed=seed, heavy_tail=heavy_tail)
+    y, sig = d["y"], d["sigma"]
+    T_raw = len(y); z = z_alpha(ALPHA)
+
+    pt_pred = defaultdict(list)
+    acc_y, acc_sig = [], []
+    lo_acc = defaultdict(list); hi_acc = defaultdict(list); sig_acc = defaultdict(list)
+    per_lead = {m: defaultdict(list) for m in POINT_METHODS}   # realized |resid| by lead
+    flat_pool = {m: [] for m in POINT_METHODS}                 # realized |resid| pooled
+
+    current_train = N_LAGS + CAL_END
+    t0 = time.time(); n_chunks = 0
+
+    def cp_width(m, k, q_fallback):
+        pool = []
+        for kk in range(max(0, k - LOCAL_WIN), k + LOCAL_WIN + 1):
+            pool.extend(per_lead[m].get(kk, []))
+        if len(pool) >= CP_MIN_POOL:
+            q = _emp_q(pool, TARGET_COV)
+            if q is not None and np.isfinite(q):
+                return q
+        qf = _emp_q(flat_pool[m], TARGET_COV) if flat_pool[m] else None
+        return qf if (qf is not None and np.isfinite(qf)) else q_fallback
+
+    while current_train < T_raw:
+        remaining = T_raw - current_train
+        implied_h = max(1, int(current_train * (1 - TRAINING_FRAC) / TRAINING_FRAC))
+        h = min(remaining, implied_h)
+        y_hist = y[:current_train]
+        yb  = y[current_train:current_train + h]
+        sgb = sig[current_train:current_train + h]
+
+        try:
+            nns_pt, nat_lo, nat_hi, cal_resid = forecast_nns_block(y_hist, h)
+        except Exception as exc:
+            if verbose: print(f"  [seed {seed} chunk {n_chunks+1} NNS failed: {exc}] skip")
+            current_train += h; continue
+
+        preds = {
+            "NNS block":         nns_pt,
+            "Ridge (recursive)": forecast_ridge_recursive(y_hist, h),
+            "Persistence":       forecast_persistence(y_hist, h),
+        }
+        n_chunks += 1
+
+        q_flat_nns = _emp_q(cal_resid, TARGET_COV)
+        if q_flat_nns is None or not np.isfinite(q_flat_nns):
+            q_flat_nns = float(np.std(cal_resid) + 1e-6)
+        s_flat_nns = max(float(np.std(cal_resid)), 1e-8)
+
+        # ── INTERVAL STUDY on the fixed NNS point ────────────────────────────
+        lo_acc["NNS native PI"].extend(nat_lo); hi_acc["NNS native PI"].extend(nat_hi)
+        sig_acc["NNS native PI"].extend(np.maximum((nat_hi - nat_lo)/2.0, 1e-8)/z)
+
+        lo_acc["NNS + split-CP (flat)"].extend(nns_pt - q_flat_nns)
+        hi_acc["NNS + split-CP (flat)"].extend(nns_pt + q_flat_nns)
+        sig_acc["NNS + split-CP (flat)"].extend(np.full(h, max(q_flat_nns/z, 1e-8)))
+
+        lo_acc["NNS + Gaussian (flat)"].extend(nns_pt - z*s_flat_nns)
+        hi_acc["NNS + Gaussian (flat)"].extend(nns_pt + z*s_flat_nns)
+        sig_acc["NNS + Gaussian (flat)"].extend(np.full(h, s_flat_nns))
+
+        # ── per-lead split-CP for each point method (shared machinery) ───────
+        for m in POINT_METHODS:
+            p = preds[m]
+            name = f"{m} + split-CP (per-lead)" if m != "NNS block" else "NNS + split-CP (per-lead)"
+            lo = np.empty(h); hi = np.empty(h); ss = np.empty(h)
+            for k in range(h):
+                qk = cp_width(m, k, q_flat_nns)
+                lo[k] = p[k] - qk; hi[k] = p[k] + qk; ss[k] = max(qk/z, 1e-8)
+            lo_acc[name].extend(lo); hi_acc[name].extend(hi); sig_acc[name].extend(ss)
+
+        for m in POINT_METHODS:
+            pt_pred[m].extend(preds[m])
+        acc_y.extend(yb); acc_sig.extend(sgb)
+
+        # update pools AFTER scoring (now data <= next origin)
+        for m in POINT_METHODS:
+            r = np.abs(yb - preds[m])
+            flat_pool[m].extend(r.tolist())
+            for k in range(h):
+                per_lead[m][k].append(r[k])
+
+        current_train += h
+
+    y_arr = np.asarray(acc_y); sig_arr = np.asarray(acc_sig)
+
+    # POINT table
+    point_rows = []
+    for m in POINT_METHODS:
+        err = np.asarray(pt_pred[m]) - y_arr
+        point_rows.append({"method": m, "MAE": float(np.mean(np.abs(err))),
+                           "RMSE": float(np.sqrt(np.mean(err**2))),
+                           "median_AE": float(np.median(np.abs(err)))})
+
+    # INTERVAL table
+    int_rows = []
+    for name in lo_acc:
+        lo = np.asarray(lo_acc[name]); hi = np.asarray(hi_acc[name]); s_ = np.asarray(sig_acc[name])
+        cbs = coverage_by_stratum(lo, hi, y_arr, sig_arr)
+        int_rows.append({
+            "method": name, "marg_cov": coverage(lo, hi, y_arr),
+            "worst_win_cov": worst_window_coverage(lo, hi, y_arr),
+            "cov_lowvol": cbs[0], "cov_hivol": cbs[-1],
+            "cond_cov_gap": max(abs(c - TARGET_COV) for c in cbs if not math.isnan(c)),
+            "width": mean_width(lo, hi), "interval_score": interval_score(lo, hi, y_arr),
+        })
+
+    if verbose:
+        mae = {m: np.mean(np.abs(np.asarray(pt_pred[m]) - y_arr)) for m in POINT_METHODS}
+        print(f"  [seed {seed}] {n_chunks} blocks  MAE: "
+              + "  ".join(f"{m}={mae[m]:.3f}" for m in POINT_METHODS) + f"  ({time.time()-t0:.0f}s)")
+
+    return {"point": pd.DataFrame(point_rows), "interval": pd.DataFrame(int_rows)}
+
+# ── Aggregate ────────────────────────────────────────────────────────────────
+def run_all() -> dict:
+    pts, ints = [], []
+    for s in range(N_SEEDS):
+        print(f"\n=== seed {s} ===")
+        r = run_once(seed=s, verbose=True)
+        pts.append(r["point"]); ints.append(r["interval"])
+
+    point_dt = pd.concat(pts, ignore_index=True)
+    int_dt   = pd.concat(ints, ignore_index=True)
+    point_agg = point_dt.groupby("method", sort=False).mean().reset_index().sort_values("RMSE")
+    int_agg   = int_dt.groupby("method", sort=False).mean().reset_index().sort_values("interval_score")
+
+    point_dt.to_csv("results/point_all.csv", index=False)
+    int_dt.to_csv("results/interval_all.csv", index=False)
+    point_agg.to_csv("results/point.csv", index=False)
+    int_agg.to_csv("results/interval.csv", index=False)
+
+    pd.set_option("display.width", 200, "display.max_columns", 20)
+    print(f"\n=== POINT DUEL  (block h-step, no online updating; mean over {N_SEEDS} seeds) ===\n")
+    print(point_agg.round(3).to_string(index=False))
+    print(f"\n=== INTERVAL STUDY  (point = NNS fixed; alpha={ALPHA}, target={TARGET_COV}) ===\n")
+    print(int_agg.round(3).to_string(index=False))
+    print("\nWrote results/{point,interval}{,_all}.csv")
+    return {"point": point_agg, "interval": int_agg}
+
+
+if __name__ == "__main__":
+    run_all()