feat(timeseries): add idxmin / idxmax to TimeSeries#3115
Conversation
`TimeSeries.min(axis=0)` and `.max(axis=0)` return a single-row series whose timestamp is just the *first* entry of the original time index, not the timestamp of the actual min / max value. The original issue documents how this is misleading; the maintainer has agreed (unit8co#2696) that adding `idxmin` / `idxmax` is the cleanest fix. This commit: - Adds `TimeSeries.idxmin()` and `TimeSeries.idxmax()` returning a `pandas.Series` indexed by component name with values being the time-index entries (Timestamp or int) at which each component attains its min / max — mirroring `pandas.DataFrame.idxmin/idxmax` semantics. This sidesteps the multivariate dilemma flagged in the issue (each component can have a different argmin/argmax) cleanly and is purely additive (no breaking change). - For stochastic series we reduce samples with the median first so the returned index does not depend on `n_samples`. - Adds a one-line note to `min()` / `max()` docstrings pointing readers to the new helpers. - Adds 4 regression tests covering the univariate, multivariate, RangeIndex, and stochastic cases. The univariate test fails on master (`AttributeError: 'TimeSeries' object has no attribute 'idxmin'`) and passes on this branch. - CHANGELOG entry under Unreleased > Improved. Refs: unit8co#2696 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jakubchlapek
left a comment
There was a problem hiding this comment.
Hey @jbbqqf, thanks for this PR.
The code looks fine, just had a few documentation changes I left in the comments. I also debated adding the axis parameter to the idxmin/max methods, but can't see a real usecase in which we wouldn't use multi series instead. You can let me know what you think.
| .. note:: | ||
| With ``axis=0`` the returned timestamp is the first entry of the | ||
| original ``time_index`` and **does not** correspond to the | ||
| timestamp of the actual minimum value. Use :func:`idxmin` to get | ||
| the timestamp at which each component attains its minimum. |
There was a problem hiding this comment.
i'd say this is redudnant with the rest of the docstring as the behavior is already mentioned above (`If we reduce over time (axis=0), the series will have length one and will use the first entry of the
original ``time_index```)
| .. note:: | ||
| With ``axis=0`` the returned timestamp is the first entry of the | ||
| original ``time_index`` and **does not** correspond to the | ||
| timestamp of the actual maximum value. Use :func:`idxmax` to get | ||
| the timestamp at which each component attains its maximum. | ||
|
|
There was a problem hiding this comment.
same case as above for min
|
|
||
| Useful as a companion to :func:`min` because ``min(axis=0)`` returns a | ||
| single-row series whose timestamp is the *first* time index entry of | ||
| the original series, not the entry of the actual minimum (see | ||
| `issue #2696 <https://github.com/unit8co/darts/issues/2696>`_). | ||
|
|
There was a problem hiding this comment.
i'd say this is unnecesssary
|
|
||
| Useful as a companion to :func:`max` because ``max(axis=0)`` returns a | ||
| single-row series whose timestamp is the *first* time index entry of | ||
| the original series, not the entry of the actual maximum (see | ||
| `issue #2696 <https://github.com/unit8co/darts/issues/2696>`_). | ||
|
|
| # Reduce samples first so the result is independent of stochasticity; | ||
| # using the median (rather than mean) keeps the returned index value | ||
| # an actual observed value when n_samples == 1. |
There was a problem hiding this comment.
unnecessary, already mentioned in the docstring
|
|
||
| **Improved** | ||
|
|
||
| - Added `TimeSeries.idxmin()` and `TimeSeries.idxmax()`, returning a `pandas.Series` (indexed by component) of the time index value at which each component attains its minimum / maximum. Also clarified the docstrings of `min()` / `max()` to point users to these new helpers when they want the actual argmin/argmax timestamp. Closes [#2696](https://github.com/unit8co/darts/issues/2696). |
There was a problem hiding this comment.
| - Added `TimeSeries.idxmin()` and `TimeSeries.idxmax()`, returning a `pandas.Series` (indexed by component) of the time index value at which each component attains its minimum / maximum. Also clarified the docstrings of `min()` / `max()` to point users to these new helpers when they want the actual argmin/argmax timestamp. Closes [#2696](https://github.com/unit8co/darts/issues/2696). | |
| - Added `TimeSeries.idxmin()` and `TimeSeries.idxmax()`, returning a `pandas.Series` (indexed by component) of the time index value at which each component attains its minimum / maximum. Closes [#3115](https://github.com/unit8co/darts/pull/3115). |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #3115 +/- ##
==========================================
- Coverage 96.55% 96.48% -0.07%
==========================================
Files 160 160
Lines 17285 17293 +8
==========================================
- Hits 16689 16686 -3
- Misses 596 607 +11 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Fixes #2696.
Summary
TimeSeries.min(axis=0)and.max(axis=0)return a single-row series whose timestamp is just the first entry of the original time index, not the timestamp of the actual minimum / maximum. As @eschibli pointed out in #2696, this is misleading — users may not read themin/maxdocstring before trusting the index. @madtoinou agreed in the same thread that the corresponding time-step is well-defined per component, even in the multivariate case.This PR adds
TimeSeries.idxmin()andTimeSeries.idxmax(), which return apandas.Seriesindexed by component name with values being the time-index entry (Timestamp orintforRangeIndex-based series) at which each component attains its minimum / maximum — mirroringpandas.DataFrame.idxmin/idxmax. The change is purely additive (no break in existing behavior).Other Information
Why a
pd.Seriesand not aTimeSeries?Each component can have a different argmin / argmax (this is exactly the multivariate dilemma raised in the issue thread). A single-row
TimeSeriescannot represent that — it would have to pick one component's argmax index for the whole row. Returning apd.Series(which is what users were already falling back to viaseries.pd_dataframe().idxmax()) sidesteps the problem cleanly and matches pandas semantics.Stochastic series. For probabilistic series we reduce samples with the median first, so the returned index does not depend on
n_samples.min/maxdocstrings. Added a one-line note pointing readers to the new helpers when they want the actual argmin/argmax timestamp; the existing behaviour (and signature) is unchanged.CHANGELOG. Updated under
Unreleased>For users of the library>Improved.Reproduce BEFORE/AFTER yourself (copy-paste)
The two commands are identical except for the checked-out ref.
What I ran locally
uv run pytest darts/tests/test_timeseries.py::TestSimpleStatistics -v→ 14 / 14 passed (10 pre-existing + 4 new).uv run pytest darts/tests/test_timeseries.py -k 'min or max'→ all green.uv run ruff check darts/timeseries.py darts/tests/test_timeseries.py→ all checks passed.test_idxmin_idxmax_univariate_datetimeagainstorigin/master(with the test cherry-picked but the implementation reverted) — fails withAttributeError: 'TimeSeries' object has no attribute 'idxmin'. The new tests therefore both fail onmasterand pass on this branch.Edge cases tested
argmin=3,argmax=2idxmin → 2020-01-04,idxmax → 2020-01-03test_idxmin_idxmax_univariate_datetimepd.Seriestest_idxmin_idxmax_multivariateRangeIndextime indexint, notTimestamptest_idxmin_idxmax_range_index2020-01-03regardless of sample noisetest_idxmin_idxmax_stochasticRisk / blast radius
Purely additive — two new methods on
TimeSeries, no change tomin/maxsemantics, no signature changes elsewhere. The twomin/maxdocstrings gain a.. note::block but their behavior and return shape are identical to before.PR drafted with assistance from Claude Code (Anthropic). Implementation, regression tests, and the BEFORE/AFTER reproducer above were reviewed manually against
unit8co/darts@masterand the discussion in #2696. The reproducer block was used during development; reviewers can paste it verbatim to verify.