[ntuple] Support multiple column representations in the merger by silverweed · Pull Request #22017 · root-project/root

silverweed · 2026-04-22T15:07:43Z

This Pull request:

Significantly reworks the innards of the RNTupleMerger to support fast merging of fields with different but compatible column representations.
Basically it does two things:

turns all L3 merging cases into L2/L1.
no longer rejects merging fields with different column representations (previously this was only supported for representations that were the split/unsplit version of each other, and only via L3 merging).

A potentially negative consequence that we might want to revisit is that now the merger won't ever adapt the columns' splitness to the output compression (e.g. if merging changes the source compression from 0 to 505 it will still encode the columns as unsplit, and vice-versa). This will probably be readded in a future PR.

In order to achieve this, some new internal functionality had to be added, most notably RPagePersistentSink::AddColumnRepresentation.

Note that this PR is independent on #21740, which in fact might not be needed at all.

IMPORTANT

This PR introduces our first feature flag and thus the first bump to the specs' major version (1.1.0.0). This means we can now start producing RNTuples which cannot be read by older ROOT versions.

TODO

check if we need a feature flag for the changes in AddExtendedColumnRanges
add a test for merging of Real32Trunc/Quant columns with different bit width/value range
properly split the big merger commit
update Merging.md

Checklist:

tested changes locally
updated the docs (if necessary)

github-actions · 2026-04-22T17:16:23Z

Test Results

19 files 19 suites 2d 20h 11m 13s ⏱️
3 869 tests 3 868 ✅ 0 💤 1 ❌
66 349 runs 66 348 ✅ 0 💤 1 ❌

For more details on these failures, see this check.

Results for commit 9eaef1b.

♻️ This comment has been updated with latest results.

pcanal · 2026-05-23T05:19:25Z

now the merger won't ever adapt the columns' splitness to the output compression (e.g. if merging changes the source compression from 0 to 505 it will still encode the columns as unsplit, and vice-versa). This will probably be readded in a future PR.

indeed, we do need to provide a way for the user to require the L3 type of merging ('urgency' of this is less if we already have a way to for L4 type of merging).

Addendum: L4 is currently not supported, so re-adding L3 would be helpful. In particular to allow re-selection of the compression algorithm used.

pcanal · 2026-05-23T05:23:43Z

+
+| Flag Bit | Introduced in | Name                    | Meaning                                      |
+|----------|---------------|-------------------------|----------------------------------------------|
+| 0        | 1.1.0.0       | Nested Deferred Columns | Signals that the RNTuple contains at least one deferred column that is part of a collection and was extended<br>(i.e. it appears in the footer). This can happen when merging two RNTuples that have the same collection field<br>backed by columns with different encoding, e.g. a `vector<float>` whose elements are represented by SplitReal32<br>in the first ntuple and by Real32 in the second. |


This mentions explicitly collections. Is the feature (merging RNTuple with 'same' column with different representation) not supported for simple type (i.e. just a float instead of a vector<float>)? If not, why not?

It is supported but it requires no feature flag because it produces files that are still readable by previous ROOT versions.
The feature flag is only added if the feature produces files that are forward-incompatible for older versions, which is only the case for collections.

pcanal · 2026-05-23T05:24:36Z

I did not mean to close this.

hahnjo

Some initial comments for the preparatory changes. I wasn't able to properly review the actual changes in the merger, it's probably me missing enough context and time to thoroughly think it through...

jblomer

Thank you! That is a huge step forward.

With all the added complexity, I think we need to give the merger another fresh look and try reduce the logic a bit to smaller blocks. For a future PR.

We should remember to backport

[ntuple] Fix RFieldFundamental::GenerateColumns using the wrong repr idx
[ntuple] update merger test to make sure we test nRepetitions
[ntuple] Clarify a bit RFieldBase::EntryToColumnElementIndex
[ntuple] Introduce feature flag 0 (nested deferred columns)
Patch up column ids in the header extension (part of [ntuple] Support multiple column representations in the merger)

jblomer · 2026-06-15T13:30:07Z

+   const RColumnRange *TryGetColumnRange(ROOT::DescriptorId_t physicalId) const
+   {
+      if (auto it = fColumnRanges.find(physicalId); it != fColumnRanges.end())
+         return &it->second;
+      return nullptr;
+   }


I think this is only used once and can be replaed by ContainsColumn() followed by GetColumnRange() (at the cost of a small performance penalty). I think it's not worth it at this point adding the public method.

On the other hand, is it really a problem to have the method? I imagine this could be useful for users in general and since we're hiding the underlying maps we should not force a suboptimal access pattern via the API imho.

jblomer · 2026-06-16T12:58:00Z

+         if (!projectionPointeesInitialized) {
+            for (const auto &field : descCmp.fExtraDstFields) {
+               if (field->IsProjectedField())
+                  projectionPointees[field->GetProjectionSourceId()].push_back(field);
+            }
+            projectionPointeesInitialized = true;
+         }


Again, for simplicity, we could always fill that map

jblomer · 2026-06-16T13:09:38Z

   }
 }

+TEST(RNTupleMerger, MergeDeferredAdvanced2)


The commit message says that we test fields with a different number of column representations. Can you pinpoint where this happens?

jblomer · 2026-06-16T13:24:03Z

-      if (std::find_if(fColumnRepresentatives.begin(), fColumnRepresentatives.end(),
-                       [&r](const auto &rep) { return r == rep.get(); }) == fColumnRepresentatives.end())
-         fColumnRepresentatives.emplace_back(*itRepresentative);
+      fColumnRepresentatives.emplace_back(*itRepresentative);


Perhaps also for backport.

Instead of calling continue multiple times in the AddColumnFromField loop, just early return in case of projected fields.

We are currently serializing columns per-field, but in case of late column extension this might result in inconsistent sorting of the columns in the serialized footer. e.g. assume you have fields "A" and "B", both late model extended, both with a single column: - col 0 -> field A, repr 0 - col 1 -> field B, repr 0 Now you add a new column representation to field "A"; this new column has id 2: - col 2 -> field A, repr 1 When serializing this RNTuple, all columns are written in the footer by RNTupleSerialize::SerializeColumnsForFields(). Before this change, they would end up on disk in order: [0, 2, 1]. This would corrupt the data by swapping the pages for columns 2 and 1. After this change, they get written as [0, 1, 2] which is the correct order. Note that this exact case is tested in ntuple_merger in the unit test MergeDeferredAdvanced.

Also fix the type of result

This is necessary to support multiple representations that use the same column type but different metadata (e.g. different bit width on Real32Trunc columns)

In code, this feature is represented by an update in the logic of RClusterDescriptorBuilder::AddExtendedColumnRanges(), which now handles properly the case where a column is added in a later cluster and therefore cannot rely on the (NEntries * NRepetitions) calculation, so it instead copies its FirstElementIndex and NElements from the 0th representation (which is guaranteed to have valid numbers for them).

- RPagePersistentSink::AddColumnRepresentation - RPagePersistentSink::AddAliasColumn Internal functionality to be used by the Merger. This entails 2 additional changes: - AddExtendedColumnRanges needs to be updated to handle the case where a column representation is added to a field during writing after some clusters have already been written; - ShiftAliasColumns needs to properly shift the ids of extended alias columns when called, otherwise a mismatch may happen when serializing the footer

Two field descriptors might legitimately have different numbers of elements in their LogicalColumnIds vectors due to having a different number of representations. The proper check to do is the one on the column cardinality, which makes sure that the same number of columns is actually used at a time.

This test checks two things: 1. that the fix applied by 14ade04 works properly (this is already checked by MergeDeferredAdvanced but that might not be the case anymore if we update the merger to change the column encodings depending on the output compression - in which case that test would not be adding a late extended column anymore); 2. that the merger properly handles fields with the same cardinality but different number of column representations

…ation Also set the anchor version in InitFromDescriptor().

silverweed requested a review from jblomer as a code owner April 22, 2026 15:07

silverweed marked this pull request as draft April 22, 2026 15:07

silverweed changed the title ~~Ntuple merge colrep2~~ [ntuple] Support multiple column representations in the merger Apr 22, 2026

silverweed added the in:RNTuple label Apr 22, 2026

silverweed self-assigned this Apr 22, 2026

silverweed force-pushed the ntuple_merge_colrep2 branch 3 times, most recently from b2ae5fc to 1db6b5e Compare April 22, 2026 15:22

silverweed force-pushed the ntuple_merge_colrep2 branch 9 times, most recently from 82e5299 to d3efcfc Compare April 30, 2026 09:45

silverweed force-pushed the ntuple_merge_colrep2 branch 2 times, most recently from 857d425 to 60b1bde Compare May 4, 2026 13:36

silverweed mentioned this pull request May 4, 2026

[ntuple] Some changes/fixes to better handle feature flags #22135

Merged

2 tasks

silverweed force-pushed the ntuple_merge_colrep2 branch 3 times, most recently from 1d462ee to a249301 Compare May 5, 2026 07:42

silverweed force-pushed the ntuple_merge_colrep2 branch from a249301 to ab08930 Compare May 13, 2026 15:05

silverweed marked this pull request as ready for review May 15, 2026 06:48

silverweed requested review from enirolf, hahnjo, pcanal and vepadulano May 15, 2026 06:48

pcanal closed this May 23, 2026

pcanal reviewed May 23, 2026

View reviewed changes

pcanal reopened this May 23, 2026

hahnjo reviewed Jun 3, 2026

View reviewed changes

Comment thread tree/ntuple/inc/ROOT/RField/RFieldFundamental.hxx

Comment thread tree/ntuple/src/RNTupleSerialize.cxx

Comment thread tree/ntuple/src/RNTupleDescriptor.cxx

silverweed added this to the 6.42.00 milestone Jun 8, 2026

silverweed force-pushed the ntuple_merge_colrep2 branch 3 times, most recently from 4697f42 to db9c393 Compare June 9, 2026 13:47

jblomer reviewed Jun 16, 2026

View reviewed changes

silverweed added 17 commits June 17, 2026 15:19

[ntuple] Fix RFieldFundamental::GenerateColumns using the wrong repr idx

92ef8ff

[ntuple] Small improvement in RNTupleMerger

979768c

Instead of calling continue multiple times in the AddColumnFromField loop, just early return in case of projected fields.

[ntuple] update merger test to make sure we test nRepetitions

93ada51

[ntuple] Clarify a bit RFieldBase::EntryToColumnElementIndex

e1599a0

Also fix the type of result

[ntuple] Allow adding duplicate column representations

1d1d5a3

This is necessary to support multiple representations that use the same column type but different metadata (e.g. different bit width on Real32Trunc columns)

[ntuple] Add RClusterDescriptor::TryGetColumnRange

a88fefe

[ntuple] Add multiple column representation support in the Merger

9b7db69

[ntuple] Merger: extract column repr matching into a separate function

cd91ac2

[ntuple] support merging columns with metadata (with different types)

5ab7637

[ntuple] Support merging columns with same type and different metadata

b3e6c54

[ntuple][NFC] update Merging.md

1feb2aa

[ntuple] ensure valid descriptor in AddAliasColumn/AddColumnRepresent…

9eaef1b

…ation Also set the anchor version in InitFromDescriptor().

silverweed force-pushed the ntuple_merge_colrep2 branch from db9c393 to 9eaef1b Compare June 17, 2026 15:07

Conversation

silverweed commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

This Pull request:

IMPORTANT

TODO

Checklist:

Uh oh!

github-actions Bot commented Apr 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test Results

Uh oh!

pcanal commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pcanal May 23, 2026

Choose a reason for hiding this comment

Uh oh!

silverweed May 25, 2026

Choose a reason for hiding this comment

Uh oh!

pcanal commented May 23, 2026

Uh oh!

hahnjo left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jblomer left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jblomer Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

silverweed Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

jblomer Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jblomer Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jblomer Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

silverweed commented Apr 22, 2026 •

edited

Loading

github-actions Bot commented Apr 22, 2026 •

edited

Loading

pcanal commented May 23, 2026 •

edited

Loading