
Exploiting VectorTyped to avoid copies during the NL construction#1401

Open
Iximiel wants to merge 1 commit into plumed:master from Iximiel:feature/dataInNL

Conversation

Member

@Iximiel Iximiel commented Apr 22, 2026

Description

First of all, I added a set of tests that can be useful to understand how the NL splits its work in all the possible situations (serial, serial+OpenMP, MPI, MPI+OpenMP).

Then I exploited the trick PLUMED uses to transfer Vectors with MPI, to avoid creating an extra unsigned array that gets gathered and then copied into the local neighbor list.
In a serial or non-MPI run this skips a significant portion of code, since it works directly on the neighbors_ array. In an MPI run the code still uses a temporary array, but communicates directly into the main neighbors_ array, skipping the last copy.

I think this will lower the RAM footprint that the NL imposes for larger systems.

I have a quick benchmark of 60-second drag races (how many steps complete in 60 seconds) with 4 OpenMP threads:

| Atoms | LC | LC-v | Δ% LC→LC-v | NL | NL-v | Δ% NL→NL-v |
|---|---|---|---|---|---|---|
| 125 | 506813 | 557075 | +9.92% | 445501 | 456190 | +2.40% |
| 1000 | 42082 | 44953 | +6.82% | 38427 | 40167 | +4.53% |
| 8000 | 6705 | 6732 | +0.40% | 3201 | 3342 | +4.40% |
| 27000 | 1559 | 1586 | +1.73% | 501 | 517 | +3.19% |
| 42875 | 946 | 960 | +1.48% | 201 | 221 | +9.95% |

And with 2 MPI processes with 3 OpenMP threads each:

| Atoms | LC | LC-v | Δ% LC→LC-v | NL | NL-v | Δ% NL→NL-v |
|---|---|---|---|---|---|---|
| 125 | 329986 | 335234 | +1.59% | 278411 | 278566 | +0.06% |
| 1000 | 29640 | 29800 | +0.54% | 25404 | 25970 | +2.23% |
| 8000 | 3909 | 3879 | −0.77% | 1941 | 2001 | +3.09% |
| 27000 | 901 | 894 | −0.78% | 281 | 301 | +7.12% |
| 42875 | 531 | 501 | −5.65% | 121 | 121 | 0.00% |

NL runs this:

```
cpu:     COORDINATION GROUPA=@mdatoms GROUPB=@mdatoms SWITCH={RATIONAL R_0=0.5 NN=6 MM=10 D_MAX=2.0} NLIST      NL_CUTOFF=3.5 NL_STRIDE=20
```

LC runs this:

```
cpucl:   COORDINATION GROUPA=@mdatoms GROUPB=@mdatoms SWITCH={RATIONAL R_0=0.5 NN=6 MM=10 D_MAX=2.0} NLISTCELLS NL_CUTOFF=2.0 NL_STRIDE=1
```

**-v** marks this branch's performance. Considering that LC does not make the extra copy, I am surprised it improves at all, since no gain was expected there. NL shows a smaller improvement, but its algorithm runs only every 20 steps, whereas for LC it runs at each step.
(Being lazy, I had an AI compute the % columns; the step counts are correct, I double-checked them.)

With MPI it looks less attractive, but I only did a single run for each of the two tries on my workstation.

If this works for you, it will be the base of the "standard" NL accelerated by the linked-cells algorithm. It is not strictly necessary for that (a somewhat working version already exists), but I believe that rebasing that work on these modifications will enhance its performance a little more.

Target release

I would like my code to appear in release v2.11

Type of contribution
  • changes to code or doc authored by PLUMED developers, or additions of code in the core or within the default modules
  • changes to a module not authored by you
  • new module contribution or edit of a module authored by you
Copyright
  • I agree to transfer the copyright of the code I have written to the PLUMED developers or to the author of the code I am modifying.
  • the module I added or modified contains a COPYRIGHT file with the correct license information. Code should be released under an open source license. I also used the command cd src && ./header.sh mymodulename in order to make sure the headers of the module are correct.
Tests
  • I added a new regtest or modified an existing regtest to validate my changes.
  • I verified that all regtests are passed successfully on GitHub Actions.

Comment thread src/tools/NeighborList.cpp Fixed
@github-advanced-security

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

  • The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
  • Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
  • You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

Member Author

Iximiel commented Apr 28, 2026

@GiovanniBussi I redid the benchmarks: again steps in a fixed time, now with runs of 600 seconds.

mpirun with 2 processes with 3 threads each (I have 6 physical cores):

  • Here, in NL, the communication of the couples puts the result directly in the neighbors_ array, skipping a copy, but it still needs a temporary local array.
  • In Base the extra cost comes from converting the array into pairs, I think.
  • In LC the branch changes the push_back (I think?); note that LC does not exploit OpenMP or MPI in PLMD::NeighborList, but the calculation of which atom is in which cell is distributed with MPI.
  • 3375* appears twice because the 57491 steps for LC were too far out of the trend, so I reran the benchmark for every configuration; it is interesting as a consistency check.
  • I think that the 19941 in the 8000-atom row of the NL-v column is also a lucky shot for my branch, or an unlucky run for master.
| #atoms | NL | NL-v | Δ% NL→NL-v | LC | LC-v | Δ% LC→LC-v | Base | Base-v | Δ% Base→Base-v |
|---|---|---|---|---|---|---|---|---|---|
| 125 | 2779181 | 2781688 | +0.09% | 3294460 | 3368342 | +2.24% | 3700608 | 3691941 | -0.23% |
| 1000 | 254581 | 257540 | +1.16% | 295449 | 300581 | +1.74% | 95715 | 94087 | -1.70% |
| 3375 | 62526 | 63162 | +1.02% | 76009 | 57491 | -24.36% | 9129 | 9110 | -0.21% |
| 3375* | 62203 | 62934 | +1.18% | 75581 | 76189 | +0.80% | 9128 | 9110 | -0.20% |
| 8000 | 18081 | 19941* | +10.29% | 39019 | 38511 | -1.30% | 1622 | 1617 | -0.31% |
| 15625 | 7221 | 7481 | +3.60% | 16153 | 16276 | +0.76% | 431 | 427 | -0.93% |
| 27000 | 3081 | 3161 | +2.60% | 9308 | 9465 | +1.69% | 144 | 144 | +0.00% |
| 42875 | 1401 | 1441 | +2.86% | 5970 | 5939 | -0.52% | 57 | 56 | -1.75% |

With 4 OpenMP threads:

  • NL with only OpenMP skips the extra copy by merging each thread's list of couples directly into the neighbors_ array inside the #pragma omp critical section.
  • I have no idea why there is a boost in performance in LC.
| #atoms | NL | NL-v | Δ% NL→NL-v | LC | LC-v | Δ% LC→LC-v | Base | Base-v | Δ% Base→Base-v |
|---|---|---|---|---|---|---|---|---|---|
| 125 | 4447001 | 4518810 | +1.61% | 4998651 | 5537396 | +10.78% | 7443775 | 7419281 | -0.33% |
| 1000 | 371291 | 387762 | +4.44% | 419098 | 453133 | +8.12% | 164844 | 163544 | -0.79% |
| 3375 | 97551 | 100121 | +2.63% | 124122 | 130720 | +5.32% | 14740 | 13731 | -6.85% |
| 3375* | 97081 | 99262 | +2.25% | 124830 | 130810 | +4.79% | 14745 | 14614 | -0.89% |
| 8000 | 32202 | 33517 | +4.08% | 67039 | 69639 | +3.88% | 2620 | 2613 | -0.27% |
| 15625 | 12482 | 12756 | +2.20% | 28289 | 29102 | +2.87% | 677 | 687 | +1.48% |
| 27000 | 5201 | 5361 | +3.08% | 16367 | 16847 | +2.93% | 230 | 230 | +0.00% |
| 42875 | 2381 | 2481 | +4.20% | 10450 | 10883 | +4.14% | 91 | 91 | +0.00% |

Maybe, in the NL algorithm, creating the extra arrays with some reserved capacity, to avoid too many reallocations in the push_back, might help.

@Iximiel Iximiel force-pushed the feature/dataInNL branch from b1da497 to 58570ca Compare May 4, 2026 08:09
using the even simpler std::array to speed up the NL
@Iximiel Iximiel force-pushed the feature/dataInNL branch from 58570ca to 3afba3f Compare May 4, 2026 08:20
```cpp
                 &local_nl_size[0],
                 &disp[0]);
}
// no need for an else neighbors_.resize(0);
```