Exploiting VectorTyped to avoid copies during the NL construction #1401
Open
Iximiel wants to merge 1 commit into plumed:master from
d29a9e8 to
35ac680
Compare
35ac680 to
b1da497
Compare
This was referenced Apr 23, 2026
@GiovanniBussi I redid the benchmarks: again steps in fixed time, with runs of 600 seconds. mpirun with 2 processes with 3 threads each (I have 6 physical cores):
4 threads:
Maybe in the NL algorithm, creating the extra arrays with some capacity, to avoid too many reallocations in the push_back, might help.
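The suggestion about pre-sizing the extra arrays can be illustrated with a minimal C++ sketch. It is not PLUMED code: `countReallocations` is a hypothetical helper that only counts how often a `std::vector<unsigned>` changes its capacity while elements are appended, with and without a prior `reserve`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of PLUMED): counts how many times the
// vector's capacity changes while n elements are appended with push_back.
std::size_t countReallocations(std::size_t n, bool reserveFirst) {
  std::vector<unsigned> v;
  if (reserveFirst) {
    v.reserve(n);  // one allocation up front, capacity() >= n afterwards
  }
  std::size_t reallocations = 0;
  std::size_t capacity = v.capacity();
  for (std::size_t i = 0; i < n; ++i) {
    v.push_back(static_cast<unsigned>(i));
    if (v.capacity() != capacity) {  // the buffer had to grow
      ++reallocations;
      capacity = v.capacity();
    }
  }
  return reallocations;
}
```

With `reserveFirst` set, the loop never grows the buffer. Without it, the number of reallocations depends on the growth policy of the standard library, so whether this matters for the NL would still need to be measured.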
using the even simpler std::array to speed up the NL
        &local_nl_size[0],
        &disp[0]);
    }
    // no need for an else neighbors_.resize(0);
Description
First of all, I added a set of tests that can be useful to understand how the NL is breaking in all the possible situations (serial, serial+omp, mpi, mpi+omp).
Then I exploited the trick PLUMED uses to transfer Vectors with MPI, to avoid creating an extra unsigned array that is gathered and then copied into the local neighbor list.
In a serial or non-MPI run this skips a significant portion of code, since it works directly on the neighbors_ array. In an MPI run the code still uses a temporary array, but communicates directly into the main neighbors_ array, skipping the last copy.
I think this will lower the RAM tax that the NL imposes on the PC for larger systems.
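A minimal sketch of the idea of writing directly into the pair storage, under stated assumptions: `Pair` is a `std::array<unsigned, 2>` standing in for the two-component type used in the PR, and `fakeGather` is a hypothetical placeholder for the MPI gather, implemented here with `std::memcpy` so the example runs without MPI. None of these names are taken from the PLUMED sources.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Assumption: a pair of atom indices stored as two contiguous unsigned values.
using Pair = std::array<unsigned, 2>;

static_assert(std::is_trivially_copyable<Pair>::value,
              "Pair must be safe to fill byte-wise");
static_assert(sizeof(Pair) == 2 * sizeof(unsigned),
              "Pair must not contain padding");

// Hypothetical stand-in for the communication step: copies `count` unsigned
// values into `dst`. In an MPI run a gather call would fill `dst` instead.
void fakeGather(const unsigned* src, std::size_t count, void* dst) {
  if (count == 0) {
    return;  // nothing to copy, and the pointers may be null
  }
  std::memcpy(dst, src, count * sizeof(unsigned));
}

// Receives a flat stream of indices straight into the final pair storage,
// without an intermediate unsigned array that is copied element by element.
std::vector<Pair> receivePairs(const std::vector<unsigned>& wire) {
  std::vector<Pair> neighbors(wire.size() / 2);
  fakeGather(wire.data(), 2 * neighbors.size(), neighbors.data());
  return neighbors;
}
```

The two `static_assert`s state the layout assumptions that make the flat view valid. If the pair type had padding or a non-trivial copy, the buffer could not be filled this way.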
I have a fast benchmark on drag races of 60 seconds (how many steps in 60 seconds) with 4 omp threads:
And with 2 MPI processes with 3 OpenMP threads:
NL runs this:
LC runs this:
**-v is this thread's performance. Considering that the LC does not make the extra copy, I am surprised by its improvement, which I did not expect. NL has a smaller improvement, but its algorithm runs every 20 steps, whereas for LC it runs at each step.
(Being lazy, I had an AI calculate the % columns; the numbers of steps are correct, I double-checked them.)
With MPI it looks less attractive, but I only did a single run for each of the two tries on my workstation.
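For readers unfamiliar with the "steps in fixed time" measurement, here is a minimal C++ sketch of the idea. It is not the benchmark tool used for the numbers above: `stepsInFixedTime` and `percentChange` are hypothetical helpers, and the step is any callable.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `step` repeatedly until `budget` has elapsed and
// returns how many steps were executed. Always runs at least one step.
template <class Step>
std::size_t stepsInFixedTime(std::chrono::milliseconds budget, Step&& step) {
  const auto start = std::chrono::steady_clock::now();
  std::size_t steps = 0;
  do {
    step();
    ++steps;
  } while (std::chrono::steady_clock::now() - start < budget);
  return steps;
}

// Relative change, in percent, of `after` with respect to `before`.
double percentChange(double before, double after) {
  return 100.0 * (after - before) / before;
}
```

A single run of this kind is noisy, so comparing two code versions usually calls for several repetitions of each before reading much into a small percentage difference.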
If this works for you, it will be the base of the "standard" NL accelerated by the linked cells algorithm. This is not strictly necessary for that, since a somewhat working version already exists, but I believe that rebasing it on these modifications will enhance its performance a little more.
Target release
I would like my code to appear in release v2.11
Type of contribution
Copyright
COPYRIGHT file with the correct license information. Code should be released under an open source license. I also used the command cd src && ./header.sh mymodulename in order to make sure the headers of the module are correct.
Tests