Using Linkcells to accelerate the NL #1403
Draft
Iximiel wants to merge 7 commits into plumed:master
}
double value=modulo2(distance);
if(value<=d2) {
//neighbors_.push_back({A,B});
Comment on lines +295 to +297
//now cells are setup along all MPI ranks
//const unsigned stride=(serial_)? 1 : comm.Get_size();
//const unsigned rank =(serial_)? 0 : comm.Get_rank();
Comment on lines +299 to +302
//const unsigned elementsPerRank = std::ceil(double(nc)/stride);
const unsigned int start=0;// rank*elementsPerRank;
const unsigned int end = nc;//((start + elementsPerRank)< nc)?(start + elementsPerRank): nc;
//Initialization of List A and B is here because the access to them is threadsafe (at the moment of writing this)
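The commented-out lines above hint at a contiguous block split of the `nc` cells over the MPI ranks. As a minimal sketch of that arithmetic (the helper name `cellRange` is hypothetical and not part of PLUMED; `stride` is assumed to be at least 1), the range handled by each rank could be computed like this:

```cpp
#include <algorithm>
#include <utility>

// Hypothetical helper, not part of PLUMED: returns the half-open range
// [start, end) of cell indices assigned to `rank` when `nc` cells are split
// into contiguous blocks over `stride` ranks (stride >= 1 is assumed).
inline std::pair<unsigned, unsigned> cellRange(unsigned nc, unsigned rank,
                                               unsigned stride) {
  // Integer ceiling of nc/stride, avoiding the round trip through double.
  const unsigned elementsPerRank = (nc + stride - 1) / stride;
  // Clamp both ends so that ranks past the last block get an empty range.
  const unsigned start = std::min(rank * elementsPerRank, nc);
  const unsigned end = std::min(start + elementsPerRank, nc);
  return {start, end};
}
```

With `stride=1` and `rank=0` this reduces to the serial case in the snippet, `start=0` and `end=nc`.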
6ab15f8 to dc376ac
Description
Hello, I think this is the last (or at least the latest) PR in the NL series.
First of all, I still need to make some extra modifications to the code (like compacting it, moving the new template function into the LC header, and rebasing this on the modifications of #1401, if it gets accepted), but I wanted to show the current state of this to get some feedback.
I tried to combine LinkCells with the standard NL algorithm, to see if this speeds things up.
Along the way I rationalized the LinkCells loops into a template function and I tried to parallelize the LC version of the NL.
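To illustrate the idea of a cell loop written once as a template function, here is a minimal sketch. It is not the code of this PR and not the PLUMED LinkCells API: the names `forEachCellPair` and `buildNeighborList` are hypothetical, the box is assumed cubic and non-periodic, `cutoff` is assumed positive, and the points are assumed to lie inside the box.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec3 = std::array<double, 3>;

// Hypothetical sketch: bins the points of a cubic box [0, box)^3 into cells of
// side >= cutoff and calls f(i, j) exactly once for every pair i < j whose
// points sit in the same cell or in two adjacent cells.
template <typename PairFunction>
void forEachCellPair(const std::vector<Vec3>& pos, double box, double cutoff,
                     PairFunction&& f) {
  const int n = std::max(1, static_cast<int>(std::floor(box / cutoff)));
  const double side = box / n;
  auto cellOf = [&](double x) {
    return std::clamp(static_cast<int>(x / side), 0, n - 1);
  };
  auto index = [n](int x, int y, int z) {
    return (static_cast<std::size_t>(x) * n + y) * n + z;
  };
  // cells[c] holds the indices of the points that fall in cell c
  std::vector<std::vector<std::size_t>> cells(static_cast<std::size_t>(n) * n * n);
  for (std::size_t i = 0; i < pos.size(); ++i) {
    cells[index(cellOf(pos[i][0]), cellOf(pos[i][1]), cellOf(pos[i][2]))].push_back(i);
  }
  for (int x = 0; x < n; ++x)
    for (int y = 0; y < n; ++y)
      for (int z = 0; z < n; ++z)
        for (int dx = -1; dx <= 1; ++dx)
          for (int dy = -1; dy <= 1; ++dy)
            for (int dz = -1; dz <= 1; ++dz) {
              const int nx = x + dx, ny = y + dy, nz = z + dz;
              // no periodic images in this sketch: skip cells outside the box
              if (nx < 0 || ny < 0 || nz < 0 || nx >= n || ny >= n || nz >= n) continue;
              for (std::size_t i : cells[index(x, y, z)])
                for (std::size_t j : cells[index(nx, ny, nz)])
                  if (i < j) f(i, j);  // i < j avoids visiting a pair twice
            }
}

// One possible user of the loop: a neighbor list with a squared-distance test.
std::vector<std::pair<std::size_t, std::size_t>> buildNeighborList(
    const std::vector<Vec3>& pos, double box, double cutoff) {
  const double d2 = cutoff * cutoff;
  std::vector<std::pair<std::size_t, std::size_t>> neighbors;
  forEachCellPair(pos, box, cutoff, [&](std::size_t a, std::size_t b) {
    double value = 0.0;
    for (int k = 0; k < 3; ++k) {
      const double diff = pos[a][k] - pos[b][k];
      value += diff * diff;
    }
    if (value <= d2) neighbors.push_back({a, b});
  });
  return neighbors;
}
```

The callback is the only part that changes between users of the loop, so the binning and the cell traversal live in one place.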
Here are a few graphs; they follow the #1401 nomenclature and I used the same commands.
**-v** is this PR, and **-noomp** is this PR with the OpenMP code for the linkcell commented out.
These are the results on the NL algorithm:
The LC-NL "linearizes" the time and, contrary to LC (see below), it seems to benefit from the OpenMP parallelization.
As we predicted, the old implementation starts faster but scales worse than the new one. So I was thinking of keeping both implementations and deciding when to use one or the other, with a default behavior derived from some measurements that the user can override in the input.
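A minimal sketch of such a selection could look as follows. Everything in it is an assumption for illustration: the names `NLMode` and `chooseNLMode` are hypothetical, and the default `crossover` value is a placeholder, not a measured number.

```cpp
#include <cstddef>
#include <optional>

enum class NLMode { Standard, LinkCells };

// Hypothetical sketch: an explicit user choice always wins; otherwise the
// mode is picked from the system size. The default crossover is a
// placeholder that would have to come from benchmarks.
inline NLMode chooseNLMode(std::size_t nAtoms,
                           std::optional<NLMode> userChoice = std::nullopt,
                           std::size_t crossover = 10000) {
  if (userChoice) {
    return *userChoice;
  }
  return nAtoms < crossover ? NLMode::Standard : NLMode::LinkCells;
}
```

Keeping the threshold as a parameter makes it easy to replace the placeholder once the measurements are available.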
These are the timings on LC: **-v** has OpenMP, **-noomp** has the new interface but no OpenMP.
I do not think that adding OpenMP to LC is a good idea, but that was not the point of these modifications. Maybe OpenMP does not help here because LC works directly on the result array.
Target release
I would like my code to appear in release v2.11
Type of contribution
Copyright
COPYRIGHT file with the correct license information. Code should be released under an open source license. I also used the command `cd src && ./header.sh mymodulename` in order to make sure the headers of the module are correct.
Tests