Optimize get words #1015

GiovanniBussi · 2024-02-02T08:47:12Z

I managed to optimize getWords.

@carlocamilloni can you also check the performance in this branch?

In short, there's a new getWords written ad hoc for plumed_cmd strings, which is significantly faster for the following two reasons:

I removed the possibility to use complex syntax (parenthesis, tabs etc) in cmd string. I think this is fine, please let me know if anyone has concerns.
I used a small_vector<std::string_view> to avoid memory allocations. std::string_view is c++17, small_vector is an external library that I included here, similarly to how we include lepton and other libraries. Notice that the small_vector library triggers a bug with gcc 8 that I fixed here.

Before merging this branch I want to check with the author of the small_vector library if my fix is ok. To me, small_vector is a very useful tool that might allow optimizations in many places in PLUMED (basically, whenever we waste time allocating small temporary vectors). If I cannot sort this out, I can easily reimplement the optimization of getWords using a fixed size std::array (say with a maximum size of 4 words), that would work certainly in case of interpreting plumed_cmd strings.

This PR closes #1011

I was planning to implement something similar but I found this on GitHub and I decided to embed it. https://github.com/gharveymn/small_vector It seems well written (certainly better than what I would do), does exactly what we need, and the licence allows this. As usual this is putting everyting in PLMD namespace For reference, this is d98388b21b73be55e8c36ee18809e19a6fa82d34 of the original repository

Unfortunately it seems gcc 8 has a bug and cannot recognize a noexcept statement. My understanding is that this statement can be removed since it would just lead to a worse performance in some specific operation on some types of objects, but not to errors. I thus added a check so that this statement is removed by the preprocessor when using GCC 8 or less.

I basically reimplemented a simpler version of getWords which has some limitation: - only understand space (" ") as separator (no newlines or tabs) - do not understand braces This implies we cannot anymore do: plumed_cmd(p,"{init}") or plumed_cmd(p,"init\t") I think we never meant plumed_cmd to be used in this way. If we want to pass more structured strings, we can always pass them as the second argument. So I guess this is not a problem. Another way to make it faster is to avoid allocations. Instead of returning a std::vector<std::string>, we accept a reference to a preallocated small_vector<std::string_view>. This should require literally zero new allocations, as long as the number of words is low (for us it's always 0, 1 or 2). This required some modifications to the PlumedMain code, because - these std::string_view should be converted to std::string in some cases (with a subsequent allocation) - I had to modify the word_map object to be able to access it with a std::string_view With this optimization, and a simple test with a 20 atoms system, 4 of which used in a biased CV, I can gain ~ 20% speed on my laptop

codecov-commenter · 2024-02-02T09:29:26Z

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (f210467) 84.18% compared to head (89193cd) 84.17%.

❗ Current head 89193cd differs from pull request most recent head 0329172. Consider uploading reports for the commit 0329172 to get more accurate results

Files	Patch %	Lines
src/core/PlumedMain.cpp	90.90%	2 Missing ⚠️
src/tools/Tools.cpp	91.66%	1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1015      +/-   ##
==========================================
- Coverage   84.18%   84.17%   -0.02%     
==========================================
  Files         612      613       +1     
  Lines       56497    56568      +71     
==========================================
+ Hits        47563    47614      +51     
- Misses       8934     8954      +20

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

carlocamilloni · 2024-02-02T17:08:42Z

@GiovanniBussi confirmed, with 11,500 ns/day vs 10,600 ns/day without this and 10,700 ns/day before htt.
Here the gain appears in the total time (~9 s vs 10.6 s) but not in the detailed timers.

carlocamilloni · 2024-02-02T17:10:04Z

as a comparison without plumed we are 25,000 ns/day

GiovanniBussi · 2024-02-02T17:11:54Z

Here the gain appears in the total time (~9 s vs 10.6 s) but not in the detailed timers

This is expected. I think we are not including in the detailed timers what happens between when we call plumed_cmd and we start doing something. We always assumed this part of the code is not impacting performances.

I think this mostly relevant for very small (and irrelevant :-)) systems, but still it's a good exercise in hunting for bottlenecks.

carlocamilloni · 2024-02-02T17:28:40Z

i agree I tested a 110,000 atom using saxs (the one for which we had 2.5% loss of performances by moving to htt) and the loss of performance is still exactly 2.5% irrespectively of all the optimisations done from that time. The main difference is in the "waiting data" (left is today, middle is before htt, with the htt at that time)

but this test is short (250 ps using 4 openMP threads)

As suggested by the author: gharveymn/small_vector#5 (comment)

GiovanniBussi added 3 commits February 1, 2024 22:31

GiovanniBussi added the wip Do not merge label Feb 2, 2024

GiovanniBussi added 3 commits February 4, 2024 18:48

Fixed small vector

e2ed95b

As suggested by the author: gharveymn/small_vector#5 (comment)

Merge branch 'master' into optimize-getWords

6e3ba0d

Merge branch 'master' into optimize-getWords

0329172

GiovanniBussi merged commit 0329172 into master Feb 9, 2024
32 of 34 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize get words #1015

Optimize get words #1015

GiovanniBussi commented Feb 2, 2024

codecov-commenter commented Feb 2, 2024 •

edited

Loading

carlocamilloni commented Feb 2, 2024

carlocamilloni commented Feb 2, 2024

GiovanniBussi commented Feb 2, 2024

carlocamilloni commented Feb 2, 2024 •

edited

Loading

Optimize get words #1015

Optimize get words #1015

Conversation

GiovanniBussi commented Feb 2, 2024

codecov-commenter commented Feb 2, 2024 • edited Loading

Codecov Report

carlocamilloni commented Feb 2, 2024

carlocamilloni commented Feb 2, 2024

GiovanniBussi commented Feb 2, 2024

carlocamilloni commented Feb 2, 2024 • edited Loading

codecov-commenter commented Feb 2, 2024 •

edited

Loading

carlocamilloni commented Feb 2, 2024 •

edited

Loading