
Using ParallelTaskManager in ProjectPoints, ThreeBodyGFunctions and MatrixVectorProduct #1214

Open
wants to merge 12 commits into final-backpropegation
Conversation

gtribello (Member)

Description

@Iximiel this change uses the ParallelTaskManager for some other methods in PLUMED. In making these work, I made some changes to ParallelTaskManager. Can you check whether this revised code still works on the GPU, please? If it does, then I will merge it into the final-backpropegation branch.

In terms of running these things on the GPU, I reckon that you can try with MatrixVectorProductBase and with MatrixProductDiagonal. I think ProjectPoints will never run on the GPU. Similarly, ThreeBodyGFunctions uses lepton, so it will not run on the GPU. I don't think it is important to get this stuff running on the GPU, as none of it is very computationally expensive. I started with these classes because they were relatively easy to convert to the new way of doing things, so I could build some familiarity with this sort of conversion before starting on the hard stuff.

Target release

I would like my code to appear in release 2.11

Type of contribution
  • changes to code or doc authored by PLUMED developers, or additions of code in the core or within the default modules
  • changes to a module not authored by you
  • new module contribution or edit of a module authored by you
Copyright
  • I agree to transfer the copyright of the code I have written to the PLUMED developers or to the author of the code I am modifying.
  • the module I added or modified contains a COPYRIGHT file with the correct license information. Code should be released under an open source license. I also used the command cd src && ./header.sh mymodulename in order to make sure the headers of the module are correct.
Tests
  • I added a new regtest or modified an existing regtest to validate my changes.
  • I verified that all regtests are passed successfully on GitHub Actions.

Iximiel (Member) commented Mar 4, 2025

@gtribello What is the priority between the actions from this PR and volume and secondary structures?

Iximiel (Member) commented Mar 4, 2025

@gtribello matrix view is for working with sparse matrices?

gtribello (Member, Author)

> @gtribello What is the priority between the actions from this PR and volume and secondary structures?

secondary structure is way more important!

gtribello (Member, Author)

> @gtribello matrix view is for working with sparse matrices?

Yes, but only with the way I have implemented sparse matrices in PLUMED. There may be a better way to implement them in the longer term.
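
To sketch the idea (with illustrative names, not the actual PLUMED interface): each row of the matrix stores only its nonzero entries together with their column indices, so a task doing a matrix-vector product only loops over the nonzeros in its row.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row-compressed view, for illustration only; the real
// bookkeeping in PLUMED may be organised differently.
struct SparseRowView {
  const double* values;        // nonzero values in this row
  const std::size_t* columns;  // column index of each nonzero
  std::size_t nnonzero;        // number of nonzeros in this row
};

// Matrix-vector product restricted to one row: y[row] = sum_k A[row][col_k] * x[col_k]
double rowDotProduct(const SparseRowView& row, const std::vector<double>& x) {
  double sum = 0.0;
  for (std::size_t k = 0; k < row.nnonzero; ++k) {
    sum += row.values[k] * x[row.columns[k]];
  }
  return sum;
}
```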

gtribello (Member, Author) commented Mar 6, 2025

Hi @Iximiel

I added an implementation of the RMSD action that underpins my implementation of the path CVs here, which can potentially be parallelised on the GPU. I think this is something we should work on getting onto the GPU once the secondary structure variables are done, for two reasons:

  1. It is a logical next step. Paths use RMSD much like the secondary structure variables, so if you have RMSD working for the secondary structure variables, it shouldn't be too hard to get it working for the path CVs as well.
  2. This is something that Francesco Gervasio (who is now paying some of the bills) might be very interested in having. I think he uses these PathCVs a lot, so if we have a fast implementation of them on the GPU, it would be a good thing.

I think the first step is to download this branch and check that you can still compile and run everything that you have done so far on the GPU (i.e. multicolvars and secondary structure variables). If that works, we can merge this into the final-backpropegation branch and then start work on the GPU version of RMSDVector.

It is perhaps worth noting that I had to make some changes to the way that ParallelTaskManager operates. The most significant one relaxes an assumption that I made in earlier versions of the code, so it is worth explaining some terminology that I have introduced in this regard.

So, as you know, a multicolvar basically calculates a vector or a set of vectors. These vectors are stored in PLMD::Value objects that are then passed to other actions. The main loop is parallelised over tasks, and each task calculates one element of each of the vectors stored in the PLMD::Value objects that the multicolvar passes on. Consequently, the old code assumed that, if an action has four PLMD::Value components, performTask will return four scalars.
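
In rough pseudo-C++, the old contract looked something like this (a sketch with made-up names, not the real performTask signature):

```cpp
#include <cstddef>
#include <vector>

// Sketch of the old assumption (illustrative only, not PLUMED's actual API):
// an action with N PLMD::Value components has each task write exactly one
// scalar per component, at the slot matching its task index.
void performTaskSketch(std::size_t task_index,
                       std::vector<std::vector<double>>& components) {
  for (std::size_t i = 0; i < components.size(); ++i) {
    // one scalar per PLMD::Value per task under the old contract
    components[i][task_index] = 0.0;  // placeholder for the computed value
  }
}
```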

In this version I have relaxed this assumption. There are thus two important quantities in ParallelActionsInput:

  1. input.ncomponents = Number of PLMD::Value objects that are being calculated
  2. input.nscalars = Number of scalars that are being calculated during the task.

In all the cases we have looked at so far, input.nscalars = input.ncomponents. In RMSDVector, and in some other things that we will get to in the coming weeks, input.nscalars > input.ncomponents. In other words, there will be tasks that calculate more than one of the scalars that are stored in the underlying PLMD::Value objects. I think I have made this change everywhere it needs to be made, but I thought I would explain it just in case I have made a mistake when modifying the GPU code.
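
As a sketch of what the relaxed contract means for the output layout (illustrative names only, not the actual data structures):

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch only: when input.nscalars > input.ncomponents a single
// task produces several of the scalars that end up in the PLMD::Value objects.
struct TaskBufferSketch {
  std::size_t ncomponents;     // number of PLMD::Value objects being filled
  std::size_t nscalars;        // scalars produced per task, >= ncomponents
  std::vector<double> buffer;  // size: ntasks * nscalars, written task by task
};

// Each task writes into its own contiguous slice of the shared buffer;
// the scalars are scattered into the PLMD::Value objects afterwards.
double* taskSlice(TaskBufferSketch& out, std::size_t task) {
  return out.buffer.data() + task * out.nscalars;
}
```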

To be clear, the secondary structure variables are still the main priority. Once those are done, though, it would be good to move on to getting this branch merged and the RMSDVector stuff working. You can find tests for this action in regtest/mapping. mapping/rt39 is a good place to start.

Iximiel (Member) commented Mar 6, 2025

As of now it compiles, but it crashes on the GPU.
I think there is some problem with how we should manage the ParallelActionsInput in the CPU/GPU data transfer, but I need to dig deeper into this.

gtribello (Member, Author)

OK. I imagine the most likely culprit is the std::vector called args.
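
For the record, the generic failure mode with a std::vector member is that the pointer it hides is only valid on the host. A schematic sketch (hypothetical structs, not the actual ParallelActionsInput definition):

```cpp
#include <cstddef>
#include <vector>

// Schematic only, not the real ParallelActionsInput. A std::vector member
// stores a pointer into host heap memory, so a shallow copy of this struct
// onto the device leaves that pointer dangling there.
struct InputWithVector {
  std::vector<double> args;  // host-only heap pointer hidden inside
};

// A flat layout that survives a byte-wise copy to the device, provided the
// buffer that args points to is allocated or mapped on the device separately.
struct FlatInput {
  const double* args;  // device-visible buffer, managed explicitly
  std::size_t nargs;   // element count travels alongside the pointer
};
```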
