Interleave IO operations with kernel calculation #23

Open
felipeZ opened this issue Feb 13, 2020 · 1 comment

felipeZ commented Feb 13, 2020

Currently, all IO and kernel operations happen in a single stream. Performance would increase significantly if we interleaved multiple streams.

@felipeZ felipeZ added the enhancement New feature or request label Feb 13, 2020
@felipeZ felipeZ self-assigned this Feb 13, 2020
@benvanwerkhoven

I've been thinking about this. To overlap these operations you do indeed have to use streams, so that computation in one stream can overlap with data transfers in other streams. It might be enough to use multiple threads, one for each stream. However, I know that in a single-threaded application the host memory must be allocated as page-locked (pinned) memory, so that the cudaMemcpy operations can be performed by DMA. That is the only way to make the async API calls truly asynchronous with respect to the host.
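
A minimal single-threaded sketch of the overlap described above, assuming the work can be split into independent chunks. The `scale` kernel, the `run_pipeline` helper, and the chunking scheme are hypothetical stand-ins for illustration and are not taken from this project. The host buffer is allocated with `cudaMallocHost` so that `cudaMemcpyAsync` can return without waiting for the transfer.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",          \
                         cudaGetErrorString(err_), __FILE__, __LINE__); \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel standing in for the real computation.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Splits n_total floats into one chunk per stream and issues
// copy-in, kernel, copy-out on each stream. Returns true if every
// element came back doubled.
bool run_pipeline(int n_total, int n_streams) {
    const int chunk = (n_total + n_streams - 1) / n_streams;
    const size_t total_bytes = static_cast<size_t>(n_total) * sizeof(float);

    float* h_data = nullptr;  // page-locked host buffer
    float* d_data = nullptr;  // device buffer
    CUDA_CHECK(cudaMallocHost(reinterpret_cast<void**>(&h_data), total_bytes));
    CUDA_CHECK(cudaMalloc(reinterpret_cast<void**>(&d_data), total_bytes));
    for (int i = 0; i < n_total; ++i) h_data[i] = 1.0f;

    std::vector<cudaStream_t> streams(n_streams);
    for (auto& s : streams) CUDA_CHECK(cudaStreamCreate(&s));

    for (int s = 0; s < n_streams; ++s) {
        const int offset = s * chunk;
        if (offset >= n_total) break;
        const int count = std::min(chunk, n_total - offset);
        const size_t bytes = static_cast<size_t>(count) * sizeof(float);

        // Operations within one stream run in order; operations in
        // different streams are allowed to overlap.
        CUDA_CHECK(cudaMemcpyAsync(d_data + offset, h_data + offset, bytes,
                                   cudaMemcpyHostToDevice, streams[s]));
        scale<<<(count + 255) / 256, 256, 0, streams[s]>>>(d_data + offset,
                                                           count, 2.0f);
        CUDA_CHECK(cudaGetLastError());
        CUDA_CHECK(cudaMemcpyAsync(h_data + offset, d_data + offset, bytes,
                                   cudaMemcpyDeviceToHost, streams[s]));
    }

    // Wait for all streams before reading the results on the host.
    for (auto& s : streams) CUDA_CHECK(cudaStreamSynchronize(s));

    bool ok = true;
    for (int i = 0; i < n_total; ++i) {
        if (h_data[i] != 2.0f) { ok = false; break; }
    }

    for (auto& s : streams) CUDA_CHECK(cudaStreamDestroy(s));
    CUDA_CHECK(cudaFree(d_data));
    CUDA_CHECK(cudaFreeHost(h_data));
    return ok;
}
```

Calling `run_pipeline(1 << 20, 4)` from `main` would exercise four streams. Whether the copies and kernels actually overlap depends on the device and is best confirmed with a profiler timeline rather than assumed.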

Perhaps you won't need it because you will be using multiple threads, in which case it might not hurt performance when the CPU thread blocks on cudaMemcpyAsync. But if you don't see any overlap between copies in one stream and copies in the opposite direction or computations in other streams, this could be the cause. Also, I expect the achieved bandwidth of cudaMemcpy to increase significantly if you allocate host memory that is page-locked and aligned. Depending on how Eigen is coded, though, achieving this might require modifying Eigen; I haven't checked that.
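
One way to check the bandwidth expectation above on a given machine is to time the same host-to-device copy from a pageable buffer and from a page-locked one. The sketch below is only illustrative: `h2d_bandwidth_gbs`, `compare_bandwidth`, and `BandwidthResult` are hypothetical names, the buffers are plain allocations rather than Eigen matrices, and the numbers it produces depend on the hardware.

```cuda
#include <cuda_runtime.h>
#include <cstdlib>
#include <cstring>

// Times `repeats` host-to-device copies of `bytes` bytes using CUDA
// events and returns the achieved bandwidth in GB/s, or -1 on error.
float h2d_bandwidth_gbs(const void* h_src, void* d_dst,
                        size_t bytes, int repeats) {
    cudaEvent_t start, stop;
    if (cudaEventCreate(&start) != cudaSuccess) return -1.0f;
    if (cudaEventCreate(&stop) != cudaSuccess) {
        cudaEventDestroy(start);
        return -1.0f;
    }

    bool ok = cudaEventRecord(start, 0) == cudaSuccess;
    for (int r = 0; ok && r < repeats; ++r) {
        ok = cudaMemcpy(d_dst, h_src, bytes,
                        cudaMemcpyHostToDevice) == cudaSuccess;
    }
    ok = ok && cudaEventRecord(stop, 0) == cudaSuccess;
    ok = ok && cudaEventSynchronize(stop) == cudaSuccess;

    float ms = 0.0f;
    ok = ok && cudaEventElapsedTime(&ms, start, stop) == cudaSuccess;
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    if (!ok || ms <= 0.0f) return -1.0f;

    const double gigabytes = static_cast<double>(bytes) * repeats / 1e9;
    return static_cast<float>(gigabytes / (ms / 1e3));
}

struct BandwidthResult {
    float pageable_gbs;  // source allocated with malloc
    float pinned_gbs;    // source allocated with cudaMallocHost
};

// Measures host-to-device bandwidth for both kinds of host buffer.
// A field is -1 if its measurement could not be completed.
BandwidthResult compare_bandwidth(size_t bytes, int repeats) {
    BandwidthResult result = {-1.0f, -1.0f};

    void* d_buf = nullptr;
    if (cudaMalloc(&d_buf, bytes) != cudaSuccess) return result;

    // Pageable host memory.
    void* h_pageable = std::malloc(bytes);
    if (h_pageable != nullptr) {
        std::memset(h_pageable, 1, bytes);  // touch the pages first
        result.pageable_gbs =
            h2d_bandwidth_gbs(h_pageable, d_buf, bytes, repeats);
        std::free(h_pageable);
    }

    // Page-locked (pinned) host memory.
    void* h_pinned = nullptr;
    if (cudaMallocHost(&h_pinned, bytes) == cudaSuccess) {
        std::memset(h_pinned, 1, bytes);
        result.pinned_gbs =
            h2d_bandwidth_gbs(h_pinned, d_buf, bytes, repeats);
        cudaFreeHost(h_pinned);
    }

    cudaFree(d_buf);
    return result;
}
```

For example, `compare_bandwidth(64u << 20, 10)` copies a 64 MiB buffer ten times from each kind of allocation. Comparing the two fields gives a rough indication of how much pinning helps before deciding whether changes to the allocation path are worth it.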
