Reduce allocations for multiplying LazyTensor of sparse and dense #80
Conversation
Codecov Report
@@ Coverage Diff @@
## master #80 +/- ##
==========================================
+ Coverage 92.61% 92.71% +0.10%
==========================================
Files 24 24
Lines 3089 3104 +15
==========================================
+ Hits 2861 2878 +17
+ Misses 228 226 -2
This is a great improvement, thank you for taking the initiative on it! Could you share a quick before/after benchmark?
Here is a quick before/after. Before:
After:
The change in performance is quite minimal in this microbenchmark (no change in runtime, but half the allocations), but it is still better than the status quo.
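The exact benchmark output is not reproduced above, but for context, here is a sketch of how such numbers are typically gathered; the matrix and vector here are generic stand-ins, not the package's operators or states:

```julia
# Illustrative only: measure runtime and allocations of an in-place
# sparse-times-dense multiplication with BenchmarkTools.
using BenchmarkTools, LinearAlgebra, SparseArrays

A = sprand(ComplexF64, 64, 64, 0.05)   # stand-in for a sparse factor
x = rand(ComplexF64, 64)               # stand-in for the dense state data
y = similar(x)

@benchmark mul!($y, $A, $x)            # compare the `memory` / `allocs` fields before and after the change
```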
Which for the current release gives:
And for this branch gives:
Curious that the operator case allocations go down way more than the state case. Do you know why?
benchmark notebooks
The only difference between `Ket` and `Operator` is the reshape of `Ket.data` to a `Matrix`.
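For context, reshaping a dense array in Julia shares the underlying memory but still builds a new array wrapper, which is itself a small allocation. An illustrative snippet, not the package's internal code:

```julia
using BenchmarkTools

psi_data = rand(ComplexF64, 6)       # stand-in for Ket.data on a (2, 3) composite basis
psi_mat  = reshape(psi_data, 2, 3)   # shares the same memory, but creates a new Array wrapper

@btime reshape($psi_data, 2, 3);     # small but nonzero allocation on every call
```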
Also, in the notebooks you can see that propagating an operator with a dense Hamiltonian has lots of allocations, whereas propagating a state does not. Is that a
Maybe it's a
This is JuliaLang/julia#46865, which I have been trying to get fixed!
This reduces the allocations for LazyTensor of sparse and dense. But it increases the allocations for LazySum of LazyTensor of dense by a bit.
I think the latest changes to avoid reshaping may be broken, but otherwise this looks good.
Update: Not broken. Now just suggesting we modify the Bra version too and relax to AbstractArray.
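A rough sketch of what relaxing the signature could look like (the helper name is hypothetical, not the package's actual method): dispatching on `AbstractArray` lets the same kernel accept the state data directly as a `Vector` or as an already-reshaped `Matrix`, without a conversion in between:

```julia
# Hypothetical kernel that only assumes AbstractArray, so both the raw vector
# and its reshaped matrix form hit the same method.
function _scale_into!(result::AbstractArray, source::AbstractArray, alpha::Number)
    @inbounds for i in eachindex(result, source)
        result[i] = alpha * source[i]
    end
    return result
end

v = rand(ComplexF64, 6)
_scale_into!(similar(v), v, 2.0)                       # works on the Vector directly
_scale_into!(similar(v, 2, 3), reshape(v, 2, 3), 2.0)  # and on a reshaped Matrix of the same data
```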
…mm_recursive_dense_lazy
Added dim check related to #74. Also, fix
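For illustration, a dimension check of the kind referred to here might look like the following (a hypothetical helper, not the exact code added in this PR):

```julia
# Verify that operator and state dimensions are compatible before the
# multiplication kernel runs, instead of failing with an obscure error later.
function check_mul_dims(result_len::Int, op_size::NTuple{2,Int}, state_len::Int)
    op_size[2] == state_len ||
        throw(DimensionMismatch("operator has $(op_size[2]) columns, state has length $state_len"))
    op_size[1] == result_len ||
        throw(DimensionMismatch("operator has $(op_size[1]) rows, result has length $result_len"))
    return nothing
end

check_mul_dims(4, (4, 6), 6)    # passes silently
# check_mul_dims(4, (4, 6), 5)  # would throw DimensionMismatch
```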
Following the suggestion from @amilsted regarding qojulia/QuantumOptics.jl#352, `shape`, `strides_j`, and `strides_k` in the `_gemm_puresparse` function are implemented as `Tuple`s to reduce allocations.
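A minimal sketch of why this helps (illustrative names, not the package's actual kernel): fixed-size `NTuple{N,Int}` values are isbits and stay on the stack, so the index bookkeeping no longer heap-allocates on every call the way a `Vector{Int}` would:

```julia
# Compute a column-major linear index from tuple-valued shape/stride data.
# Because the tuples are isbits, this function performs zero heap allocations.
function linear_index(shape::NTuple{N,Int}, strides::NTuple{N,Int},
                      idx::NTuple{N,Int}) where {N}
    off = 0
    @inbounds for d in 1:N
        1 <= idx[d] <= shape[d] || throw(BoundsError())
        off += (idx[d] - 1) * strides[d]
    end
    return off + 1
end

shape   = (2, 3, 4)
strides = (1, 2, 6)                        # column-major strides for `shape`
linear_index(shape, strides, (2, 1, 3))    # == 14, with no allocations in the index math
```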