More tweaks #375

Draft
kshyatt wants to merge 9 commits into main from ksh/cuda_tweaks

Member

@kshyatt kshyatt commented Feb 18, 2026

Needed to get more MPSKit examples working

Comment thread ext/TensorKitCUDAExt/auxiliary.jl Outdated
Comment thread ext/TensorKitCUDAExt/cutensormap.jl Outdated
Comment thread ext/TensorKitCUDAExt/cutensormap.jl Outdated
Comment thread ext/TensorKitCUDAExt/cutensormap.jl Outdated
Comment thread ext/TensorKitCUDAExt/cutensormap.jl Outdated
Comment thread ext/TensorKitCUDAExt/cutensormap.jl Outdated
Comment thread src/tensors/braidingtensor.jl Outdated
Comment thread src/tensors/treetransformers.jl Outdated

codecov Bot commented Feb 26, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ext/TensorKitCUDAExt/cutensormap.jl | 0.00% | 3 Missing ⚠️ |
| src/tensors/tensor.jl | 0.00% | 2 Missing ⚠️ |
| src/tensors/abstracttensor.jl | 85.71% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| ext/TensorKitCUDAExt/truncation.jl | 96.77% <100.00%> (ø) |
| src/tensors/adjoint.jl | 89.65% <ø> (+16.32%) ⬆️ |
| src/tensors/braidingtensor.jl | 88.51% <100.00%> (+25.24%) ⬆️ |
| src/tensors/diagonal.jl | 89.76% <100.00%> (+50.69%) ⬆️ |
| src/tensors/indexmanipulations.jl | 72.50% <100.00%> (+6.90%) ⬆️ |
| src/tensors/tensoroperations.jl | 96.27% <100.00%> (+5.32%) ⬆️ |
| src/tensors/abstracttensor.jl | 56.34% <85.71%> (+18.60%) ⬆️ |
| src/tensors/tensor.jl | 82.65% <0.00%> (+15.60%) ⬆️ |
| ext/TensorKitCUDAExt/cutensormap.jl | 70.83% <0.00%> (-3.84%) ⬇️ |

... and 43 files with indirect coverage changes


@kshyatt kshyatt marked this pull request as draft February 27, 2026 11:14
Member Author

kshyatt commented Feb 27, 2026

Let's make this a draft too to cut down on CI thrash

@kshyatt kshyatt force-pushed the ksh/cuda_tweaks branch 2 times, most recently from f5857b3 to 32e182d Compare March 12, 2026 12:36
@kshyatt kshyatt force-pushed the ksh/cuda_tweaks branch 2 times, most recently from f5faaf6 to 2359d28 Compare March 23, 2026 14:24
@lkdvos lkdvos mentioned this pull request Mar 26, 2026
Member

@lkdvos lkdvos left a comment

Left some comments throughout. There are some things that I am not entirely convinced by, but the rest looks great, thanks for working through all of this!

For the similarstoragetype(tensor, storagetype) calls that you added, this seems like something we should probably discuss in a separate PR, and it would be great if we could consolidate this one to get the remainder of the fixes in.
Would you be up for splitting these two things, and then getting this merged?

The same kind of holds for some of the comments I made too, if we can just postpone the things that are not obvious, but already get the other parts in, that would probably be helpful.

(Note that I am very much aware that none of this is your fault and this PR has lived for too long so the design shifts a bit, for which I do apologize!)

Comment thread ext/TensorKitCUDAExt/cutensormap.jl Outdated
Comment thread src/tensors/abstracttensor.jl Outdated
Comment thread ext/TensorKitCUDAExt/cutensormap.jl Outdated
Comment thread ext/TensorKitCUDAExt/cutensormap.jl Outdated
Comment thread src/tensors/abstracttensor.jl
Comment thread src/tensors/indexmanipulations.jl Outdated
Comment thread src/tensors/indexmanipulations.jl Outdated
Comment thread src/tensors/indexmanipulations.jl Outdated
Comment thread src/tensors/tensoroperations.jl Outdated
twistB = false
end

TTC = storagetype(C)
Member

I guess this effectively means that we are deciding to promote inputs to the storagetype of the output. I'm not sure if I am fully convinced that we should solve this automatically at all, since I think that is also inconsistent with how regular matrices work (same for adding):

    julia> CUDA.rand(2, 2) * rand(Float32, 2, 2)
    ERROR: Scalar indexing is disallowed.

I do think that this might be the right approach, and requiring explicit conversions in the cases of mixed inputs seems like the right call to me. (Even though I can see how that is annoying for MPSKit 😉 )

Member Author

No, it's for

    TTC = storagetype(C)
    # Bring A in the correct form for BLAS contraction
    if copyA
        Anew = TO.tensoralloc_add(TTC, A, pA, false, Val(true), allocator)
        Anew = TO.tensoradd!(Anew, A, pA, false, One(), Zero(), backend, allocator)
        twistA && twist!(Anew, filter(!isdual ∘ Base.Fix1(space, Anew), domainind(Anew)))
    else
        Anew = permute(A, pA)
    end

Without this change, Anew will always have Vector{scalartype(T)} storage even if A was a BraidingTensor or some other object that only gets instantiated here. With the changes in #393 this won't be necessary. It's more than "annoying", with this change or #393 you have to define new tensoralloc methods for the mixed case, it's quite painful 😭

Member

Reopening this since it seems like you changed this again. I'm really confused about why you are claiming that Anew will have the wrong storagetype: it should have the same storage type as A, not as C, no? It sounds like you are trying to make contractions with mismatching storagetypes in the inputs work, while I would have expected that this is not something we want to support. Am I missing something here, or am I wrong?

Member Author

What seems to be happening is that:

  • TTC = scalartype(C) gives you a Float64 (let's say)
  • Anew = TO.tensoralloc_add(TTC, A, pA, false, Val(true), allocator) then spits out something with storage type Vector{Float64} if A is a SparseBlockMatrix{AbstractTensorMap} even if it should in fact be giving you Anew having storage type CuVector{Float64} -- this is because of the fact that AbstractTensorMap storage type forcibly defaults to Vector, I think

So it's not that I'm trying to mix incompatible storage types, it's that TensorKit as currently set up forces me to do so without this change.
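
A toy illustration of the failure mode described above, in plain Julia. Everything here is hypothetical: `AbstractToyTensor`, `ToyTensor`, `ToyContainer`, `MockDeviceVector` and `toy_storagetype` are stand-ins, not TensorKit, BlockTensorKit or CUDA types, and the blanket fallback only mimics the `Vector` default mentioned above. It shows how a type-level query on an abstractly typed container can report host storage even when every block it holds is device-like.

```julia
abstract type AbstractToyTensor end

# Mock of a non-host array type (assumption: stands in for something like a CuVector)
struct MockDeviceVector{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(v::MockDeviceVector) = size(v.data)
Base.getindex(v::MockDeviceVector, i::Int) = v.data[i]

struct ToyTensor{A<:AbstractVector} <: AbstractToyTensor
    data::A
end

# Container whose element type is abstract, so its own type carries no storage information
struct ToyContainer <: AbstractToyTensor
    blocks::Vector{AbstractToyTensor}
end

# Blanket type-level fallback (hypothetical), plus the precise method for concrete tensors
toy_storagetype(::Type{<:AbstractToyTensor}) = Vector{Float64}
toy_storagetype(::Type{ToyTensor{A}}) where {A} = A
toy_storagetype(t::AbstractToyTensor) = toy_storagetype(typeof(t))

gpu_like = ToyTensor(MockDeviceVector([1.0, 2.0]))
container = ToyContainer([gpu_like])

toy_storagetype(gpu_like)   # MockDeviceVector{Float64}
toy_storagetype(container)  # Vector{Float64}: the fallback answers, not the blocks
```

Any allocation keyed on the second result would land in host memory, which is the mismatch this thread is about.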

Member Author

Does this make sense? I think the JordanMPOTensor changes could obviate this but I think we should link that issue in a comment and keep this moving

Member

That does make sense, but that mostly seems to indicate that we should remove the storagetype(::Type{<:AbstractTensorMap}) default, as that is just incorrect, and instead make storagetype(x::SparseBlockTensorMap{AbstractTensorMap}) do a runtime computation of the storagetype when this isn't possible to deduce from the type itself.

Member Author

Maybe so but does this have to happen in this PR?

Member Author

It raises additional problems if the SparseBlockTensorMap or BlockTensorMap has empty blocks
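
One possible shape for a runtime storage-type computation that also covers the empty-block case, sketched with toy types. `Leaf`, `BlockBag`, `same_storage` and `runtime_storagetype` are hypothetical names, not BlockTensorKit API, and the `default` keyword is just one assumed way to handle a container with no stored blocks.

```julia
struct Leaf{A<:AbstractVector}
    data::A
end

struct BlockBag
    blocks::Vector{Any}   # abstractly typed, so the type alone says nothing about storage
end

# Combine two storage types; this sketch refuses to mix them.
function same_storage(S1::Type, S2::Type)
    S1 === S2 || throw(ArgumentError("mixed storage types: $S1 and $S2"))
    return S1
end

runtime_storagetype(t::Leaf) = typeof(t.data)

# Inspect the stored blocks at runtime; with no blocks there is nothing to inspect,
# so the caller has to supply `default` or gets an error.
function runtime_storagetype(t::BlockBag; default::Union{Nothing, Type} = nothing)
    if isempty(t.blocks)
        default === nothing &&
            throw(ArgumentError("no stored blocks; pass `default` to choose a storage type"))
        return default
    end
    return mapreduce(runtime_storagetype, same_storage, t.blocks)
end

bag = BlockBag(Any[Leaf([1.0, 2.0]), Leaf([3.0])])
runtime_storagetype(bag)                                         # Vector{Float64}
runtime_storagetype(BlockBag(Any[]); default = Vector{Float64})  # caller supplies the answer
```

Whether an empty container should throw, take a default, or carry its storage type in a field is exactly the open design question; the sketch only makes the trade-off concrete.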

Member Author

kshyatt commented Mar 31, 2026

It's completely fine!! This has stayed open as I work through adding more tests for MPSKit, so I think we can pare off the simpler stuff we agree on, and then discuss things that are more contentious.

Contributor

github-actions Bot commented Mar 31, 2026

Your PR no longer requires formatting changes. Thank you for your contribution!

Comment thread src/tensors/braidingtensor.jl
Comment thread src/tensors/indexmanipulations.jl Outdated
Comment thread src/tensors/indexmanipulations.jl Outdated
Comment thread src/tensors/indexmanipulations.jl Outdated
Comment thread src/tensors/abstracttensor.jl Outdated
Comment thread src/tensors/adjoint.jl Outdated
Comment thread src/tensors/adjoint.jl Outdated
Comment thread src/tensors/indexmanipulations.jl
Member

@lkdvos lkdvos left a comment

It seems like some of the rebasing and the github UI has made it hard to spot the comments I left before, although I think many of them are still unresolved and could be discussed :)

Comment thread ext/TensorKitCUDAExt/cutensormap.jl
Member Author

kshyatt commented May 11, 2026

I'm happy to discuss them, you'll note this PR is still in draft state.

Member

lkdvos commented May 11, 2026

My bad, I got my notifications messed up and thought this was a review request 🙃 I will still blame github, but this one might actually also be (partially) on me

Member Author

kshyatt commented May 11, 2026

GitHub UI against the world

Comment thread src/tensors/braidingtensor.jl Outdated
Comment thread src/tensors/tensor.jl
Comment thread test/cuda/tensors.jl Outdated
@kshyatt kshyatt force-pushed the ksh/cuda_tweaks branch from d557629 to 90de428 Compare May 12, 2026 10:04
end

- perm = sortperm(parent(values); strategy.by, strategy.rev)
+ perm = isempty(parent(values)) ? () : sortperm(parent(values); strategy.by, strategy.rev)
Member

I assume sortperm returns some AbstractVector{Int}; I'm not sure whether it is Vector{Int} or CuVector{Int} in this case. Is the problem that empty arrays fail? Should the isempty case return (Cu)Vector{Int}(undef, 0) to avoid a type instability?

Member Author

The problem is deeper within GPUArrays: it tries to launch a kernel of size 0, which is not allowed.
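
A minimal CPU-only sketch of such a guard, written so that both branches return an index vector rather than `()`. `guarded_sortperm` is a hypothetical helper, not something in TensorKit or GPUArrays; it assumes `similar(values, Int, 0)` gives an empty vector of the matching array family, which holds for `Vector` and is untested here for `CuVector`.

```julia
# Hypothetical helper: skip the sort entirely for empty input, so no sorting
# routine (and, on a GPU, no zero-size kernel) is ever invoked.
function guarded_sortperm(values::AbstractVector; by = identity, rev::Bool = false)
    isempty(values) && return similar(values, Int, 0)
    return sortperm(values; by = by, rev = rev)
end

empty_perm = guarded_sortperm(Float64[])               # empty Vector{Int}
perm = guarded_sortperm([3.0, -1.0, 2.0]; by = abs)    # indices ordering entries by magnitude
```

Whether the empty branch needs to match the array type at all depends on how `perm` is used afterwards, which this sketch does not know.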

end

- perm = sortperm(parent(values); by = abs, rev = false)
+ perm = isempty(parent(values)) ? () : sortperm(parent(values); by = abs, rev = false)
Member

Same

Comment thread src/tensors/braidingtensor.jl Outdated
@kshyatt kshyatt force-pushed the ksh/cuda_tweaks branch from 0e18e39 to 668a898 Compare May 13, 2026 07:22