Kernel Storage V2 #50

Merged
merged 22 commits into from
Nov 13, 2024

Conversation

xinyazhang
Collaborator

@xinyazhang xinyazhang commented Oct 22, 2024

Kernel Storage V2 achieves the following goals:

Major changes:

  • Move GPU kernel images to separate files
    • Now organized as aotriton.images/<vendor>-<arch>/<kernel_family>/<kernel_name>/FONLY__<functionals>___<GPU>.aks2
      • <GPU> can be a family of GPUs, e.g., MI300X/MI300A/MI325X
    • This enables per-architecture delivery
    • No more linking errors when binaries are bloated.
  • Introduce the AKS2 file format to compress GPU kernels with LZMA. This reduces the total package size to 200MB (MI200/300/Navi31)
    • AKS2 means "Aotriton Kernel Storage version 2"
    • LZMA was picked over Zstandard for its much better compression ratio (~7% vs ~12%) and acceptable performance (<0.1s)
  • Look up kernel images relative to the .so file (achieved through dladdr)
  • The C++ part can be built on its own by setting the CMake option AOTRITON_NOIMAGE_MODE to ON
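
The directory layout listed above can be sketched as a small path-building helper. This is only an illustration: the function name `aks2_image_path` and all argument values are hypothetical, and it assumes the aotriton.images directory sits directly next to the shared library. The PR itself resolves the library location in C++ through dladdr; here the library directory is simply passed in.

```python
from pathlib import Path


def aks2_image_path(lib_dir, vendor, arch, kernel_family, kernel_name,
                    functionals, gpu):
    """Compose an image path following the layout in the PR description.

    Hypothetical helper for illustration; not part of AOTriton.
    """
    file_name = f"FONLY__{functionals}___{gpu}.aks2"
    return (Path(lib_dir) / "aotriton.images" / f"{vendor}-{arch}"
            / kernel_family / kernel_name / file_name)


# Every value below is a made-up placeholder, not a real AOTriton identifier.
example = aks2_image_path("/opt/aotriton/lib", "amd", "gfx942",
                          "flash", "attn_fwd", "F0", "MI300X")
print(example.as_posix())
```

Because the path depends only on the library directory, a package laid out this way can be moved as a whole without rebuilding.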

Minor changes:

@xinyazhang xinyazhang changed the title [Queued PR] Kernel Storage V2 [Draft] Kernel Storage V2 Oct 22, 2024
@xinyazhang xinyazhang changed the title [Draft] Kernel Storage V2 [Queued PR] Kernel Storage V2 Oct 22, 2024
@xinyazhang xinyazhang force-pushed the xinyazhang/storagev2 branch from 43f5f33 to 1ff178e Compare October 30, 2024 18:47
@xinyazhang xinyazhang force-pushed the xinyazhang/storagev2 branch from 1ff178e to 801842e Compare October 30, 2024 23:13
@xinyazhang xinyazhang force-pushed the xinyazhang/storagev2 branch from 801842e to 93299bd Compare October 30, 2024 23:27
lamikr added a commit to rocmnavi/aotriton that referenced this pull request Nov 4, 2024
- ROCm/aotriton#50
- upstream repository versions used without modifications:
  aotriton:
      https://github.com/ROCm/aotriton.git
      93299bd
      xinyazhang/storagev2
      aotriton submodules
          thirdparty/triton
              https://github.com/ROCm/triton.git
              c69fa304e0ba07944376ff5e7da175f32c276d85
              origin/aotriton/0.8
          thirdparty/pybind11
              https://github.com/pybind/pybind11.git
              tag v2.11.1
              8a099e44b3d5f85b20f05828d919d2332a8de841
          thirdparty/incbin
              https://github.com/graphitemaster/incbin.git
              6e576cae5ab5810f25e2631f2e0b80cbe7dc8cb

    original files removed
    ======================
    rm -rf aotriton/.gitmodules
           aotriton/.git
           aotriton/third_party/incbin/.git
           aotriton/third_party/triton/.git
           aotriton/third_party/pybind11/.git

    commands used to add all of the required files to rocmsdk aotriton
    ==================================================================
    git add -A
    git add -f third_party/pybind11/docs/Makefile
    git add -f third_party/triton/.github/workflows/llvm-build/almalinux.Dockerfile
    git add -f third_party/triton/.github/workflows/llvm-build/centos.Dockerfile
    git add -f third_party/triton/docs/getting-started/tutorials/grouped_vs_row_major_ordering.png
    git add -f third_party/triton/docs/getting-started/tutorials/parallel_reduction.png
    git add -f third_party/triton/docs/getting-started/tutorials/random_bits.png
    git add -f third_party/triton/python/triton/backends/__init__.py
    git add -f third_party/triton/python/triton/backends/compiler.py
    git add -f third_party/triton/python/triton/backends/driver.py

Signed-off-by: Mika Laitio <lamikr@gmail.com>
liblzma is needed instead of libzstd right now.
(Notably the header file, the library itself is shipped with Python)
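
Since the commit note above mentions that liblzma ships with Python, a quick round trip through Python's standard `lzma` module gives a feel for the compression step. This is a sketch only: the payload is synthetic, the helper name `compress_report` is hypothetical, and the output is a plain .xz stream rather than the AKS2 container, whose on-disk layout is not described in this thread.

```python
import lzma


def compress_report(payload: bytes):
    """Compress with LZMA, verify the round trip, and return (size, ratio).

    Hypothetical helper for illustration; not part of AOTriton.
    """
    compressed = lzma.compress(payload, preset=9)
    # Decompress again to confirm the data survives unchanged.
    restored = lzma.decompress(compressed)
    if restored != payload:
        raise ValueError("LZMA round trip changed the payload")
    return len(compressed), len(compressed) / len(payload)


# Synthetic stand-in for a GPU kernel image; real images compress differently.
blob = bytes(range(256)) * 4096
size, ratio = compress_report(blob)
print(f"compressed to {size} bytes ({ratio:.2%} of the original)")
```

The ratio printed here says nothing about real kernel images; the ~7% figure in the PR description comes from the author's own measurements.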
@xinyazhang xinyazhang force-pushed the xinyazhang/storagev2 branch from 93299bd to 0dc5705 Compare November 7, 2024 18:08
@xinyazhang xinyazhang changed the base branch from xinyazhang/gqa to main November 7, 2024 18:08
@xinyazhang xinyazhang marked this pull request as ready for review November 7, 2024 18:10
@xinyazhang xinyazhang changed the title [Queued PR] Kernel Storage V2 Kernel Storage V2 Nov 7, 2024
@jithunnair-amd
Contributor

@ethanwee1 will help test building AOTriton from this PR, with and without AOTRITON_NOIMAGE_MODE=ON

@jithunnair-amd
Contributor

Changes for this PR tested using PR 51: #51 (comment)

@xinyazhang xinyazhang merged commit 0eb03a6 into main Nov 13, 2024
@ethanwee1

ethanwee1 commented Nov 13, 2024

Tested by building the following tarball:
aotriton-e278d4a853170c7a9063cfe847419414cb7b62b6-manylinux_2_28_x86_64-rocm6.2-shared.tar.gz

Commands:

git clone https://github.com/ROCm/aotriton.git
cd aotriton/
git checkout xinyazhang/manylinux_2_28-dockerfile
cd dockerfile/
export AMDGPU_INSTALLER=https://repo.radeon.com/amdgpu-install/6.2.4/el/8.10/amdgpu-install-6.2.60204-1.el8.noarch.rpm
mkdir -p output
cd input
vi install_aotriton.sh   # added -DAOTRITON_NOIMAGE_MODE=ON
cd ..                    # assumed: build.sh is run from dockerfile/

TRITON_LLVM_HASH="b5cc222d"  bash build.sh input tmpfs output e278d4a853170c7a9063cfe847419414cb7b62b6 "MI300X;MI200" 2>&1 | tee buildlog2.log
tar tvf output/*.tar*

Output:
Size: 1.6MB
aotriton-e278d4a853170c7a9063cfe847419414cb7b62b6-manylinux_2_28_x86_64-rocm6.2-shared.tar.gz
buildlogNOIMAGE.log
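
Besides comparing tarball sizes, the `tar tvf` step above can be turned into a direct check for kernel images. The sketch below counts members ending in .aks2, on the assumption that these are the only kernel image files and that a NOIMAGE build ships none of them; neither point is confirmed in this thread. The helper name `count_kernel_images` is hypothetical.

```python
import sys
import tarfile


def count_kernel_images(tarball_path):
    """Return (number of .aks2 members, total number of members).

    Hypothetical helper for illustration; not part of AOTriton.
    """
    # "r:*" lets tarfile detect the compression (gz, xz, ...) automatically.
    with tarfile.open(tarball_path, "r:*") as tar:
        names = tar.getnames()
    images = [name for name in names if name.endswith(".aks2")]
    return len(images), len(names)


if __name__ == "__main__" and len(sys.argv) > 1:
    n_images, n_members = count_kernel_images(sys.argv[1])
    print(f"{n_images} kernel image(s) among {n_members} archive member(s)")
```

Saved to a file of your choosing, the script takes the tarball under output/ as its only argument.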

@jithunnair-amd
Contributor

Output: Size: 107MB aotriton-e278d4a853170c7a9063cfe847419414cb7b62b6-manylinux_2_28_x86_64-rocm6.2-shared_WITHAOTRITON_NOIMAGE_MODE_ON.txt buildlog2.log

@ethanwee1 This doesn't look right. The tarball size and txt file indicate that kernel images are still being built. Either your experiment was faulty or the AOTRITON_NOIMAGE_MODE logic is faulty. @xinyazhang, thoughts?

@xinyazhang
Collaborator Author

xinyazhang commented Nov 14, 2024

I think you need to download the log for review, since it's rather large and is likely to get truncated when viewed in a browser. The trailing -- Install configuration: "Release" lines indicate that the project was installed to the target directory.

The size is right. I only specified two architectures and 100MB is about the right size for 2 arches under Kernel Storage V2.

(However, non-square causal masks added more kernels, so two arches came to ~130MB when 0.8b was released.)

@xinyazhang
Collaborator Author

AOTRITON_NOIMAGE_MODE will be tested in other dockerfile-related PRs

@jithunnair-amd
Contributor

Just to resolve the confusion here, I was referring to Ethan's initial numbers for the AOTRITON_NOIMAGE_MODE=ON build. He corrected his build steps afterwards and updated his comment to show that the new size is only 1.6MB, which is what I expected. So, in summary, it looks like AOTRITON_NOIMAGE_MODE=ON is working as expected.
