-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add docker based package builder and switch to system compiler #51
Conversation
python -m pip install cmake ninja wheel | ||
RUN mkdir /root/build | ||
|
||
ARG amdgpu_installer="https://repo.radeon.com/amdgpu-install/6.2.2/el/8.10/amdgpu-install-6.2.60202-1.el8.noarch.rpm" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a maintenance headache, we need to make this more generic for this to be generally usable for any ROCm release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed this offline, this is the default value. The intention is for any CI job or users to provide a link to the desired amdgpu installer for their desired ROCm version.
We could use logic similar to this file:
https://raw.githubusercontent.com/wiki/ROCmSoftwarePlatform/pytorch/files/install_kdb_files_for_pytorch_wheels.sh
to keep the default argument always up-to-date.
cd /root/build | ||
scl enable gcc-toolset-13 "cd aotriton/third_party/triton/python; python setup.py develop --user" | ||
scl enable gcc-toolset-13 "mkdir -p aotriton/build; cd aotriton/build; cmake .. -DCMAKE_PREFIX_PATH=/opt/rocm -DPYTHON_EXECUTABLE=/usr/bin/python3.11 -DCMAKE_INSTALL_PREFIX=installed_dir/aotriton -DCMAKE_BUILD_TYPE=Release -DAOTRITON_GPU_BUILD_TIMEOUT=0 \"-DTARGET_GPUS=${AOTRITON_TARGET_GPUS}\" -DAOTRITON_NO_PYTHON=ON -G Ninja && ninja install" | ||
rocmver=$(scl enable gcc-toolset-13 "cpp -I/opt/rocm/include /input/print_rocm_version.h"|tail -n 1|sed 's/ //g') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@naromero77amd This is another interesting and even more compact way of getting the ROCM_VERSION* defines. Something to consider if we want to make our LoadHIP.cmake logic even more compact.
if(AOTRITON_BUILD_FOR_TUNING) | ||
target_compile_definitions(aotriton_v2 PRIVATE -DAOTRITON_BUILD_FOR_TUNING=1) | ||
else(AOTRITON_BUILD_FOR_TUNING) | ||
target_compile_definitions(aotriton_v2 PRIVATE -DAOTRITON_BUILD_FOR_TUNING=0) | ||
endif(AOTRITON_BUILD_FOR_TUNING) | ||
target_link_libraries(aotriton_v2 lzma_interface) | ||
target_link_libraries(aotriton_v2 PRIVATE lzma_interface) | ||
target_link_libraries(aotriton_v2 PUBLIC hip::host hip::amdhip64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to introduce an explicit dependency on the HIP targets because we are not using the hipcc compiler as the default compiler anymore: https://github.com/ROCm/aotriton/pull/51/files#diff-1e7de1ae2d059d21e1dd75d5812d5a34b0222cef273b7c3a2af62eb747f9d20aL35
@@ -31,9 +33,6 @@ if(NOT AOTRITON_NO_PYTHON) | |||
add_subdirectory(third_party/pybind11) | |||
endif() | |||
|
|||
# Must be after pybind11 | |||
set(CMAKE_CXX_COMPILER hipcc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not need to use hipcc
since we are not compiling any kernels, so a host compiler should be good enough.
529395c
to
2147593
Compare
@@ -168,13 +168,14 @@ target_include_directories(aotriton_v2 PUBLIC ${CMAKE_CURRENT_LIST_DIR}/../inclu | |||
target_include_directories(aotriton_v2 PUBLIC ${CMAKE_BINARY_DIR}/include) # for <build>/include/aotriton/config.h. Code should use <aotriton/config.h> | |||
target_include_directories(aotriton_v2 PRIVATE ${CMAKE_CURRENT_BINARY_DIR}) | |||
target_include_directories(aotriton_v2 PRIVATE ${CMAKE_CURRENT_LIST_DIR}/../third_party/incbin) | |||
target_compile_options(aotriton_v2 PRIVATE -fPIC --no-offload-arch=all) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed since we're not using hipcc anymore
.cu_seqlens_q = &cu_seqlens_q, | ||
.cu_seqlens_k = &cu_seqlens_k, | ||
.num_seqlens = num_seqlens, | ||
.max_seqlen_q = max_seqlen_q, | ||
.max_seqlen_k = max_seqlen_k, | ||
.head_dim = head_size, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Required because we're switching to gcc13, which is more strict about the order of struct fields
@@ -8,7 +8,7 @@ namespace py = pybind11; | |||
namespace pyaotriton { | |||
#define EV(name) value(#name, name) | |||
void def_hipruntime(py::module_& m) { | |||
py::enum_<hipError_t>(m, "hipError_t") | |||
py::enum_<hipError_t>(m, "hipError_t", py::module_local()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is not due to the compiler switch or any other changes in this PR, then it would be better to separate this into its own PR with reasoning.
Dockerfile tested locallly and can build 0.7.1b |
@ethanwee1 Can you please validate this PR by trying to build aotriton 0.7b and 0.7.1b using the Dockerfile in this PR? Please make sure that it generates a tarball with the following files:
|
43f5f33
to
1ff178e
Compare
2147593
to
7ca74d9
Compare
1ff178e
to
801842e
Compare
7ca74d9
to
63fe1cc
Compare
801842e
to
93299bd
Compare
63fe1cc
to
ca18036
Compare
Validated 0.7.1b and 0.7b after following build.sh
|
93299bd
to
0dc5705
Compare
ca18036
to
4357504
Compare
@xinyazhang Taking this out of Draft mode as it is ready to merge (even if queued) based on Ethan's validation |
@jithunnair-amd nope, queued PR should not be taken out of draft because it's based on previous work (for this PR specifically its #50) |
3d075ee
to
e278d4a
Compare
Tested with these commands to build Commands:
Output: |
Use AlmaLinux 8 as base image.
However due to their more limit support of designated initializer the order of fields have to be adjusted.
This is more general since -b only supports tags and branches.
e278d4a
to
8338d4c
Compare
Some notable lines in the log:
|
13 MiB is also significant considering it's functionality. I believe most of the size comes from the generated dispatching code. |
Major Changes:
dockerfile/build.sh
script that builds AOTriton from offical AlmaLinux 8 docker image.hipcc
is not compatible with SCL.scl enable gcc-toolset-13 "/opt/rocm/bin/hipcc -v"
is supposed to show "Found candidate GCC installation" but nothing is displayed, and eventually this triggers Missing cmath/libcxx in ROCm 5.3.0 ROCm#1843 during compiling.Minor Changes
pyaotriton
, registerhipError_t
locally to avoid "hipError_t
is already registered" bug.