
'named symbol not found' error at runtime in cuda_piKernelCreate if static library with kernel is linked #4307

Closed
ivorobts opened this issue Aug 10, 2021 · 13 comments · Fixed by #4616
Labels: bug (Something isn't working), compiler (Compiler related issue), cuda (CUDA back-end)

Comments


ivorobts commented Aug 10, 2021

A simple example showing the use of a static library containing a kernel, which fails at runtime with the CUDA BE.
Here is the content of the file sycl_lib.cpp:

#include<CL/sycl.hpp>
using namespace sycl;
void func() {
  queue q;
  q.submit([&](sycl::handler &h) {
      sycl::stream os(1024, 768, h);
      h.parallel_for(32, [=](sycl::id<1> i) {
          os<<i<<"\n";
        });
    });
}

and a simple main.cpp:

void func();
int main() {
  func();
  return 0;
}

Compilation of sycl_lib.cpp, static library creation, and linking all work fine:

clang++  -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -fsycl-unnamed-lambda -c sycl_lib.cpp
ar rvs sycl_lib.a sycl_lib.o
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice main.cpp -o test-lib sycl_lib.a
SYCL_BE=PI_CUDA ./test-lib

The application crashes at runtime with the following error:

PI CUDA ERROR:
        Value:           500
        Name:            CUDA_ERROR_NOT_FOUND
        Description:     named symbol not found
        Function:        cuda_piKernelCreate
        Source Location: /llvm/sycl/plugins/cuda/pi_cuda.cpp:2380

terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -999 (Unknown OpenCL error code) -999 (Unknown OpenCL error code)

Note that ahead-of-time compilation and linking work fine for CPU or Intel GPU devices. This looks like a limitation of the CUDA BE.

ivorobts added the bug label on Aug 10, 2021

Michoumichmich commented Aug 10, 2021

Hello, I also get the same issue on the CUDA BE, but this is not a limitation of the CUDA back-end. Could you try https://github.com/Michoumichmich/llvm/commit/e3eb050b73bff24cb8335f84fa6a51f1ebaf05c5.diff? This is merely a temporary workaround for a bug that was introduced by f7ce532 in a pulldown.

rodburns added the cuda label on Aug 11, 2021
ravil-mobile (Contributor) commented:

Thanks a lot. It saved my day.


Michoumichmich commented Aug 12, 2021

Thanks a lot. It saved my day.

You're welcome!

danchitnis commented:

It seems to happen on some versions but not others. It happens both on WSL and native Linux; I assumed it was related to a GPU driver mismatch with CUDA. Perhaps you can make the error info more useful with debugging traces?


Michoumichmich commented Aug 20, 2021

It seems to happen on some versions but not others. It happens both on WSL and native Linux; I assumed it was related to a GPU driver mismatch with CUDA. Perhaps you can make the error info more useful with debugging traces?

The error comes from the fact that the names of the generated device object files were changed, but clang-offload-bundler is still called with the old names, so it fails silently. If you compile with -v you won't even see ptxas compiling the kernels during the final linking of the .so object or executable. It might seem to work if you did not clean your build directory, as you would then be using old files.
If your bug depends on the version of CUDA, it might not be related to this exact issue.
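
As a quick sanity check, one can grep the verbose driver output for ptxas during the final link. This is only a sketch reusing the file names from the original report; the exact output will vary between compiler versions:

clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -v main.cpp sycl_lib.a -o test-lib 2>&1 | grep ptxas

If nothing is printed, the device code was not compiled as part of the link.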

danchitnis commented:

So far it is working for me, but I am not using the latest commit (it is about a month old). I am aware that there is a recent bug with NVIDIA containers affecting WSL (and native Linux), so these two bugs may not be related.


Michoumichmich commented Sep 3, 2021

Hello @ivorobts, can you try compiling with nvptx64-nvidia-cuda--sm_50 on the current branch to see if it's fixed?
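
For reference, here is how that triple would slot into the commands from the original report; this is untested and simply substitutes the suggested triple string into the existing -fsycl-targets flag:

clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda--sm_50 -fsycl-unnamed-lambda -c sycl_lib.cpp
ar rvs sycl_lib.a sycl_lib.o
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda--sm_50 main.cpp -o test-lib sycl_lib.a
SYCL_BE=PI_CUDA ./test-lib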


npmiller commented Sep 8, 2021

This was fixed recently by this PR:

If you want to try it, note that your SYCL code is missing a wait, which might cause issues at the moment; something like this works fine though:

#include<CL/sycl.hpp>
using namespace sycl;
void func() {
  queue q;
  q.submit([&](sycl::handler &h) {
      sycl::stream os(1024, 768, h);
      h.parallel_for(32, [=](sycl::id<1> i) {
          os<<i<<"\n";
        });
    }).wait_and_throw();
}

I'm not entirely sure whether it should work without the wait, but either way that would be a separate issue to investigate; the original problem reported here should be fixed.


npmiller commented Sep 8, 2021

Just as a side note, I double-checked with the SYCL 2020 specification and the wait is indeed required in this case; the queue destructor does not implicitly wait on kernel completion. Section 3.9.8.1:

Note that the destructors of other SYCL objects (sycl::queue, sycl::context, ...) do not block. Only a sycl::buffer, sycl::sampled_image or sycl::unsampled_image destructor might block. The rationale is that an object without any side effect on the host does not need to block on destruction as it would impact the performance. So it is up to the programmer to use a member function to wait for completion in some cases if this does not fit the goal.
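
In other words, the wait has to be explicit. As an illustration only, here is a minimal variation of the snippet above that waits on the queue itself instead of on the event returned by submit (either form should be fine):

#include <CL/sycl.hpp>
using namespace sycl;
void func() {
  queue q;
  q.submit([&](sycl::handler &h) {
      sycl::stream os(1024, 768, h);
      h.parallel_for(32, [=](sycl::id<1> i) {
          os << i << "\n";
        });
    });
  // Wait explicitly before the queue goes out of scope; per the
  // specification quoted above, the queue destructor will not block.
  q.wait_and_throw();
}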

AerialMantis (Contributor) commented:

We believe this issue to be resolved now, so we are closing the ticket. @ivorobts, if you continue to see any issue here please feel free to comment and we can re-open the ticket.

Luigi-Crisci commented:

Hi,
I'm having the same issue with the latest build from the sycl branch when I try to compile the same code provided by @ivorobts. This happens when targeting nvptx64_nvidia_cuda.

This was fixed recently by this PR:

If you want to try it, note that your SYCL code is missing a wait, which might cause issues at the moment; something like this works fine though:

#include<CL/sycl.hpp>
using namespace sycl;
void func() {
  queue q;
  q.submit([&](sycl::handler &h) {
      sycl::stream os(1024, 768, h);
      h.parallel_for(32, [=](sycl::id<1> i) {
          os<<i<<"\n";
        });
    }).wait_and_throw();
}

I'm not entirely sure whether it should work without the wait, but either way that would be a separate issue to investigate; the original problem reported here should be fixed.

I see that it should have been fixed; maybe the latest build broke it again?


bader commented Sep 21, 2021

Re-opening the issue to check again.

bader reopened this on Sep 21, 2021
npmiller (Contributor) commented:

I had a look and it is indeed failing again with the most recent build. I was able to track it down to the following commit:

I'll look into fixing this again.

npmiller added a commit to npmiller/llvm that referenced this issue Sep 22, 2021
The patch 9838076 changed triple processing so the GPU arch could be deduced even without the extra `-`, so we no longer need to add padding. It also seems like SYCL was inadvertently removed from the branch adding the bound arch to the triple.

So this patch fixes adding the bound arch to the triple when using SYCL and removes the leftover triple padding code in the offload deps command.

The test was also updated accordingly, and it now also checks the triple used for `clang-offload-deps` so that we can hopefully catch mismatches between the two earlier in the future.

This fixes intel#4307
romanovvlad pushed a commit that referenced this issue Sep 23, 2021