Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][UR][Graph] Require OpenCL simultaneous use #17658

Merged
merged 3 commits into from
Apr 3, 2025

Conversation

EwanC
Copy link
Contributor

@EwanC EwanC commented Mar 26, 2025

To support the SYCL-Graph extension on an OpenCL backend, we currently only require the presence of the cl_khr_command_buffer extension. This PR introduces an extra requirement on the CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR capability being present.

This is based on the graph execution wording on the definition of handler::ext_oneapi_graph() that:

Only one instance of graph will execute at any time. If graph is submitted multiple times, dependencies are automatically added by the runtime to prevent concurrent executions of an identical graph.

Such usage results in multiple calls by the SYCL runtime to urEnqueueCommandBufferExp with the same UR command-buffer and event dependencies to prevent concurrent execution. Without support for simultaneous-use the OpenCL adapter code cannot guarantee that the first command-buffer submission has finished execution before it makes following clEnqueueCommandBufferKHR calls with the cl_event decencies. If the first submission is still executing, then an error will be reported.

Workarounds like adding blocking host waits to the OpenCL UR adapter are possible, but requiring simultaneous use reflects the vendor requirements as they are for the currently implementation. I've tried to document this all in the UR spec and SYCL-Graph design docs, which also includes a couple of cleanups I found along the way.

Note that the new CTS test fails for Level-Zero adapter, which I've created #17734 to resolve.

@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from 5a7ffa6 to 858ff7e Compare March 26, 2025 14:49
@EwanC EwanC temporarily deployed to WindowsCILock March 26, 2025 14:50 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 26, 2025 15:44 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 26, 2025 15:44 — with GitHub Actions Inactive
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from 858ff7e to f9b1937 Compare March 28, 2025 13:03
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 13:03 — with GitHub Actions Inactive
@EwanC EwanC marked this pull request as ready for review March 28, 2025 13:04
@EwanC EwanC requested review from a team as code owners March 28, 2025 13:04
@EwanC EwanC requested a review from konradkusiak97 March 28, 2025 13:04
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 13:23 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 13:23 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 15:14 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 16:50 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 16:50 — with GitHub Actions Inactive
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from c65615f to 6fb031f Compare March 28, 2025 21:11
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 21:11 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 21:26 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 21:26 — with GitHub Actions Inactive
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from 6fb031f to d610ee8 Compare March 31, 2025 07:57
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 08:42 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 09:17 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 09:17 — with GitHub Actions Inactive
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from d610ee8 to 19a961e Compare March 31, 2025 14:12
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 14:12 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 14:26 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 14:26 — with GitHub Actions Inactive
Copy link
Contributor

@konradkusiak97 konradkusiak97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just small nit.

EwanC and others added 3 commits April 1, 2025 17:27
To support the SYCL-Graph extension on an OpenCL backend, we currently
only require the presence of the `cl_khr_command_buffer` extension. This
PR introduces an extra requirement on the
[CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR) capability being present.

This is based on the [graph execution
wording](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc#765-new-handler-member-functions)
on the definition of `handler::ext_oneapi_graph()` that:

>  Only one instance of graph will execute at any time. If graph is submitted multiple times, dependencies are automatically added by the runtime to prevent concurrent executions of an identical graph.

Such usage results in multiple calls by the SYCL runtime to
`urEnqueueCommandBufferExp` with the same UR command-buffer and event
dependencies to prevent concurrent execution. Without support for
simultaneous-use the OpenCL adapter code cannot guarantee that the
first command-buffer submission has finished execution before it
makes following `clEnqueueCommandBufferKHR` calls with the `cl_event`
decencies. If the first submission is still executing, then an error
will be reported.

Workarounds like adding blocking host waits to the OpenCL UR adapter
are possible, but requiring simultaneous use reflects the vendor
requirements as they are for the currently implementation.

I've tried to document this all in the UR spec and SYCL-Graph design
docs, which also includes a couple of cleanups I found along the way.
Taken from intel#17709

Co-authored-by:  Mikołaj Komar <mikolaj.komar@intel.com>
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from f879d7b to 7fab3b7 Compare April 1, 2025 16:27
@EwanC EwanC temporarily deployed to WindowsCILock April 1, 2025 16:41 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock April 1, 2025 17:22 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock April 1, 2025 17:22 — with GitHub Actions Inactive
Copy link
Contributor

@Bensuo Bensuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@EwanC
Copy link
Contributor Author

EwanC commented Apr 3, 2025

@intel/llvm-gatekeepers This is good to merge thanks

@kbenzie kbenzie merged commit 7e7dffc into intel:sycl Apr 3, 2025
95 of 108 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants