Remove Thrust #809

elalish · 2024-05-11T05:56:48Z

Thrust is now deprecated, and we've been wanting to move off it for a while anyway since we're no longer using CUDA. Thrust is turning into CCCL and getting more integrated with CUDA, so we don't need that. Thrust gave birth to PSTL, which is pretty widely supported now (C++17). PSTL appears to be backed by TBB and/or OpenMP in most compilers' standard libraries.

I think the big question is: do we switch to PSTL or TBB? What's your opinion, @pca006132? Related: #520

My impression is PSTL might be easier to switch to since the API shape is close to Thrust.

On the other hand, TBB is lower-level and so may have more performance, and we already have a little TBB code.

@fire @kintel thoughts on what would be easiest to consume as far as dependencies from a downstream perspective?

fire · 2024-05-11T06:10:39Z

Background notes

Building TBB as a static library is not recommended and is only supported because Intel has a "bigiron" business requirement. https://github.com/jckarter/tbb/blob/master/build/big_iron.inc

Godot Engine doesn't use OpenMP because that requires an "MSVC redistributable". https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170

Edited:

As far as I know, OpenMP degrades nicely, but it's also different from the C++17 parallel algorithms https://stackoverflow.com/questions/67848884/c-compiler-support-for-stdexecution-parallel-stl-algorithms

pca006132 · 2024-05-11T11:02:36Z

I think PSTL is easier to switch to. It lacks some special APIs, but we can probably implement our own. The main issue here is compiler support, e.g. we need GCC 13 or libc++ to properly use it with onetbb.
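To make "implement our own" concrete, here is a minimal sketch (not code from Manifold) of how a few Thrust-only helpers could be expressed with standard algorithms. The `sketch` namespace and the vector-based signatures are hypothetical and chosen only for illustration; the real wrappers work on iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

namespace sketch {  // hypothetical namespace, for illustration only

// sequence: fill [first, last) with 0, 1, 2, ...
// std::iota has no execution-policy overload, so this one stays serial.
template <typename It>
void sequence(It first, It last) {
  using T = typename std::iterator_traits<It>::value_type;
  std::iota(first, last, T{});
}

// gather: out[i] = in[map[i]]
// Where a working PSTL exists, std::execution::par could be passed as the
// first argument of std::transform.
template <typename T>
std::vector<T> gather(const std::vector<std::size_t>& map,
                      const std::vector<T>& in) {
  std::vector<T> out(map.size());
  std::transform(map.begin(), map.end(), out.begin(), [&in](std::size_t idx) {
    assert(idx < in.size());
    return in[idx];
  });
  return out;
}

// scatter: out[map[i]] = in[i]
// Only safe to parallelize if map contains no duplicate targets.
template <typename T>
void scatter(const std::vector<T>& in, const std::vector<std::size_t>& map,
             std::vector<T>& out) {
  assert(in.size() == map.size());
  std::vector<std::size_t> idx(in.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  std::for_each(idx.begin(), idx.end(), [&](std::size_t i) {
    assert(map[i] < out.size());
    out[map[i]] = in[i];
  });
}

}  // namespace sketch
```

For example, `sketch::gather({3, 0, 2}, {10, 20, 30, 40})` picks the elements at indices 3, 0 and 2 in that order.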

Using TBB directly will require a lot of work. Some algorithms are not that easy to implement efficiently.

OpenMP is probably not an option. We tried that before and the performance was not that good, at least for the Thrust implementation.

kintel · 2024-05-11T12:21:27Z

I'm thoroughly confused about PSTL and TBB, so I cannot really comment here, but if PSTL is part of C++17 that will get my vote. We already package TBB, so that should be easy to keep supporting. But searching around gives me the feeling that PSTL and TBB are not particularly compatible? https://community.intel.com/t5/Intel-oneAPI-Threading-Building/Is-PSTL-still-supported-by-TBB/m-p/1487798

pca006132 · 2024-05-11T12:29:38Z

@kintel They are compatible, but it depends on the versions... uxlfoundation/oneTBB#332

Basically:

1. The old version (before onetbb) seems to be compatible with every version of PSTL. But this is no longer maintained, and I think distros are moving towards onetbb?
2. When libstdc++ is used, onetbb is compatible with the libstdc++ in GCC 13+. Note that even when you compile with clang, by default it is linking against libstdc++.
3. When libc++ is used (e.g. on Mac or on Linux using clang with some additional parameters), it is fine with onetbb.
4. For windows, I haven't checked.

Note that for 3, I only tested a relatively new LLVM version, so I'm not sure which version is the oldest supported one. It probably requires a relatively new version (https://reviews.llvm.org/D141779). It also seems that PSTL support in libc++ is quite incomplete (https://libcxx.llvm.org/Status/PSTL.html), but that may be about PSTL support with other backends?
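As a rough illustration of the rules above, the standard library in use can be detected at compile time from the version macros that libstdc++ and libc++ define. This is only a sketch: the function names are hypothetical, the rule encoded is just what this comment states (not something verified independently), and Windows is treated as unknown because it was not checked.

```cpp
#include <cassert>
#include <string>  // any standard header also defines the library's version macros

// Which standard library is this translation unit compiled against?
// Note that clang on Linux usually reports libstdc++ here.
inline std::string stdlib_name() {
#if defined(_LIBCPP_VERSION)
  return "libc++";
#elif defined(__GLIBCXX__)
  return "libstdc++";
#else
  return "unknown";
#endif
}

// Encodes the rule from the comment above: libc++ is assumed fine with
// oneTBB, libstdc++ needs the GCC 13+ release, anything else is unknown.
inline bool onetbb_rule_satisfied() {
#if defined(_LIBCPP_VERSION)
  return true;
#elif defined(_GLIBCXX_RELEASE)
  return _GLIBCXX_RELEASE >= 13;
#else
  return false;
#endif
}

// Human-readable summary, e.g. for a configure-time log line.
inline std::string pstl_report() {
  const std::string lib = stdlib_name();
  assert(!lib.empty());
  return lib + (onetbb_rule_satisfied() ? ": rule satisfied"
                                        : ": rule not satisfied or unknown");
}
```

A check like this could select the parallel or the serial code path at build time instead of requiring a specific compiler.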

kintel · 2024-05-11T15:36:29Z

This all sounds like a bit of a nightmare if targeting Linux distro packaging though, but perhaps that shouldn't be driving design decisions too much..

fire · 2024-05-11T17:33:30Z

Here's a decision table for compiler support godotengine/godot#91833

t-paul · 2024-05-11T17:39:39Z

gcc-13
Expanding a bit on the Linux topic, requiring gcc-13 will not be much of an issue for official distro packaging as that essentially only goes forward. Classic distros hardly ever backport packages to already released distro versions, and rolling distros are also moving along with recent versions of applications and tools.

It becomes a huge issue for providing recent application versions, a.k.a. dev builds though:

AppImages are by design built on older distros. OpenSCAD currently uses Ubuntu 20.04, but even upgrading to 22.04 would only bring gcc-11 as the default compiler.
People trying to build applications themselves on their not so recent installations. Unfortunately it's pretty common for people to be 2 or more LTS releases behind, which amounts to about 5 years or so.

Long story short, if gcc-13 will be a requirement, that will kill almost all OpenSCAD dev builds we currently provide for older distributions and will make AppImages impossible for a couple of years.

c++17
Things are a bit more relaxed on that; even Ubuntu 20.04 has some support for C++17 features. It would be nice to delay moving to the 22.04 level a bit more, but that's not a showstopper in my opinion.

pca006132 · 2024-05-12T00:08:39Z

I'm curious what other libraries use. Surely we are not the first ones to hit this compatibility issue?

And yeah, I don't think we want to make gcc-13 a requirement. I don't think we can rely solely on PSTL for now.

fire · 2024-05-12T00:14:20Z

@pca006132 do you have a listing of all the thrust apis we use? It'll help us select another option.

pca006132 · 2024-05-12T00:15:38Z

https://github.com/elalish/manifold/blob/master/src/utilities/include/par.h#L163-L193

pca006132 · 2024-05-12T00:20:12Z

I think the slightly trickier ones to implement are things like copy_if and remove_if, which require the final result to have the same ordering as the input.
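One common way to keep the ordering is a count, scan, write scheme over chunks. The sketch below is an assumption about how this could be done, not the approach Manifold took; `chunked_copy_if` is a hypothetical name, and the per-chunk loops run serially here so the example stays self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Order-preserving copy_if in three passes. The chunk loops in passes 1 and 3
// touch disjoint data per chunk, so those are the loops a parallel for_each
// over chunks would run concurrently.
template <typename T, typename Pred>
std::vector<T> chunked_copy_if(const std::vector<T>& in, Pred pred,
                               std::size_t num_chunks) {
  assert(num_chunks > 0);
  const std::size_t n = in.size();
  const std::size_t chunk = (n + num_chunks - 1) / num_chunks;  // ceil
  auto at = [&in](std::size_t i) {
    return in.begin() + static_cast<std::ptrdiff_t>(i);
  };

  // Pass 1: count the matches inside each chunk.
  std::vector<std::size_t> offset(num_chunks + 1, 0);
  for (std::size_t c = 0; c < num_chunks; ++c) {
    const std::size_t lo = std::min(n, c * chunk);
    const std::size_t hi = std::min(n, lo + chunk);
    offset[c + 1] =
        static_cast<std::size_t>(std::count_if(at(lo), at(hi), pred));
  }

  // Pass 2: a prefix sum turns the counts into each chunk's output start.
  std::partial_sum(offset.begin(), offset.end(), offset.begin());

  // Pass 3: each chunk writes its matches into its own output range, so the
  // concatenated result keeps the input order.
  std::vector<T> out(offset[num_chunks]);
  for (std::size_t c = 0; c < num_chunks; ++c) {
    const std::size_t lo = std::min(n, c * chunk);
    const std::size_t hi = std::min(n, lo + chunk);
    std::copy_if(at(lo), at(hi),
                 out.begin() + static_cast<std::ptrdiff_t>(offset[c]), pred);
  }
  return out;
}
```

The cost is that the predicate is evaluated twice per element, which is the usual trade-off for getting a stable result without locking.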

elalish · 2024-05-12T02:32:01Z

Regarding compatibility - my impression is PSTL and TBB are related a little like Thrust and CUB. They have slightly different APIs and TBB and CUB are slightly lower-level. But mostly: PSTL/Thrust is really just APIs, while TBB and CUB have actual parallel algorithm implementations. So I think TBB using PSTL was probably a bootstrap to get some OpenMP support before TBB was finished or something. Nowadays it seems we're in a PSTL calls TBB (or OpenMP) under the hood kind of situation, which is much how we currently use Thrust.

fire · 2024-05-12T02:35:59Z

Gathered by chatgpt4 from par.h

THRUST_DYNAMIC_BACKEND(copy_if, void)
THRUST_DYNAMIC_BACKEND_VOID(exclusive_scan)
THRUST_DYNAMIC_BACKEND_VOID(for_each)
THRUST_DYNAMIC_BACKEND_VOID(for_each_n)
THRUST_DYNAMIC_BACKEND(gather_if, void)
THRUST_DYNAMIC_BACKEND_VOID(gather)
THRUST_DYNAMIC_BACKEND(reduce_by_key, void)
THRUST_DYNAMIC_BACKEND_VOID(scatter)
THRUST_DYNAMIC_BACKEND_VOID(sequence)
THRUST_DYNAMIC_BACKEND(transform_reduce, void)

STL_DYNAMIC_BACKEND(all_of, bool)
STL_DYNAMIC_BACKEND(count_if, int)
STL_DYNAMIC_BACKEND_VOID(copy)
STL_DYNAMIC_BACKEND_VOID(copy_n)
STL_DYNAMIC_BACKEND(find_if, void)
STL_DYNAMIC_BACKEND(find, void)
STL_DYNAMIC_BACKEND(fill, void)
STL_DYNAMIC_BACKEND(inclusive_scan, void)
STL_DYNAMIC_BACKEND(is_sorted, bool)
STL_DYNAMIC_BACKEND(remove_if, void)
STL_DYNAMIC_BACKEND(remove, void)
STL_DYNAMIC_BACKEND(reduce, void)
STL_DYNAMIC_BACKEND_VOID(stable_sort)
STL_DYNAMIC_BACKEND_VOID(transform)
STL_DYNAMIC_BACKEND_VOID(uninitialized_copy)
STL_DYNAMIC_BACKEND_VOID(uninitialized_fill)

Feel free to edit my list.

elalish · 2024-05-12T02:46:04Z

I think regarding old compilers that have bugs or lack support for certain PSTL algorithms, we should just let those fall back to single-threaded. Then it should work everywhere, but it'll be fastest on the latest platforms. That feels like a reasonable compromise regarding maintainability and compatibility. I don't think we can afford to optimize performance heavily for every old platform.
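A minimal sketch of such a fallback, assuming a hypothetical build flag `SKETCH_USE_PSTL` that the build system would define only on toolchains known to work. The names `transform_auto` and `axpy` are illustrative and not part of Manifold.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>
#if defined(SKETCH_USE_PSTL)  // hypothetical flag, not an existing option
#include <execution>
#endif

// Forwards to std::transform, adding a parallel execution policy only when
// the build opted in. Without the flag this is plain single-threaded STL.
template <typename... Args>
auto transform_auto(Args&&... args) {
#if defined(SKETCH_USE_PSTL)
  return std::transform(std::execution::par, std::forward<Args>(args)...);
#else
  return std::transform(std::forward<Args>(args)...);
#endif
}

// Example caller: y[i] = a * x[i] + y[i]. The call site is identical in
// both configurations.
inline void axpy(double a, const std::vector<double>& x,
                 std::vector<double>& y) {
  assert(x.size() == y.size());
  transform_auto(x.begin(), x.end(), y.begin(), y.begin(),
                 [a](double xi, double yi) { return a * xi + yi; });
}
```

Because the decision lives in one wrapper, callers never mention the execution policy, and an old toolchain just builds the serial branch.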

elalish · 2024-05-12T03:22:46Z

Besides, I feel like on average we only get ~2x speedup for parallel over single-threaded anyway. CPU pipelining is pretty good when your algorithms are parallelized!

pca006132 · 2024-05-12T04:00:11Z

I think there can be 4x speedup, and probably more if we can optimize mesh simplification better.

The major issue with old vs new platform is that people like to have a single binary, e.g. appimage for openscad, and that means they need to use the single threaded version for several years.

pca006132 · 2024-05-12T05:23:39Z

@elalish btw, by "So I think TBB using PSTL was probably a bootstrap to get some OpenMP support before TBB was finished or something." do you mean "So I think PSTL using TBB was probably a bootstrap to get some OpenMP support before PSTL was finished or something"? TBB does not depend on PSTL.
Also, I don't feel that PSTL wants to get rid of tbb later.

elalish · 2024-05-12T05:29:08Z

Maybe I misunderstood what you were saying earlier. Either way it's confusing enough we should probably chat about it sometime face-to-face.

pca006132 · 2024-05-24T15:28:03Z

For the record, our current goal is to get rid of thrust and use PSTL for parallelization. Users with GCC 12 or older will hit #787. They can either disable multicore (it is slower, but typically not that slow) or accept the leak. Considering that other users, e.g. openscad, have not reported this leak causing an issue, this should be acceptable.

And if needed, we can have an intermediate option, where we use tbb for_each directly but no PSTL algorithms. This will be slower than using all the parallel APIs, but the user can still get some multicore performance improvement without having to live with the memory leak.
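A sketch of what that intermediate option might look like, using `tbb::parallel_for` over a `tbb::blocked_range` behind a hypothetical `SKETCH_USE_TBB` flag. The helper names are illustrative only, and the serial branch is what gets built when the flag is absent.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(SKETCH_USE_TBB)  // hypothetical flag, not an existing option
#include <tbb/blocked_range.h>
#include <tbb/parallel_for.h>
#endif

// Calls f(i) for every i in [0, n). With the flag set, TBB splits the index
// range into blocks and runs them on its worker threads; no PSTL involved.
template <typename F>
void for_each_index(std::size_t n, F f) {
#if defined(SKETCH_USE_TBB)
  tbb::parallel_for(tbb::blocked_range<std::size_t>(0, n),
                    [&f](const tbb::blocked_range<std::size_t>& r) {
                      for (std::size_t i = r.begin(); i != r.end(); ++i) f(i);
                    });
#else
  for (std::size_t i = 0; i < n; ++i) f(i);
#endif
}

// Example caller: each index writes only its own output slot, so the loop
// body is safe to run concurrently.
inline void square_into(const std::vector<int>& in, std::vector<int>& out) {
  assert(in.size() == out.size());
  for_each_index(in.size(),
                 [&](std::size_t i) { out[i] = in[i] * in[i]; });
}
```

Algorithms such as sort or scan would stay serial in this mode, which is why it is expected to be slower than the full PSTL path.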

pca006132 mentioned this issue May 24, 2024

start removing thrust #823

Closed

elalish added this to the v3.0 milestone Jun 16, 2024

elalish mentioned this issue Jul 7, 2024

Bye-bye thrust #856

Merged

elalish closed this as completed in #856 Jul 8, 2024
