Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP][SYCL] Implement sycl_special_class attribute #2091

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions clang/include/clang/Basic/Attr.td
Original file line number Diff line number Diff line change
Expand Up @@ -1135,6 +1135,20 @@ def SYCLKernel : InheritableAttr {
let Documentation = [SYCLKernelDocs];
}

def SYCLSpecialClass: InheritableAttr {
Copy link
Contributor

@bader bader Sep 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: @erichkeane recently added SYCLRequiresDecomposition attribute, which probably can be re-used instead of SYCLSpecialClass.
15e62c2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SYCLRequiresDecomposition attribute appears on user-defined struct kernel arguments if they should be passed to the kernel field-by-field, then these structs are constructed back on kernel side without init method usage. Classes which we want to identify with SYCLSpecialClass (accessor, sampler, stream) are handled specially with init method. So, technically right now SYCLRequiresDecomposition attribute works in a bit other use case. To re-use this attribute we will need to change some logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

presumably SYCLSpecialClass is a subset of the SYCLRequiresDecomposition. I think we still would need BOTH, as the Special-class needs to understand how to deal with the 'init' function, but it should also be translated to the requires-decomp.

let Spellings = [Clang<"sycl_special_class">];
let Subjects = SubjectList<[CXXRecord]>;
let LangOpts = [SYCLIsDevice, SYCLIsHost];
// TODO: Add doc
let Documentation = [Undocumented];
let Args = [
EnumArgument<"SpecialClassKind", "SpecialClassKind",
[ "accessor", "sampler", "stream", "" ],
[ "Accessor", "Sampler", "Stream", "Generic" ], 1>
Comment on lines +1145 to +1147
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mainly looked at the attribute definition here.

This seems to me to be very specific to SYCL and does not cover every special types (like specialization constant). Also if you want to add a new one, you need to change the compiler.

One of the idea I had was to make the attribute sensitive to __init and __finalize (if present) and maintain the derivation of the kernel arguments from the __init argument list. This allow to add new special types without the need to touch the compiler and you are actually SYCL agnostic. As this is a special class, the runtime should be able to query what the class is and act accordingly. I never had the time to fully explore and test the idea, so it might be flawed. There is also the case of the specialization constant, where you may or may not want to pass the value as an argument, but maybe this could simply be handled using macros.

One drawback I see from a "generic attribute" point of view, this is C++ specific as __init and __finalize are expected to be member functions. But the functions could be specify as part of the argument.

I think this might be OpenCL/SPIR-V specific attribute instead of SYCL specific attribute

I disagree, OpenCL doesn't need to apply any kind of processing to build the entry point. SPIR-V is neutral to that regard as well.

This is to allow regularization to an underlying programming model. So OpenCL or CUDA at the moment, but we could very well imagine something OpenMP or Vulkan compatible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to me to be very specific to SYCL and does not cover every special types (like specialization constant). Also if you want to add a new one, you need to change the compiler.

Yes, it doesn't cover specialization constant, I marked it as unfinished TODO in the PR description. In case of new special type... If the new special type needs some special handling that differs from others, you need to change the compiler anyway, otherwise "generic" value of the attribute can be used.

One of the idea I had was to make the attribute sensitive to __init and __finalize (if present) and maintain the derivation of the kernel arguments from the __init argument list.

Could you please describe how it would look like? I'm not sure that I got the idea.

As this is a special class, the runtime should be able to query what the class is and act accordingly.

With the current implementation the runtime takes the address of kernel object and uses offsets encoded by the integration header. It also uses information about kernel object field types to cast area of memory accessed by using of address and offset to proper type and perform handling, it cannot query the type from raw memory. In this case we probably could teach integration header to emit information that some kernel argument is a 'special class' and add some base class for each class that is handled specially by the compiler, so this base class will hold kind of special class and it will be possible to query it, I guess. But I'm not really sure since I don't have a lot of experience with runtime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the new special type needs some special handling that differs from others, you need to change the compiler anyway, otherwise "generic" value of the attribute can be used.

This attribute doesn't do much. Special types processing boils down to fully or partially breakdown the structure into valid fields for your programming model. The rest is the associated processing in the runtime.

Could you please describe how it would look like? I'm not sure that I got the idea.

I'll take the accessor as an example.
ATM (and last time I checked) the processing of the special type sycl::accessor is roughly done as follow:

  • Check if sycl special type sycl::accessor, if yes
  • Look up for the member function sycl::accessor::__init
  • Copy the sycl::accessor::__init arguments into the kernel function argument list
  • Allocate a sycl::accessor object and call sycl::accessor::__init with the kernel argument

This is spread out in different places, but that's the basic idea. The important thing to notice is if you replace sycl::accessor by sycl::sampler, sycl::stream or other, the processing remains the same. The type sycl::stream has an extra step which is to call __finalize at the end of the kernel.

So in a more generic way, you could have __attribute__((sycl_special_class)) (leaving the renaming exercise for later) that actually trigger the process above in a more generic way:

  • Check if <typename> has attribute sycl_special_class, if yes
  • Look up for the member function <typename>::__init
  • Add kernel function has friend of the class
  • Copy the <typename>::__init arguments into the kernel function argument list
  • Allocate a <typename> object and call <typename>::__init with the kernel argument
  • Look up for the member function <typename>::__finalize, if does have the function
  • Call <typename>::__finalize at the end of the kernel

So for example:

class __attribute__((sycl_special_class)) MySpecialType {
  int Field1;
  int Field2;
  void __init(int F1) {
    Field1 = F1;
    Field2 = F1;
  }
  void __finalize() {}
public:
  MySpecialType() = default;
  int getF2() const { return Field2; }
};

If used in a kernel argument

MySpecialType T;
cgh.single_task([=]() {
  T.getF2();
});

This would trigger the following kernel entry point in the AST:

void __sycl_kernel(int F1) {
  MySpecialType T;
  T.__init(F1);
  // finish rebuilding the lambda + call
  T.__finalize()
}

With the current implementation the runtime takes the address of kernel object and uses offsets encoded by the integration header. It also uses information about kernel object field types to cast area of memory accessed by using of address and offset to proper type and perform handling, it cannot query the type from raw memory. In this case we probably could teach integration header to emit information that some kernel argument is a 'special class' and add some base class for each class that is handled specially by the compiler, so this base class will hold kind of special class and it will be possible to query it, I guess. But I'm not really sure since I don't have a lot of experience with runtime.

So the integration header already emits what kind of fields it is processing. Changing to this would only require the capability to query it in the runtime. Since the std-layout requirement was lifted, a simple base class and reinterpret cast should do the job.

Another approach is to make the attribute take an ID as argument (the ID is chosen by the runtime) and forward it in the integration header (so in place of the field currently emitted). The runtime then only have to read-it directly. Either way, the compiler remains agnostic to the field it is processing.

Hopes this make the idea clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it definitely makes the idea clearer. Thanks for so detailed explanation!
I'll give it a try. It will require more changes across front-end and runtime library, so it will take some time. I will update this PR once I have something ready.

];
let PragmaAttributeSupport = 0;
}

// Marks functions which must not be vectorized via horizontal SIMT widening,
// e.g. because the function is already vectorized. Used to mark SYCL
// explicit SIMD kernels and functions.
Expand Down
28 changes: 28 additions & 0 deletions clang/lib/Sema/SemaDeclAttr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7219,6 +7219,31 @@ static void handleSYCLKernelAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
handleSimpleAttribute<SYCLKernelAttr>(S, D, AL);
}

static void handleSYCLSpecialClassAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
if (S.LangOpts.SYCLIsHost)
return;

SYCLSpecialClassAttr::SpecialClassKind Kind;
if (AL.getNumArgs() == 0)
Kind = SYCLSpecialClassAttr::Generic;
else {
// Check the attribute arguments.
if (!AL.isArgIdent(0)) {
S.Diag(AL.getLoc(), diag::err_attribute_argument_n_type)
<< AL << 0 << AANT_ArgumentIdentifier;
return;
}

IdentifierInfo *II = AL.getArgAsIdent(0)->Ident;
if (!SYCLSpecialClassAttr::ConvertStrToSpecialClassKind(II->getName(),
Kind)) {
S.Diag(AL.getLoc(), diag::warn_attribute_type_not_supported) << AL << II;
return;
}
}
D->addAttr(::new (S.Context) SYCLSpecialClassAttr(S.Context, AL, Kind));
}

static void handleDestroyAttr(Sema &S, Decl *D, const ParsedAttr &A) {
if (!cast<VarDecl>(D)->hasGlobalStorage()) {
S.Diag(D->getLocation(), diag::err_destroy_attr_on_non_static_var)
Expand Down Expand Up @@ -7600,6 +7625,9 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D,
case ParsedAttr::AT_SYCLSimd:
handleSimpleAttribute<SYCLSimdAttr>(S, D, AL);
break;
case ParsedAttr::AT_SYCLSpecialClass:
handleSYCLSpecialClassAttr(S, D, AL);
break;
case ParsedAttr::AT_SYCLDevice:
handleSYCLDeviceAttr(S, D, AL);
break;
Expand Down
23 changes: 20 additions & 3 deletions clang/lib/Sema/SemaSYCL.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2692,17 +2692,33 @@ SYCLIntegrationHeader::SYCLIntegrationHeader(DiagnosticsEngine &_Diag,
// -----------------------------------------------------------------------------

bool Util::isSyclAccessorType(const QualType &Ty) {
return isSyclType(Ty, "accessor", true /*Tmpl*/);
const CXXRecordDecl *RecTy = Ty->getAsCXXRecordDecl();
if (!RecTy)
return false; // only classes/structs supported
if (const auto *A = RecTy->getAttr<SYCLSpecialClassAttr>())
return A->getSpecialClassKind() == SYCLSpecialClassAttr::Accessor;
return false;
}

bool Util::isSyclSamplerType(const QualType &Ty) {
return isSyclType(Ty, "sampler");
const CXXRecordDecl *RecTy = Ty->getAsCXXRecordDecl();
if (!RecTy)
return false; // only classes/structs supported
if (const auto *A = RecTy->getAttr<SYCLSpecialClassAttr>())
return A->getSpecialClassKind() == SYCLSpecialClassAttr::Sampler;
return false;
}

bool Util::isSyclStreamType(const QualType &Ty) {
return isSyclType(Ty, "stream");
const CXXRecordDecl *RecTy = Ty->getAsCXXRecordDecl();
if (!RecTy)
return false; // only classes/structs supported
if (const auto *A = RecTy->getAttr<SYCLSpecialClassAttr>())
return A->getSpecialClassKind() == SYCLSpecialClassAttr::Stream;
return false;
}

// TODO: Remove this once structs decomposing is optimized
bool Util::isSyclHalfType(const QualType &Ty) {
const StringRef &Name = "half";
std::array<DeclContextDesc, 5> Scopes = {
Expand All @@ -2714,6 +2730,7 @@ bool Util::isSyclHalfType(const QualType &Ty) {
return matchQualifiedTypeName(Ty, Scopes);
}

// TODO: Do we need an attribute for this one as well?
bool Util::isSyclSpecConstantType(const QualType &Ty) {
const StringRef &Name = "spec_constant";
std::array<DeclContextDesc, 4> Scopes = {
Expand Down
8 changes: 4 additions & 4 deletions clang/test/CodeGenSYCL/Inputs/sycl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ struct sampler_impl {
#endif
};

class sampler {
class __attribute__((sycl_special_class(sampler))) sampler {
struct sampler_impl impl;
#ifdef __SYCL_DEVICE_ONLY__
void __init(__ocl_sampler_t Sampler) { impl.m_Sampler = Sampler; }
Expand Down Expand Up @@ -128,7 +128,7 @@ struct _ImplT {
template <typename dataT, int dimensions, access::mode accessmode,
access::target accessTarget = access::target::global_buffer,
access::placeholder isPlaceholder = access::placeholder::false_t>
class accessor {
class __attribute__((sycl_special_class(accessor))) accessor {

public:
void use(void) const {}
Expand Down Expand Up @@ -189,7 +189,7 @@ struct _ImageImplT {
};

template <typename dataT, int dimensions, access::mode accessmode>
class accessor<dataT, dimensions, accessmode, access::target::image, access::placeholder::false_t> {
class __attribute__((sycl_special_class(accessor))) accessor<dataT, dimensions, accessmode, access::target::image, access::placeholder::false_t> {
public:
void use(void) const {}
template <typename... T>
Expand Down Expand Up @@ -310,7 +310,7 @@ class handler {
}
};

class stream {
class __attribute__((sycl_special_class(stream))) stream {
public:
stream(unsigned long BufferSize, unsigned long MaxStatementSize,
handler &CGH) {}
Expand Down
6 changes: 3 additions & 3 deletions clang/test/SemaSYCL/Inputs/sycl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@ struct DeviceValueType<dataT, access::target::local> {
template <typename dataT, int dimensions, access::mode accessmode,
access::target accessTarget = access::target::global_buffer,
access::placeholder isPlaceholder = access::placeholder::false_t>
class accessor {
class __attribute__((sycl_special_class(accessor))) accessor {

public:
void use(void) const {}
Expand Down Expand Up @@ -146,7 +146,7 @@ struct _ImageImplT {
};

template <typename dataT, int dimensions, access::mode accessmode>
class accessor<dataT, dimensions, accessmode, access::target::image, access::placeholder::false_t> {
class __attribute__((sycl_special_class(accessor))) accessor<dataT, dimensions, accessmode, access::target::image, access::placeholder::false_t> {
public:
void use(void) const {}
template <typename... T>
Expand All @@ -165,7 +165,7 @@ struct sampler_impl {
#endif
};

class sampler {
class __attribute__((sycl_special_class(sampler))) sampler {
struct sampler_impl impl;
#ifdef __SYCL_DEVICE_ONLY__
void __init(__ocl_sampler_t Sampler) { impl.m_Sampler = Sampler; }
Expand Down
14 changes: 7 additions & 7 deletions sycl/include/CL/sycl/accessor.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -730,7 +730,7 @@ class __image_array_slice__ {
/// \ingroup sycl_api_acc
template <typename DataT, int Dimensions, access::mode AccessMode,
access::target AccessTarget, access::placeholder IsPlaceholder>
class accessor :
class __SYCL_SPECIAL_CLASS(accessor) accessor :
#ifndef __SYCL_DEVICE_ONLY__
public detail::AccessorBaseHost,
#endif
Expand Down Expand Up @@ -1350,8 +1350,8 @@ accessor(buffer<DataT, Dimensions, AllocatorT>, handler, Type1, Type2, Type3,
/// \ingroup sycl_api_acc
template <typename DataT, int Dimensions, access::mode AccessMode,
access::placeholder IsPlaceholder>
class accessor<DataT, Dimensions, AccessMode, access::target::local,
IsPlaceholder> :
class __SYCL_SPECIAL_CLASS(accessor) accessor<
DataT, Dimensions, AccessMode, access::target::local, IsPlaceholder> :
#ifndef __SYCL_DEVICE_ONLY__
public detail::LocalAccessorBaseHost,
#endif
Expand Down Expand Up @@ -1516,8 +1516,8 @@ class accessor<DataT, Dimensions, AccessMode, access::target::local,
/// \ingroup sycl_api_acc
template <typename DataT, int Dimensions, access::mode AccessMode,
access::placeholder IsPlaceholder>
class accessor<DataT, Dimensions, AccessMode, access::target::image,
IsPlaceholder>
class __SYCL_SPECIAL_CLASS(accessor) accessor<
DataT, Dimensions, AccessMode, access::target::image, IsPlaceholder>
: public detail::image_accessor<DataT, Dimensions, AccessMode,
access::target::image, IsPlaceholder> {
public:
Expand Down Expand Up @@ -1580,8 +1580,8 @@ class accessor<DataT, Dimensions, AccessMode, access::target::host_image,
/// \ingroup sycl_api_acc
template <typename DataT, int Dimensions, access::mode AccessMode,
access::placeholder IsPlaceholder>
class accessor<DataT, Dimensions, AccessMode, access::target::image_array,
IsPlaceholder>
class __SYCL_SPECIAL_CLASS(accessor) accessor<
DataT, Dimensions, AccessMode, access::target::image_array, IsPlaceholder>
: public detail::image_accessor<DataT, Dimensions + 1, AccessMode,
access::target::image, IsPlaceholder> {
#ifdef __SYCL_DEVICE_ONLY__
Expand Down
6 changes: 6 additions & 0 deletions sycl/include/CL/sycl/detail/defines.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -59,3 +59,9 @@
#else
#define __SYCL_INLINE_CONSTEXPR static constexpr
#endif

#if __has_attribute(sycl_special_class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor. Suggest adding check: #ifdef SYCL_DEVICE_ONLY

#define __SYCL_SPECIAL_CLASS(kind) __attribute__((sycl_special_class(kind)))
#else
#define __SYCL_SPECIAL_CLASS(kind)
#endif
2 changes: 1 addition & 1 deletion sycl/include/CL/sycl/sampler.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ class image_accessor;
/// \sa sycl_api_acc
///
/// \ingroup sycl_api
class __SYCL_EXPORT sampler {
class __SYCL_EXPORT __SYCL_SPECIAL_CLASS(sampler) sampler {
public:
sampler(coordinate_normalization_mode normalizationMode,
addressing_mode addressingMode, filtering_mode filteringMode);
Expand Down
2 changes: 1 addition & 1 deletion sycl/include/CL/sycl/stream.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ inline __width_manipulator__ setw(int Width) {
/// vector and SYCL types to the console.
///
/// \ingroup sycl_api
class __SYCL_EXPORT stream {
class __SYCL_EXPORT __SYCL_SPECIAL_CLASS(stream) stream {
public:
stream(size_t BufferSize, size_t MaxStatementSize, handler &CGH);

Expand Down