Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement type imports and exports #7330

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

vouillon
Copy link
Contributor

Implementation of the Type Imports and Exports proposal, which allows Wasm module to export and import types:

        (type $File (export "File") (struct ...))
        (import "file" "File" (type $File (sub any)))

wasm-merge can combine several modules using this proposal, connecting type imports to corresponding type exports. To produce an output compatible with existing tools, an option --strip-type-exports can be used to omit any type export from the output.

From an implementation point of view, for imports, a new kind of heap type is used:

        (import "module" "base" (sub absheaptype))

Including the module and base names in the type definition works well with type canonicalization.

For exports, we separate type exports from other exports, since they don't have an internal name but a heap type:

        class TypeExport {
          Name name;
          HeapType heaptype; // exported type
        };

This is somewhat error-prone: to check for repeated exports, we need to check two tables. But the alternatives do not seem much better. If we put all exports together, then we have to make sure that we are never trying to access the internal name of a type import.

Implementation of the Type Imports and Exports proposal, which allows
Wasm module to export and import types:

        (type $File (export "File") (struct ...))
        (import "file" "File" (type $File (sub any)))

wasm-merge can be used to combine several modules using this proposal,
connecting type imports to corresponding type exports. To produce an
output compatible with existing tools, an option --strip-type-exports
can be use to omit any type export from the output.

From an implementation point of view, for imports, a new kind of heap
type is used:

        (import "module" "base" (sub absheaptype))

Including the module and base names in the type definition works well
with type canonicalisation.

For exports, we separate type exports from other exports, since they
don't have an internal name but a heap type:

        class TypeExport {
          Name name;
          HeapType heaptype; // exported type
        };

This is somewhat error-prone. Typically, to check for repeated exports,
we need to check two tables. But the alternatives do not seem much
better. If we put all export together, then we have to make sure that we
are never trying to access the internal name of a type import.
Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, this looks fantastic. In addition to the comments below, it would be good to add unit tests for type canonicalization and subtyping relationships in test/gtest/type-builder.cpp. It would also be good to add tests for MinifyImportsAndExports, StripTypeExports, and TypeMerging passes with type imports to check that they do the right thing, as well as some tests for errors such as trying to declare a subtype of an imported type.

Comment on lines +878 to +880
if (!ctx.in.takeSExprStart("sub"sv)) {
return ctx.makeAnyType(Unshared);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like the (sub absheaptype) clause is optional in the proposal overview. Is this shorthand documented elsewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mentioned in the Proposal Summary

The text format allows to omit the constraint, in which case it defaults to (sub any).

Comment on lines 3005 to 3007
if (inRecGroup) {
return ctx.in.err("type import not allowed in recursive group");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless this is part of the proposed text format extension, what do you think about deferring this check to TypeBuilder::build? Since all type definitions that are not in rec groups desugar to singleton rec groups, I think it would be more natural to allow type imports only in singleton rec groups, but also allow the rec group to be explicit in the text.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposal currently does not say anything about the abbreviation (type $t (import "foo" "bar").
I'll make the change.

@@ -935,6 +935,10 @@ struct ParseDeclsCtx : NullTypeParserCtx, NullInstrParserCtx {
std::vector<DefPos> dataDefs;
std::vector<DefPos> tagDefs;

// Type imports: name, positions of type and import names.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Type imports: name, positions of type and import names.
// Type imports: name, export names, import names, and type index.

@@ -572,6 +573,8 @@ bool shapeEq(HeapType a, HeapType b) {
return shapeEq(a.getArray(), b.getArray());
case HeapTypeKind::Cont:
WASM_UNREACHABLE("TODO: cont");
case HeapTypeKind::Import:
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to merge duplicate imports here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate imports are merged by the type canonicalization.

@@ -242,6 +246,11 @@ struct RecGroupHasher {
wasm::rehash(digest, 2381496927);
hash_combine(digest, hash(type.getContinuation()));
return digest;
case HeapTypeKind::Import:
assert(type.isContinuation());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy-paste error?

@@ -266,6 +275,8 @@ struct RecGroupHasher {

size_t hash(Continuation cont) { return hash(cont.type); }

size_t hash(Import import) { return hash(import.bound); }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, this should include the names.

@@ -2225,6 +2315,7 @@ bool isValidSupertype(const HeapTypeInfo& sub, const HeapTypeInfo& super) {
return typer.isSubType(sub.struct_, super.struct_);
case HeapTypeKind::Array:
return typer.isSubType(sub.array, super.array);
case HeapTypeKind::Import:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should return false here. It seems possible for someone to accidentally set a supertype on an import entry in a TypeBuilder. We should also check whether the supertype is an import here.

Comment on lines 2368 to 2371
if (!info.import.bound.isShared()) {
return TypeBuilder::ErrorReason::InvalidBoundType;
}
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to just inherit sharedness from the bound rather than requiring it to be set separately and match. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, we should do that.

(import "second" "g" (func $g (param (ref $t2))))

;; Check that the function parameters are updated
(func (export "f") (param (ref $t1) (ref $t2) (ref $t3) (ref $t4))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you give these functions $names, then the test output update script should be able to match them up with functions in the output and put them next to each other.

@tlively
Copy link
Member

tlively commented Feb 28, 2025

Please re-request review from me when you're ready :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants