Abris #239 part 2 - Upgrade to spark 3.2 #245
Closed
Upgrade to Spark 3.2.0. Second part of a fix for #239.
This PR builds on top of the changes already requested in PR #244. Since the Spark upgrade fails without those changes, I have included them in this PR as well, so the branch can be checked out and tested. If you prefer to have just the Spark change in this PR, let me know and I'll remove the commits from #244.
Currently, the suggested changes only work for Spark 3.2. Due to changes in Spark, the `AvroDataToCatalyst` and `CatalystDataToAvro` classes need to override a new `withNewChildInternal` method, which becomes a problem when supporting Spark 3.2 and older versions simultaneously: without the override, compilation against Spark 3.2 fails and would require the Catalyst classes to be abstract; with it, Spark 3.1 and older fail to compile because the new method doesn't override anything. There might be an elegant way to solve this with traits or similar, but with my limited Scala experience I haven't managed to find one. This is likely the deciding factor for whether Spark 3.2 support needs to be maintained in a separate branch or not.

`AvroDeserializer` now throws a generic `scala.Exception` in a few places instead of `IncompatibleSchemaException`, since the latter is no longer accessible. It's not great, but once #240 is merged the ABRiS `AvroDeserializer` will be replaced by the one from Spark, so this is either fine or these changes can be piggybacked onto #240.

I have run the tests for Spark 3.2 with Scala 2.12 and they are green. I haven't tested Scala 2.11, since Spark 3.2 no longer supports it (and all older Spark versions obviously fail because `withNewChildInternal` doesn't override anything).
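For reference, here is a minimal, self-contained sketch of the override pattern described above. The trait and class names mirror Catalyst's, but none of this is the real Spark API; `Lit` and the simplified `UnaryLike` are stand-ins for illustration only:

```scala
// Stand-in for Catalyst's Expression hierarchy (NOT the real Spark API).
trait Expression

// In Spark 3.2 the tree-node hierarchy declares an abstract
// withNewChildInternal that every concrete unary expression must implement.
// Sketched here as a plain trait; the real one lives in Catalyst's TreeNode machinery.
trait UnaryLike[T] {
  def child: Expression
  def withNewChildInternal(newChild: Expression): T
}

// Hypothetical leaf expression, just for demonstration.
case class Lit(value: Int) extends Expression

// The override needed in AvroDataToCatalyst / CatalystDataToAvro boils down
// to copy(child = newChild). On Spark <= 3.1 this method overrides nothing,
// which is the cross-version compile failure described above.
case class AvroDataToCatalyst(child: Expression)
    extends Expression with UnaryLike[AvroDataToCatalyst] {
  override def withNewChildInternal(newChild: Expression): AvroDataToCatalyst =
    copy(child = newChild)
}
```

Because the method returns a copy via the case-class `copy`, the rewritten expression keeps everything except the replaced child, which matches how Spark 3.2 rewrites expression trees immutably.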