[Performance] Use a faster serialization protocol within security plugin #2780
Comments
Discovered an issue with deserialization of |
This works with FST and JDK Serialization as they have a check for the presence of Protostuff, while constructing the RuntimeSchema for |
Opened an issue with protostuff - protostuff/protostuff#349 to hear what the maintainers have to say. |
Updated the issue description with more accurate benchmark results, and highlighted the protostuff issue. |
Will a delegate schema help here with the protostuff issue? https://protostuff.github.io/documentation/runtime-schema-delegate? |
A delegate will enable us to specify how we want to write and read a certain object to/from the I/O. It would have been helpful if the objects that we are dealing with were maintained by us. In this case For eg. |
Hi @parasjain1, thank you for taking the time to file this issue. Your findings look convincing and the maintainers would be happy to review a pull request swapping to a different serialization method. |
Sure @scrawfor99. I'm still evaluating the best of the available alternatives and will be working on the PR. Thanks! |
Revisited the |
Updated the issue description after further deep dive and exploration. To conclude, we'll be using custom serialization built within OpenSearch for serializing headers in security plugin. |
Raised a PR against main of OpenSearch to add serialization support for |
Drafted PR for changes in security plugin - |
PR ready for review - #2802 |
Why not? This proposal seems to be about the serialized artifacts, not about the format of the data. So long as the data is unchanged, this seems like a great change to migrate into the 2.X release line. |
It's because of the complication with the rolling upgrade scenario. During a rolling upgrade we'll have a mixed cluster, with the old nodes understanding JDK serialization and the new ones custom serialization. To establish communication between the two types of nodes, we depend on the remote node's version. Say we release this change with V3.0.0: the code changes assume that a node running an OS version prior to V3.0.0 understands JDK serialization, and will (de)serialize the headers for it using JDK serialization only. This will not work if we backport it to 2.x, as that assumption will no longer hold. |
Isn't the format of the data the same? Help me understand how this wouldn't be compatible in a rolling upgrade. |
Double checking - if we are changing the protocol of the traffic itself, isn't this change part of an overall OpenSearch effort? Are those performance numbers inclusive of changes in all of OpenSearch, or only of changes in how the security plugin operates? If we are scoped to changes in the security plugin's serialization components, can we leverage information about the cluster state to know if we are running in a mixed mode? |
The change is in how security plugin serializes certain security headers. The scope is limited to security plugin only.
Can you please point to the specific cluster service methods that can help here? Also, is this suggestion aimed at making a backport easier? |
The format is going to change only for the security headers that were earlier being serialized via JDK serialization protocol in security plugin's |
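The version gate discussed in this thread can be sketched roughly as below. This is only an illustration of the rolling-upgrade reasoning, not the plugin's code: SketchVersion, HeaderFormat, HeaderFormatSelector and CUSTOM_SINCE are hypothetical names, and the 3.0.0 threshold is taken from the example in the comment above.

```java
// Hypothetical sketch: pick the header format from the remote node's version.
// None of these types exist in OpenSearch or the security plugin.
final class SketchVersion implements Comparable<SketchVersion> {
    final int major;
    final int minor;
    final int patch;

    SketchVersion(int major, int minor, int patch) {
        this.major = major;
        this.minor = minor;
        this.patch = patch;
    }

    @Override
    public int compareTo(SketchVersion other) {
        if (major != other.major) return Integer.compare(major, other.major);
        if (minor != other.minor) return Integer.compare(minor, other.minor);
        return Integer.compare(patch, other.patch);
    }
}

enum HeaderFormat { JDK, CUSTOM }

final class HeaderFormatSelector {
    // Assumed first release that understands custom serialization.
    static final SketchVersion CUSTOM_SINCE = new SketchVersion(3, 0, 0);

    // Nodes older than CUSTOM_SINCE are assumed to understand JDK serialization only.
    static HeaderFormat forRemote(SketchVersion remote) {
        return remote.compareTo(CUSTOM_SINCE) >= 0 ? HeaderFormat.CUSTOM : HeaderFormat.JDK;
    }

    public static void main(String[] args) {
        System.out.println(forRemote(new SketchVersion(2, 11, 0)));
        System.out.println(forRemote(new SketchVersion(3, 0, 0)));
    }
}
```

With a gate like this, every sender has to agree on the same threshold. If the change were backported to a 2.x release, that release would need its own, lower threshold, and nodes already running with the 3.0.0 assumption would still send JDK-serialized headers to it.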
Use custom serialization in security plugin. - Resolves #2780 Signed-off-by: Paras Jain <parasjaz@amazon.com> Signed-off-by: Peter Nied <peternied@hotmail.com> Co-authored-by: Paras Jain <parasjaz@amazon.com> Co-authored-by: Peter Nied <peternied@hotmail.com>
Problem
The JDK serialization used by the security plugin to serialize and deserialize various headers is slow.
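For context, a JDK-serialization-plus-Base64 round trip of the kind described here looks roughly like the sketch below. JdkBase64Sketch is a hypothetical stand-in, not the plugin's Base64Helper, and the real helper may apply additional checks that are not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical stand-in for a header helper built on JDK serialization.
final class JdkBase64Sketch {

    // Serialize with ObjectOutputStream, then Base64-encode so the value fits in a text header.
    static String serializeObject(Serializable obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    // Base64-decode, then rebuild the object graph with ObjectInputStream.
    static Object deserializeObject(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> roles = new ArrayList<>(Arrays.asList("admin", "readall"));
        String header = serializeObject(roles);
        System.out.println(deserializeObject(header));
    }
}
```

JDK serialization writes class descriptors alongside the field data, which is one reason it tends to be both slower and larger on the wire than a hand-written format.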
Proposal
This is a proposal to change the implementation of Base64Helper::serializeObject and Base64Helper::deserializeObject to use a faster serialization protocol. I explored Fast Serialization, Protostuff, Kryo, Avro, and OpenSearch's Custom Serialization as alternatives to JDK serialization and ran a few benchmarks. Results are attached below.
Benchmarking Environment
Framework used - JMH, 1000 warm-up iterations, 30000 test iterations
EC2 InstanceType - c5.2xlarge
JDK - Corretto JDK 11
OS - Amazon Linux 2 x86_64
java.util.Collections$SynchronizedMap we'll have to register separate serializers. There's a repo, kryo-serializers, that has many such serializers that we can use. Given we already have a highly optimised custom serialization framework (StreamOutput, StreamInput) within OpenSearch, expending effort to integrate with another library seems unnecessary.
BytesStreamOutput and BytesStreamInput classes is a promising approach. It too is highly performant. For the classes that are defined within the security plugin, such as User and SourceFieldsContext, the Writeable interface can be implemented. For classes such as InetSocketAddress, which we cannot change, we'll have to add writer and reader methods to the StreamOutput and StreamInput classes to be able to use the writeGenericObject and readGenericObject methods. This is in line with how OpenSearch deals with third-party classes today. [source code]
To conclude, we propose to use custom serialization for headers in the security plugin.
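The two cases described above (a class we own implementing a Writeable-style interface, and a third-party class handled by separate writer and reader methods) can be sketched as follows. To stay self-contained, this uses java.io.DataOutputStream and DataInputStream as stand-ins for OpenSearch's StreamOutput and StreamInput; SketchWriteable, SketchUser, SketchAddressCodec and SketchRoundTrip are hypothetical names, and the field layout is an assumption rather than the plugin's actual wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Stand-in for a Writeable-style contract: the class writes its own fields.
interface SketchWriteable {
    void writeTo(DataOutputStream out) throws IOException;
}

// A class we own: implements the interface and has a matching reading constructor.
final class SketchUser implements SketchWriteable {
    final String name;
    final List<String> roles;

    SketchUser(String name, List<String> roles) {
        this.name = name;
        this.roles = roles;
    }

    // Reads fields in exactly the order writeTo wrote them.
    SketchUser(DataInputStream in) throws IOException {
        this.name = in.readUTF();
        int count = in.readInt();
        List<String> read = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            read.add(in.readUTF());
        }
        this.roles = read;
    }

    @Override
    public void writeTo(DataOutputStream out) throws IOException {
        out.writeUTF(name);
        out.writeInt(roles.size());
        for (String role : roles) {
            out.writeUTF(role);
        }
    }
}

// A class we cannot change: the stream side supplies the writer and reader.
// Assumes the address is resolved (getAddress() is non-null).
final class SketchAddressCodec {
    static void write(DataOutputStream out, InetSocketAddress addr) throws IOException {
        byte[] ip = addr.getAddress().getAddress();
        out.writeByte(ip.length);
        out.write(ip);
        out.writeInt(addr.getPort());
    }

    static InetSocketAddress read(DataInputStream in) throws IOException {
        byte[] ip = new byte[in.readUnsignedByte()];
        in.readFully(ip);
        int port = in.readInt();
        return new InetSocketAddress(InetAddress.getByAddress(ip), port);
    }
}

final class SketchRoundTrip {
    static SketchUser user(SketchUser user) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            user.writeTo(new DataOutputStream(bytes));
            return new SketchUser(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static InetSocketAddress address(InetSocketAddress addr) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            SketchAddressCodec.write(new DataOutputStream(bytes), addr);
            return SketchAddressCodec.read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because each class writes only its field values in a fixed order, no class metadata goes on the wire. The trade-off is that the reader and writer must be kept in step by hand.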
Solution
This change is proposed to be introduced with OS 3.0, with no intention to backport it. We can break the solution down into the following action items -
StreamOutput and StreamInput classes: add writer and reader methods respectively for third-party classes directly involved in serialization within the security plugin. [will update the list below]
Base64Helper::serialize and Base64Helper.deserialize methods: use custom serialization.
I've raised an initial draft PR for serialization using protostuff and am working towards testing the version upgrade scenario (from OS2.5 to OS2.7). Currently, the change is assumed to be introduced as part of the OS2.7 release for testing purposes. We may need to bump up this version.
Will raise another PR with custom serialization.
Next Steps