Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leak in HkdfStreamingPrf #52

Closed
wants to merge 1 commit into from

Conversation

behrooz-stripe
Copy link
Contributor

@behrooz-stripe behrooz-stripe commented Feb 28, 2025

Repetitive calls to prf.compute() cause a memory leak. For example the following toy app will drain memory quickly:

public class Main {
  public static void main(String[] args) throws GeneralSecurityException, IOException {
    System.out.println("Starting..." + ProcessHandle.current().pid());
    String secretRawString = "...a sample HkdfPrfKey..."
    PrfConfig.register();
    KeysetHandle tinkHandle = CleartextKeysetHandle.read(JsonKeysetReader.withString(secretRawString));
    for (Long i = 0L; i < 1_000_000_000L; i++) {
      PrfSet prf = tinkHandle.getPrimitive(PrfSet.class);
      prf.computePrimary(i.toString().getBytes(), 32);
    }
    System.out.println("Done. Press any key to exit...");
    System.in.read();
  }
}

this is because in HkdfInputStream class, the allocated ByteBuffer is a direct buffer vs. a heap buffer. Direct buffers are allocated "by calling through to native OS allocs, bypassing the standard JVM heap... memory-storage areas of direct buffers are not subject to garbage collection because they are outside the standard JVM heap." See here for more info.
Now the question might be the capacity is set to zero (buffer = ByteBuffer.allocateDirect(0);) however the overhead involved the pointer itself adds up on MANY calls. This basically means repetitive calls to prf.compute are same as this toy app:

public class Main {
  public static void main(String[] args) {
    System.out.println("Starting..." + ProcessHandle.current().pid());
    for (Long i = 0L; i < 1_000_000_000L; i++) {
      ByteBuffer byteBuffer = ByteBuffer.allocate(0);
      byteBuffer.mark();
    }
    System.out.println("Done. Press any key to exit...");
  }
}

Which sure enough will drain memory in matter of seconds. In practice since prf compute takes more time and is not called in such aggressive for loops; the leak is a lot slower and harder to notice. In our application, we had many repetitive calls resulting in an eventual drainage of machine memory. I have run these apps under a memory profiler and have confirmed the same findings. Stacktrace:

	at jdk.internal.misc.Unsafe.allocateMemory0(java.base@17.0.14/Native Method)
	at jdk.internal.misc.Unsafe.allocateMemory(java.base@17.0.14/Unsafe.java:630)
	at java.nio.DirectByteBuffer.<init>(java.base@17.0.14/DirectByteBuffer.java:125)
	at java.nio.ByteBuffer.allocateDirect(java.base@17.0.14/ByteBuffer.java:332)
	at com.google.crypto.tink.subtle.prf.HkdfStreamingPrf$HkdfInputStream.initialize(HkdfStreamingPrf.java:88)
	at com.google.crypto.tink.subtle.prf.HkdfStreamingPrf$HkdfInputStream.read(HkdfStreamingPrf.java:129)
	at com.google.crypto.tink.subtle.prf.PrfImpl.readBytesFromStream(PrfImpl.java:45)
	at com.google.crypto.tink.subtle.prf.PrfImpl.compute(PrfImpl.java:67)
	at com.google.crypto.tink.prf.PrfSet.computePrimary(PrfSet.java:44)
...

The fix is simply to swap allocateDirect with allocate which is heap managed and garbage collected. I've verified the fix does indeed solve the leak.

Copy link

google-cla bot commented Feb 28, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@behrooz-stripe behrooz-stripe mentioned this pull request Feb 28, 2025
fix memory leak in HkdfStreamingPrf
copybara-service bot pushed a commit that referenced this pull request Mar 7, 2025
Fix memory leak in HkdfStreamingPrf

Repetitive calls to `prf.compute()` cause a memory leak. For example the following toy app will drain memory quickly:
```
public class Main {
  public static void main(String[] args) throws GeneralSecurityException, IOException {
    System.out.println("Starting..." + ProcessHandle.current().pid());
    String secretRawString = "...a sample HkdfPrfKey..."
    PrfConfig.register();
    KeysetHandle tinkHandle = CleartextKeysetHandle.read(JsonKeysetReader.withString(secretRawString));
    for (Long i = 0L; i < 1_000_000_000L; i++) {
      PrfSet prf = tinkHandle.getPrimitive(PrfSet.class);
      prf.computePrimary(i.toString().getBytes(), 32);
    }
    System.out.println("Done. Press any key to exit...");
    System.in.read();
  }
}
```
this is because in `HkdfInputStream` class, the [allocated ByteBuffer](https://github.com/tink-crypto/tink-java/blob/1240f412cce662317f7fe9130a6c3a0e1fa6916a/src/main/java/com/google/crypto/tink/subtle/prf/HkdfStreamingPrf.java#L115) is a `direct` buffer vs. a `heap buffer`. Direct buffers are allocated _"by calling through to native OS allocs, bypassing the standard JVM heap... memory-storage areas of direct buffers are not subject to garbage collection because they are outside the standard JVM heap."_ See [here for more info](https://stackoverflow.com/questions/5670862/bytebuffer-allocate-vs-bytebuffer-allocatedirect).
Now the question might be the capacity is set to zero (`buffer = ByteBuffer.allocateDirect(0);`) however the overhead involved the pointer itself adds up on MANY calls. This basically means repetitive calls to `prf.compute` are same as this toy app:
```
public class Main {
  public static void main(String[] args) {
    System.out.println("Starting..." + ProcessHandle.current().pid());
    for (Long i = 0L; i < 1_000_000_000L; i++) {
      ByteBuffer byteBuffer = ByteBuffer.allocate(0);
      byteBuffer.mark();
    }
    System.out.println("Done. Press any key to exit...");
  }
}
```
Which sure enough will drain memory in matter of seconds. In practice since prf compute takes more time and is not called in such aggressive for loops; the leak is a lot slower and harder to notice. In our application, we had many repetitive calls resulting in an eventual drainage of machine memory. I have run these apps under a memory profiler and have confirmed the same findings. Stacktrace:
```
	at jdk.internal.misc.Unsafe.allocateMemory0(java.base@17.0.14/Native Method)
	at jdk.internal.misc.Unsafe.allocateMemory(java.base@17.0.14/Unsafe.java:630)
	at java.nio.DirectByteBuffer.<init>(java.base@17.0.14/DirectByteBuffer.java:125)
	at java.nio.ByteBuffer.allocateDirect(java.base@17.0.14/ByteBuffer.java:332)
	at com.google.crypto.tink.subtle.prf.HkdfStreamingPrf$HkdfInputStream.initialize(HkdfStreamingPrf.java:88)
	at com.google.crypto.tink.subtle.prf.HkdfStreamingPrf$HkdfInputStream.read(HkdfStreamingPrf.java:129)
	at com.google.crypto.tink.subtle.prf.PrfImpl.readBytesFromStream(PrfImpl.java:45)
	at com.google.crypto.tink.subtle.prf.PrfImpl.compute(PrfImpl.java:67)
	at com.google.crypto.tink.prf.PrfSet.computePrimary(PrfSet.java:44)
...
```
The fix is simply to swap `allocateDirect` with `allocate` which is heap managed and garbage collected. I've verified the fix does indeed solve the leak.

Closes ##52

PiperOrigin-RevId: 734509742
Change-Id: Id149e7dd44c941cb6c04e7cba80ff6e0234639f5
@behrooz-stripe
Copy link
Contributor Author

behrooz-stripe commented Mar 7, 2025

I guess this doesn't auto close but merged via this commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant