-
Notifications
You must be signed in to change notification settings - Fork 67
SeekableByteChannel that doesn't use a temp file #103
Comments
I think that an implementation such as the one on epam/htsjdk-s3-plugin S3SeekableStream is the way to go. It is similar as what I'm doing in my implementation of an HTTP/S filesystem.... |
I learned about this issue the hard way on a 6GB file just now! IMO throwing |
I did some work to fix this a while back (that I forgot to post at the time). Sorry you suffered from this too @ryan-williams! Here's the branch: https://github.com/tomwhite/Amazon-S3-FileSystem-NIO2/tree/read-seeks It lacks unit tests, but I did test it manually. |
That's great @tomwhite, I got most of the way through my own implementation but yours is cleaner. I patched a few more fixes (to #106 and #108) into it and released Lots of tests fail, as you mentioned, but it seems like a mix of mocks that I don't really care to unbreak, and ITs that aren't configured in a way I know how to run |
The current implementation of S3SeekableByteChannel makes use of a temp file. This has a couple of drawbacks. One of the primary reasons to use a SeekableByteChannel is that you do not want to 'download' an entire (2GB) file if you are only interested in a specific range of bytes within the file.
Why do we have a separate S3SeekableByteChannel next to S3FileChannel? S3FileChannel extends java.nio.channels.FileChannel which in turn implements java.nio.channels.SeekableByteChannel.
Another problem is the use of Files.createTempFile(). This generally creates a file in /tmp (on linux e.g.) which can be seen as a security risk if the file contains sensitive data.
The com.amazonaws.services.s3.model.GetObjectRequest supports a private long[] range; which enables the retrieval of a limited range of bytes. Perhaps this could be used to create a better SeekableByteChannel implementation.
However I can't find any option to write a limited range of bytes to an S3 Object... Perhaps we'll have to wait and see that the 2.0 version of the AWS SDK offers us.
The text was updated successfully, but these errors were encountered: