-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support creating smaller eStargz images (--estargz-external-toc
and --estargz-min-chunk-size
)
#956
Conversation
58bbb4f
to
3dc794f
Compare
docs/smaller-estargz.md
Outdated
stargz on /var/lib/containerd-stargz-grpc/snapshotter/snapshots/1/fs type fuse.rawBridge (rw,nodev,relatime,user_id=0,group_id=0,allow_other) | ||
``` | ||
|
||
> NOTE: This flag creates an eStargz image with newly-added `innerOffset` funtionality of eStargz. Older version of Stargz Snapshotter cannot perform lazy pulling for the images created with this flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/Older version/Versions before vX.Y.Z/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
estargz/estargz.go
Outdated
ent.Name = cleanEntryName(ent.Name) | ||
if ent.Type == "reg" || ent.Type == "chunk" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to use switch{}
, but can be another PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed to use switch
.
estargz/estargz.go
Outdated
if in, err := io.CopyN(io.Discard, dr, e.InnerOffset-nr); err != nil || in != e.InnerOffset-nr { | ||
return 0, fmt.Errorf("discard of remaining %d bytes != %v, %v", e.InnerOffset-nr, in, err) | ||
} | ||
nr += e.InnerOffset - nr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean nr = nr + e.InnerOffset - nr
, i.e., nr = e.InnterOffset
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Refactored this to use nr = e.InnterOffset
.
estargz/estargz.go
Outdated
|
||
// MinChunkSize optionally controls the minimum number of bytes | ||
// of data must be written in one gzip stream before a new gzip | ||
// NOTE: This adds a TOC property that old reader doesn't understand. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/old/prior to vX.Y.Z/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
estargz/externaltoc/externaltoc.go
Outdated
subfield := "STARGZEXTERNALTOC" // len("STARGZEXTERNALTOC") = 17 | ||
binary.LittleEndian.PutUint16(header[2:4], uint16(len(subfield))) // little-endian per RFC1952 | ||
gz.Header.Extra = append(header, []byte(subfield)...) | ||
gz.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
catch err
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
metadata/testutil/testutil.go
Outdated
"zstd-fastest": tutil.ZstdCompressionWithLevel(zstd.SpeedFastest), | ||
"zstd-default": tutil.ZstdCompressionWithLevel(zstd.SpeedDefault), | ||
"zstd-bettercompression": tutil.ZstdCompressionWithLevel(zstd.SpeedBetterCompression), | ||
"gzip-nocompression": tutil.GzipCompressionWithLevel(gzip.NoCompression), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe:
"gzip-nocompression": tutil.GzipCompressionWithLevel(gzip.NoCompression), | |
"gzip-no-compression": tutil.GzipCompressionWithLevel(gzip.NoCompression), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
util/testutil/util.go
Outdated
|
||
var runes = []rune("1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ") | ||
|
||
func RandomContents(n int) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func RandomContents(n int) string { | |
func RandomString(n int) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
b := make([]rune, n) | ||
for i := range b { | ||
b[i] = runes[rand.Intn(len(runes))] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be more efficient https://pkg.go.dev/crypto/rand#Read
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Maybe use base64 for asciization)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the suggestion. Fixed.
util/testutil/util.go
Outdated
return string(b) | ||
} | ||
|
||
func ViewContents(c []byte) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a helper function to get a omitted string from a long string for printing.
But the implementation looked wrong, so fixed.
cmd/ctr-remote/commands/convert.go
Outdated
Usage: "Separate TOC JSON into another image (called \"TOC image\"). The name of TOC image is the original + \"-esgztoc\" suffix. Both eStargz and the TOC image should be pushed to the same registry. stargz-snapshotter refers to the TOC image when it pulls the result eStargz image.", | ||
}, | ||
cli.BoolFlag{ | ||
Name: "estargz-lossless", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Lossless images" may sound like graphic images as in PNG images 😆
estargz-keep-diff-id
might be less confusing, but no strong opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the suggestion. Renamed to flag to estargz-keep-diff-id
.
## eStargz image with an external TOC | ||
|
||
eStargz supports separating TOC into another image called *TOC image*. | ||
This type of eStargz is the same as the normal eStargz but doesn't contain TOC JSON file (`stargz.index.json`) in the layer blob and has a special footer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to explain the motivation of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the motivation.
docs/smaller-estargz.md
Outdated
Pull it lazily: | ||
|
||
```console | ||
# ctr-remote i rpull --plain-http registry2:5000/ubuntu:22.04-ex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo:
# ctr-remote i rpull --plain-http registry2:5000/ubuntu:22.04-ex | |
# ctr-remote i rpull --plain-http registry2:5000/ubuntu:22.04-chunk50000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
3dc794f
to
3eca60a
Compare
util/testutil/util.go
Outdated
if _, err := rand.Read(b); err != nil { | ||
t.Fatalf("failed rand.Read: %v", err) | ||
} | ||
return string(b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be non-ascii
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review. I think it's fine that they're non-ascii. Fixed the function name to RandomBytes
.
3eca60a
to
ff81952
Compare
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
ff81952
to
f904915
Compare
This commit allows users optionally creating smaller eStargz images using the following flags to
ctr-remote i convert
orctr-remote i optimize
:--estargz-external-toc
: Separates TOC JSON into another image (called "TOC image"). The result eStargz doesn't contain TOC so we can expect a smaller size than normal eStargz.--estargz-min-chunk-size
: Specifies the minimal number of bytes of data must be written in one gzip stream. If it's > 0, multiple files and chunks can be written into one gzip stream. Smaller number of gzip header and smaller size of the result blob can be expected.About
--estargz-external-toc
eStargz supports separating TOC into an external image called TOC image. This type of eStargz is the same as the normal eStargz but doesn't contain TOC JSON file (
stargz.index.json
) in the layer blob and has a special footer.TOC image is an OCI image containing TOC. Each layer contains a TOC JSON file (
stargz.index.json
) in the root directory. Layer descriptors in the manifest of the TOC image must contain an annotationcontainerd.io/snapshot/stargz/layer.digest
. The value of this is the digest of the eStargz layer corresponding to that TOC.Stargz snapshotter uses this annotation for searching TOC JSON of the mounting eStargz layer. Currently, it assumes the TOC image has the reference name same as the eStargz with
-esgztoc
suffix. For example, if an eStargz image is namedghcr.io/stargz-containers/ubuntu:22.04-esgz
, stargz snapshotter acquires the TOC image fromghcr.io/stargz-containers/ubuntu:22.04-esgz-esgztoc
. Note that future versions of stargz snapshotter will support more ways to search the TOC image (e.g. allowing custom suffix, using OCI Reference Type, etc.)Usage
convert:
Layers in eStargz (
registry2:5000/ubuntu:22.04-ex
) don't contain TOC JSON. TOC image (registry2:5000/ubuntu:22.04-ex-esgztoc
) contains TOC of all layers of the eStargz image.Push them to the same registry:
Pull it lazily:
Optional
--estargz-lossless
flag for lossless conversionctr-remote i convert
supports an optional flag--estargz-lossless
specified with--estargz-external-toc
. This converts an image to eStargz without changing the diffID (uncompressed digest) so even eStargz-agnostic gzip decompressor (e.g. gunzip) can restore the original tar blob.--estargz-record-in
can't be used with this flag.About
--estargz-min-chunk-size
This option allows writing multiple small files into a single gzip stream. This can hopefully reduce the number of gzip headers and the size of the result blob.
To implement this, we need to introduce a new field
innerOffset
to TOC.This field allows the following structure.
Unfortunatelly older version of stargz-snapshotter can't correctly understand the image created with
innerOffset
and can show a broken view of the filesystem.Usage
conversion:
Pull it lazily:
Comparison
Comparison of image size and extraction time of a single file
This compares the size of images among different configurations. This also compares the avarage time to take for extracting a single file from the result blob. Note that we use the highest compression level of gzip (
gzip.BestCompression
) for eStargz by default. TOC image isn't included in the image size.kdeneon/plasma:latest
python:3.9-org
ubuntu:22.04
Image can be smaller using larger number to min-chunk-size and/or using exernal TOC. It's even possible to make eStargz smaller than the original image.
We need to be careful about that when we specify larger number to min-chunk-size, the time for extracting a file can be larger as well. One of the possible reasons is that files in one gzip stream isn't seekable so when we extract a file from the blob we also need to extract the neighboring files in the same stream.
To mitigate this performance drawback of min-chunk-size, stargz-snapshotter introduces a logic to aggressively and automatically cache files in a stream. When stargz-snapshotter extracts a file from one gzip stream and that stream contains multiple files, it automatically caches all of the neighboring files for speeding up accessing them in the future.
Comparison of file read throughput (
tar
-ing the rootfs of the container)This compares time to take
tar
-ing rootfs of the container running on stargz-snapshotter's FUSE.Command:
/bin/bash -c 'time ( tar --exclude=/sys --exclude=/proc --exclude=/dev -cf - / | cat > /dev/null )'
(average of 3 times)
No performance drawback has been observed for min-chunk-size-enabled images.
HelloBench
No performance drawback has been observed for min-chunk-size-enabled images.
TODOs
-esgztoc
is hardcoded)