Commit b74bac8

remote_exec.proto: add blob split and splice API
Depending on the software project, large artifacts may need to be downloaded from or uploaded to the remote CAS, such as executables with debug information, comprehensive libraries, or even whole file system images. Such artifacts generate a lot of traffic when transferred. The blob-split API allows a client to have such an artifact split into chunks on the remote side, to fetch only those chunks that are missing locally, and finally to assemble the requested blob locally from its chunks. The blob-splice API allows a client to split such an artifact into chunks locally, to upload only those chunks that are missing remotely, and finally to have the requested blob spliced remotely from its chunks. Since only the chunks that changed since the last download or upload are transferred, the blob split and splice API can save a lot of network traffic between server and client.
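
The download flow described in this commit message can be sketched as follows. This is a minimal illustration in Python, not part of the commit. It makes several assumptions: the CAS is modelled as a plain dict keyed by `(hash, size)`, `split_blob` is a hypothetical in-process stand-in for the `SplitBlob` RPC, and the Gear-style rolling hash in `chunk` is just one possible content-defined chunking scheme (the API does not prescribe an algorithm).

```python
import hashlib
import random

# Assumption: a CAS is modelled as a dict mapping (sha256 hex, size) -> bytes.
Digest = tuple[str, int]

# Fixed pseudo-random table for the toy Gear-style rolling hash.
_rng = random.Random(0)
_GEAR = [_rng.getrandbits(32) for _ in range(256)]


def digest(data: bytes) -> Digest:
    """Compute a (hash, size_bytes) pair, loosely mirroring a Digest message."""
    return hashlib.sha256(data).hexdigest(), len(data)


def chunk(data: bytes, mask: int = 0x3F, min_size: int = 16) -> list[bytes]:
    """Toy content-defined chunking: cut where the rolling hash matches mask."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + _GEAR[byte]) & 0xFFFFFFFF
        if i + 1 - start >= min_size and (h & mask) == 0:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def split_blob(remote_cas: dict, blob_digest: Digest) -> list[Digest]:
    """Hypothetical stand-in for SplitBlob: chunk the blob, store the chunks."""
    if blob_digest not in remote_cas:
        raise KeyError("NOT_FOUND")
    chunk_digests = []
    for piece in chunk(remote_cas[blob_digest]):
        d = digest(piece)
        remote_cas[d] = piece  # guarantee 1: chunks are stored in the CAS
        chunk_digests.append(d)
    return chunk_digests


def download(remote_cas: dict, local_cas: dict, blob_digest: Digest):
    """Fetch only the locally missing chunks, then assemble the blob."""
    chunk_digests = split_blob(remote_cas, blob_digest)
    missing = {d for d in chunk_digests if d not in local_cas}
    for d in missing:
        local_cas[d] = remote_cas[d]  # stands in for a regular CAS read
    # Guarantee 2: concatenating in list order yields the original blob.
    blob = b"".join(local_cas[d] for d in chunk_digests)
    if digest(blob) != blob_digest:
        raise ValueError("assembled blob does not match requested digest")
    local_cas[blob_digest] = blob
    return blob, len(missing), len(set(chunk_digests))
```

If the local CAS already holds the chunks of an earlier version of the blob, `download` transfers only the chunks touched by the modification. The second and third return values (chunks fetched, distinct chunks in total) make the saving visible.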
1 parent 6c32c3b commit b74bac8

1 file changed

+134
-0
lines changed

build/bazel/remote/execution/v2/remote_execution.proto

@@ -430,6 +430,79 @@ service ContentAddressableStorage {
   rpc GetTree(GetTreeRequest) returns (stream GetTreeResponse) {
     option (google.api.http) = { get: "/v2/{instance_name=**}/blobs/{root_digest.hash}/{root_digest.size_bytes}:getTree" };
   }
+
+  // Split a blob into chunks.
+  //
+  // This splitting API aims to reduce download traffic between client and
+  // server, e.g., if a client needs to fetch a large blob that has just been
+  // modified slightly since the last build. In this case, there is no need to
+  // fetch the entire blob data, but just the binary differences between the two
+  // blob versions, which can be determined by, among other techniques,
+  // content-defined chunking.
+  //
+  // Clients can use this API before downloading a blob to determine which parts
+  // of the blob are already present locally and do not need to be downloaded
+  // again. The server splits the blob into chunks according to a
+  // content-defined chunking algorithm and returns a list of the chunk digests
+  // in the order in which the chunks have to be concatenated to assemble the
+  // requested blob.
+  //
+  // A client can expect the following guarantees from the server if a split
+  // request is answered successfully:
+  //  1. The blob chunks are stored in CAS.
+  //  2. Concatenating the blob chunks in the order of the digest list returned
+  //     by the server results in the original blob.
+  //
+  // The usage of this API is optional for clients but it allows them to
+  // download only the missing parts of a large blob instead of the entire blob
+  // data, which in turn can considerably reduce download network traffic.
+  //
+  // Servers are free to implement this functionality, but they need to declare
+  // whether they support it or not by setting the
+  // [CacheCapabilities.blob_split_support][build.bazel.remote.execution.v2.CacheCapabilities.blob_split_support]
+  // field accordingly.
+  //
+  // Errors:
+  //
+  // * `NOT_FOUND`: The requested blob is not present in the CAS.
+  // * `RESOURCE_EXHAUSTED`: There is insufficient disk quota to store the blob
+  //   chunks.
+  rpc SplitBlob(SplitBlobRequest) returns (SplitBlobResponse) {
+    option (google.api.http) = { get: "/v2/{instance_name=**}/blobs/{blob_digest.hash}/{blob_digest.size_bytes}:splitBlob" };
+  }
+
+  // Splice a blob from chunks.
+  //
+  // This is the complementary operation to the
+  // [ContentAddressableStorage.SplitBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SplitBlob]
+  // function to handle the split upload of large blobs to save upload
+  // traffic.
+  //
+  // If a client needs to upload a large blob and is able to split a blob into
+  // chunks locally according to some content-defined chunking algorithm, it can
+  // first determine which parts of the blob are already available in the remote
+  // CAS and upload the missing chunks, and then use this API to instruct the
+  // server to splice the original blob from the remotely available blob chunks.
+  //
+  // The usage of this API is optional for clients but it allows them to upload
+  // only the missing parts of a large blob instead of the entire blob data,
+  // which in turn can considerably reduce upload network traffic.
+  //
+  // Servers are free to implement this functionality, but they need to declare
+  // whether they support it or not by setting the
+  // [CacheCapabilities.blob_splice_support][build.bazel.remote.execution.v2.CacheCapabilities.blob_splice_support]
+  // field accordingly.
+  //
+  // Errors:
+  //
+  // * `NOT_FOUND`: At least one of the blob chunks is not present in the CAS.
+  // * `RESOURCE_EXHAUSTED`: There is insufficient disk quota to store the
+  //   spliced blob.
+  // * `INVALID_ARGUMENT`: The digest of the spliced blob is different from the
+  //   provided expected digest.
+  rpc SpliceBlob(SpliceBlobRequest) returns (SpliceBlobResponse) {
+    option (google.api.http) = { post: "/v2/{instance_name=**}/blobs:spliceBlob" body: "*" };
+  }
 }

 // The Capabilities service may be used by remote execution clients to query
@@ -1778,6 +1851,53 @@ message GetTreeResponse {
   string next_page_token = 2;
 }

+// A request message for
+// [ContentAddressableStorage.SplitBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SplitBlob].
+message SplitBlobRequest {
+  // The instance of the execution system to operate against. A server may
+  // support multiple instances of the execution system (with their own workers,
+  // storage, caches, etc.). The server MAY require use of this field to select
+  // between them in an implementation-defined fashion, otherwise it can be
+  // omitted.
+  string instance_name = 1;
+
+  // The digest of the blob to be split.
+  Digest blob_digest = 2;
+}
+
+// A response message for
+// [ContentAddressableStorage.SplitBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SplitBlob].
+message SplitBlobResponse {
+  // The ordered list of digests of the chunks into which the blob was split.
+  // The original blob is assembled by concatenating the chunk data according to
+  // the order of the digests given by this list.
+  repeated Digest chunk_digests = 1;
+}
+
+// A request message for
+// [ContentAddressableStorage.SpliceBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SpliceBlob].
+message SpliceBlobRequest {
+  // The instance of the execution system to operate against. A server may
+  // support multiple instances of the execution system (with their own workers,
+  // storage, caches, etc.). The server MAY require use of this field to select
+  // between them in an implementation-defined fashion, otherwise it can be
+  // omitted.
+  string instance_name = 1;
+
+  // Expected digest of the spliced blob.
+  Digest blob_digest = 2;
+
+  // The ordered list of digests of the chunks which need to be concatenated to
+  // assemble the original blob.
+  repeated Digest chunk_digests = 3;
+}
+
+// A response message for
+// [ContentAddressableStorage.SpliceBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SpliceBlob].
+message SpliceBlobResponse {
+  // Intentionally empty for now, but might need to be extended in future.
+}
+
 // A request message for
 // [Capabilities.GetCapabilities][build.bazel.remote.execution.v2.Capabilities.GetCapabilities].
 message GetCapabilitiesRequest {
@@ -1997,6 +2117,20 @@ message CacheCapabilities {
   // [BatchUpdateBlobs][build.bazel.remote.execution.v2.ContentAddressableStorage.BatchUpdateBlobs]
   // requests.
   repeated Compressor.Value supported_batch_update_compressors = 7;
+
+  // Whether blob splitting is supported for the particular server/instance. If
+  // yes, the server/instance implements the specified behavior for blob
+  // splitting and a meaningful result can be expected from the
+  // [ContentAddressableStorage.SplitBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SplitBlob]
+  // operation.
+  bool blob_split_support = 8;
+
+  // Whether blob splicing is supported for the particular server/instance. If
+  // yes, the server/instance implements the specified behavior for blob
+  // splicing and a meaningful result can be expected from the
+  // [ContentAddressableStorage.SpliceBlob][build.bazel.remote.execution.v2.ContentAddressableStorage.SpliceBlob]
+  // operation.
+  bool blob_splice_support = 9;
 }

 // Capabilities of the remote execution system.
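
The `SpliceBlob` contract documented in the diff above (the `NOT_FOUND` and `INVALID_ARGUMENT` cases, plus the client-side upload flow) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the CAS is a plain dict keyed by `(hash, size)`, `splice_blob` and `upload` are hypothetical in-process helpers rather than generated gRPC stubs, the two exception classes stand in for gRPC status codes, and `RESOURCE_EXHAUSTED` is not modelled. A real client would presumably also check `blob_splice_support` in the server's capabilities first.

```python
import hashlib

# Assumption: a CAS is modelled as a dict mapping (sha256 hex, size) -> bytes.
Digest = tuple[str, int]


class NotFound(Exception):
    """Stands in for the NOT_FOUND status."""


class InvalidArgument(Exception):
    """Stands in for the INVALID_ARGUMENT status."""


def digest(data: bytes) -> Digest:
    """Compute a (hash, size_bytes) pair, loosely mirroring a Digest message."""
    return hashlib.sha256(data).hexdigest(), len(data)


def splice_blob(cas: dict, blob_digest: Digest,
                chunk_digests: list[Digest]) -> None:
    """Hypothetical server-side handler mirroring the documented errors."""
    if any(d not in cas for d in chunk_digests):
        raise NotFound("at least one chunk is not present in the CAS")
    # Concatenate the chunks in the order given by the request.
    blob = b"".join(cas[d] for d in chunk_digests)
    if digest(blob) != blob_digest:
        raise InvalidArgument("spliced blob differs from the expected digest")
    cas[blob_digest] = blob


def upload(remote_cas: dict, chunks: list[bytes]) -> int:
    """Client side: upload missing chunks, then ask the server to splice."""
    blob_digest = digest(b"".join(chunks))
    chunk_digests = [digest(c) for c in chunks]
    uploaded = 0
    for d, c in zip(chunk_digests, chunks):
        if d not in remote_cas:   # stands in for a FindMissingBlobs query
            remote_cas[d] = c     # stands in for a regular CAS upload
            uploaded += 1
    splice_blob(remote_cas, blob_digest, chunk_digests)
    return uploaded
```

The server checks the digest after concatenation, so a client that sends the chunk digests in the wrong order gets an error instead of silently storing a corrupt blob under the expected digest.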
