-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(storage): record first_key in block_index_builder #598
Conversation
aa51882
to
bf661d9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, first_key is only useful when the column is fully sorted. So we can have a parameter on column builder whether to store the first_key or not.
@@ -83,4 +87,8 @@ impl BlockIndexBuilder { | |||
pub fn into_index(self) -> Vec<BlockIndex> { | |||
self.indexes | |||
} | |||
|
|||
pub fn set_first_key(&mut self, first_key: Vec<u8>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer having the signature as
pub fn set_first_key(&mut self, first_key: &[u8]) {
self.first_key.clear();
self.first_key.extend(first_key);
}
So as to reduce allocation as much as possible.
By the way, it seems that first_key
will only be used on finish_block
. Is it possible to make it a parameter of finish_block
instead of having a separate function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, first_key is only useful when the column is fully sorted. So we can have a parameter on column builder whether to store the first_key or not.
Agree. But I am curious about where can we know that column need sort or not? Maybe pass from ColumnBuilderOptions
later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer having the signature as
pub fn set_first_key(&mut self, first_key: &[u8]) { self.first_key.clear(); self.first_key.extend(first_key); }
So as to reduce allocation as much as possible.
By the way, it seems that
first_key
will only be used onfinish_block
. Is it possible to make it a parameter offinish_block
instead of having a separate function?
This means that we need to modify PrimitiveColumnBuilder<T>::finish_builder()
interface, that's ok...
@@ -121,7 +121,7 @@ pub fn append_one_by_one<'a, A: Array>( | |||
impl<T: PrimitiveFixedWidthEncode> ColumnBuilder<T::ArrayType> for PrimitiveColumnBuilder<T> { | |||
fn append(&mut self, array: &T::ArrayType) { | |||
let mut iter = array.iter().peekable(); | |||
|
|||
let mut first_flag = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first_flag should be an inner state of ColumnBuilder
instead of a local variable. One append doesn't necessarily produce a block.
I think a better way is to store the first_key in the struct on switching builder (when self.current_builder.is_none()
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first_flag should be an inner state of
ColumnBuilder
instead of a local variable. One append doesn't necessarily produce a block.I think a better way is to store the first_key in the struct on switching builder (when
self.current_builder.is_none()
)?
Yes, the condition self.current_builder.is_none()
can be reused.
By the way, should we apply this feature to |
Yes, indeed. But we can work on primitive types first, and add support for other columns later. |
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
bf661d9
to
5301767
Compare
Hi, now we can use a new option |
I think |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM, but there're still some misc fixes to do. Thanks for your contribution!
) { | ||
self.indexes.push(BlockIndex { | ||
offset: column_data.len() as u64, | ||
length: block_data.len() as u64 + BLOCK_HEADER_SIZE as u64, | ||
first_rowid: self.last_row_count as u32, | ||
row_count: (self.row_count - self.last_row_count) as u32, | ||
/// TODO(chi): support sort key | ||
first_key: "".into(), | ||
first_key: first_key.drain(..).collect(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to get a full ownership of first_key
in this function. Otherwise this interface would look confusing 🤣
&mut self.data, | ||
&mut block_data, | ||
stats, | ||
&mut self.first_key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's drain here: self.first_key.drain(..).collect_vec()
@@ -80,6 +89,11 @@ impl ColumnBuilder<BlobArray> for BlobColumnBuilder { | |||
PlainBlobBlockBuilder::new(self.options.target_block_size - 16), | |||
)); | |||
} | |||
if let Some(to_be_appended) = iter.peek() { | |||
if self.options.record_first_key && to_be_appended.is_some() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None
is also a valid first_key. I think we can decide on encoding of None keys.
Maybe we can use \0
to represent null keys, and \1
+ BlobEncode to represent non-null keys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also for the encoding of the key, we can use utilities provided in https://github.com/risinglightdb/risinglight/blob/main/src/storage/secondary/encode.rs
For blob types, simply call to_be_appended.to_byte_slice().to_vec()
, and store it in first_key
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also assert!(self.first_key.is_empty())
here, for an extra layer of sanity check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None
is also a valid first_key. I think we can decide on encoding of None keys.Maybe we can use
\0
to represent null keys, and\1
+ BlobEncode to represent non-null keys?
So we use \1
for CharColumnBuilder
too, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we use this as a temporary encoding for nullable elements, for both char, blob and primitive types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we can have two fields: is_first_key_null
and first_key
in proto. Maybe this could be more intuitive? If first_key is null, first_key field is set to empty, and is_first_key_null
is set to true?
By the way, please have some basic unit tests, at least for primitive type builders 🥰 |
ef44350
to
107f9da
Compare
Hi, I have followed your suggestions to add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still some small issues. Thanks again for your contribution!
if self.options.record_first_key && to_be_appended.is_some() { | ||
self.is_first_key_null = false; | ||
self.first_key = to_be_appended.unwrap().as_bytes().to_vec(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that once is_first_key_null
is set to false
, it will never be set back to true
?
We can define self.first_key
as Option<Vec<u8>>
, so as to make the logic clearer.
if self.options.record_first_key {
self.first_key = to_be_appended.map(|x| x.as_bytes().to_vec());
}
And we can modify finish_block
to accept first_key: Option<Vec<u8>>
. Inside the block builder, we match first_key
. If null, set is_first_key_null
to true. Otherwise, set first_key
to the inner value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still some small issues. Thanks again for your contribution!
Thanks also for your review too! 😭
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that once
is_first_key_null
is set tofalse
, it will never be set back totrue
?We can define
self.first_key
asOption<Vec<u8>>
, so as to make the logic clearer.if self.options.record_first_key { self.first_key = to_be_appended.map(|x| x.as_bytes().to_vec()); }And we can modify
finish_block
to acceptfirst_key: Option<Vec<u8>>
. Inside the block builder, we matchfirst_key
. If null, setis_first_key_null
to true. Otherwise, setfirst_key
to the inner value.
Maybe finish_block()
will be like this?
pub fn finish_block(
&mut self,
block_type: BlockType,
column_data: &mut Vec<u8>,
block_data: &mut Vec<u8>,
stats: Vec<BlockStatistics>,
first_key: &mut Option<Vec<u8>>,
) {
self.indexes.push(BlockIndex {
offset: column_data.len() as u64,
length: block_data.len() as u64 + BLOCK_HEADER_SIZE as u64,
first_rowid: self.last_row_count as u32,
row_count: (self.row_count - self.last_row_count) as u32,
/// TODO(chi): support sort key
first_key: match first_key {
Some(x) => x.drain(..).collect(),
None => Default::default(),
},
stats,
is_first_key_null: first_key.is_none(),
});
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly! And I’d prefer to have Option<Vec<u8>>
as parameter instead of &mut. As block is typically large, this re-allocation is acceptable.
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
107f9da
to
8bcc46e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, good work!
Thanks a lot! @skyzh |
This patch want to complete #589 first step, which record first_key in order to support seek.
Signed-off-by: ZheNing Hu adlternative@gmail.com