Commit 300f8b3

address comment
Signed-off-by: Connor1996 <zbk602423539@gmail.com>
1 parent 8e6e698 commit 300f8b3

1 file changed, 10 additions and 10 deletions

text/2018-10-25-batch-split.md

@@ -1,10 +1,10 @@
 # Summary

-Support `BatchSplit` feature that split one Region into multiple Regions at a time if the size or the number of keys exceeds a threshold. This includes modifications of both TiKV and PD. For TiKV, every round of split-check produces multiple split keys instead of one and change inner split related interface into batch style. For PD, add RPC `AskBatchSplit` and `ReportBatchSplit` to permit TiKV asking for `region_id` and `peer_id` in batch.
+Support `BatchSplit` feature that splits one Region into multiple Regions at a time if the size or the number of keys exceeds a threshold. This includes modifications of both TiKV and PD. For TiKV, every round of split-check produces multiple split keys instead of one and changes inner split related interface into batch style. For PD, add RPC `AskBatchSplit` and `ReportBatchSplit` to allow TiKV to ask for `region_id` and `peer_id` in batch.

 # Motivation

-Current split only split one Region at a time. It may be very slow when sequential write is too fast, namely, split speed can not keep up with write speed. Slow split can lead to large region. In this case, if a snapshot is triggered, it will occupy a lot of IO and make everything slow. Also, large region is hard for scheduling hotspot, so it makes performance even worse.
+Current split only splits one Region at a time. It may be very slow when sequential write is too fast, namely, split speed can not keep up with write speed. Slow split can lead to large region. In this case, if a snapshot is triggered, it will occupy a lot of IO and make everything slow. Also, it is hard to schedule hotspots for a large Region, so it makes performance even worse.

 # Detailed design

@@ -54,9 +54,9 @@ message ReportBatchSplitResponse {
 }
 ```

-Add `AskBatchSplit` to replace `AskSplit` , it is called when TiKV produces some split keys for one Region and asks PD to allocate new `region_id` and `peer_id` for that Region. `split_count` in `AskBatchSplitRequest` indicates the number of Region to be generated, and `AskBatchSplitResponse` returns all new allocated IDs to TiKV.
+Add `AskBatchSplit` to replace `AskSplit`. It is called when TiKV produces some split keys for one Region and asks PD to allocate new `region_id` and `peer_id` for that Region. `split_count` in `AskBatchSplitRequest` indicates the number of Regions to be generated, and `AskBatchSplitResponse` returns all new allocated IDs to TiKV.

-Add `ReportBatchSplit` to replace `ReportBatchSplit`, it is called when TiKV finish splitting Region. `ReportBatchSplitRequest` takes all metas of new generated Region for PD to update PD's related information.
+Add `ReportBatchSplit` to replace `ReportBatchSplit`. It is called when TiKV finishes splitting a Region. `ReportBatchSplitRequest` takes all metas of a new generated Region for PD to update PD's related information.

 For compatibility issue, the old interface is not deleted but set to deprecated.

@@ -65,7 +65,7 @@ For compatibility issue, the old interface is not deleted but set to deprecated.
 ```protobuf
 message SplitRequest {
 // ...
-// Will be ignored in batch split, use `BatchSplitRequest::right_derive` instead.
+// Will be ignored in batch split. Use `BatchSplitRequest::right_derive` instead.
 bool right_derive = 4 [deprecated=true];
 }
@@ -133,7 +133,7 @@ Now we have four split-checkers: half, keys, size and table. SizeChecker and Key
 The general logic of SizeChecker and KeysChecker are similiar, the only difference between them is one splits Region based on size and the other splits Region based on the number of keys. So here we mainly describe the logic of SizeChecker:

 - before: it scans key-value pairs in a Region's range sequentially to accumlate their size as `total_size` and stops once the size reachs to `region_max_size` or scans to the end of range. If `total_size` is smaller than `region_max_size` at the end, checker wouldn't produce any split key; if not, it regards the very key at which `total_size` reachs to `region_split_size` as split key.
-- after: it scans key-value pairs in a Region's range sequentially to accumlate their size as `total_size` and stops once the size reachs to `region_split_size * (batch_split_limit-1) + region_max_size` or scans to the end of range. During the scan process, it reocrds the key as split key every `region_split_size`, but after finishing scanning, it may discards the last split key if the size of rest Region doesn't over `region_max_size - region_split_size`. With this algorithm, if `batch_split_limit` is set to 1, TiKV can perfectly behave as before that split without batch.
+- after: it scans key-value pairs in a Region's range sequentially to accumulate their size as `total_size` and stops once the size reaches `region_split_size * (batch_split_limit-1) + region_max_size` or scans to the end of the range. During the scan process, it records the key as split key every `region_split_size`, but after finishing scanning, it may discard the last split key if the size of rest Region is not bigger than `region_max_size - region_split_size`. With this algorithm, if `batch_split_limit` is set to 1, TiKV can perfectly behave the same way as the split without batch.

 ### Compatibility concern

@@ -146,19 +146,19 @@ enum ErrorType {
 }
 ```

-So once TiKV gets `AskBatchSplitResponse` with `ErrorType::INCOMPATIBLE_VERSION`, it uses original `AskSplit` instead of `AskBatchSplit`, and all following processes will degrade to original way. So original code path is not deleted.
+So once TiKV gets `AskBatchSplitResponse` with `ErrorType::INCOMPATIBLE_VERSION`, it uses the original `AskSplit` instead of `AskBatchSplit`, and all the following processes will degrade to the original way. So the original code path is not deleted.

 ### Approximate split key

-What we said above can ease the problem, however scanning a large Region can also consume a lot of time and CPU. Test shows that large Region can still easily show up even with batch split implemented, although split is speeded up.
+What we said above can ease the problem. However, scanning a large Region can also consume a lot of time and CPU. The test shows that large Regions can still easily show up even with batch split implemented, although split is speeded up.

 When a Region becomes large enough, it's more practical to divide it into smaller chunks quickly. This can be achieved via size estimation, which can be calculated from SST properties. Although it may not be accurate enough, it's okay for a large Region.

 So if the size of Region is larger than `region_max_size * batch_split_limit * 2`, TiKV uses approximate way to produce split keys. The approximate way is quite similar to the algorithm we describe above, but to estimate TiKV uses approximate size of the Region and the number of keys in the Region's range to calculate the average distance between two SST property keys, and produces a split key every `region_split_size / distance` keys.

 # Drawbacks

-- When use approximate way, Region may split into several disproportional Regions due to size estimation.
+- When the approximate way is used, Region may split into several disproportional Regions due to size estimation.

 # Alternatives

@@ -167,4 +167,4 @@ None

 # Unresolved questions

-A large Region is usually more emergent to be split, so we can change the split check queue from a naive FIFO queue to a priority queue so that large Region can be split early and quickly.
+Generally, it is more urgent to split a large Region, so we can change the split check queue from a naive FIFO queue to a priority queue so that a large Region can be split early and quickly.
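
For readers of this diff, the "after" behavior of SizeChecker described in the hunk at line 136 can be sketched roughly as follows. This is a minimal illustration in Python, not TiKV's actual code (TiKV is written in Rust). The function name `batch_split_keys` and the `(key, size)` entry shape are hypothetical. Two details are assumptions rather than statements from the RFC text: the number of recorded split keys is capped at `batch_split_limit`, and the last split key is discarded only when the remainder is strictly smaller than `region_max_size - region_split_size`. The strict comparison is the reading under which `batch_split_limit = 1` reproduces the "before" behavior exactly.

```python
from typing import Iterable, List, Tuple


def batch_split_keys(
    entries: Iterable[Tuple[str, int]],
    region_split_size: int,
    region_max_size: int,
    batch_split_limit: int,
) -> List[str]:
    """Return split keys for one Region, scanning (key, size) pairs in key order."""
    # Stop scanning once this much data has been seen (formula from the RFC text).
    scan_limit = region_split_size * (batch_split_limit - 1) + region_max_size

    split_keys: List[str] = []
    total_size = 0    # size of everything scanned so far
    current_size = 0  # size accumulated since the last recorded split key

    for key, size in entries:
        total_size += size
        current_size += size
        # Record a split key every `region_split_size`.
        # Assumption: never record more than `batch_split_limit` keys.
        if current_size >= region_split_size and len(split_keys) < batch_split_limit:
            split_keys.append(key)
            current_size = 0
        if total_size >= scan_limit:
            break

    # Discard the last split key when the data after it is too small to be
    # worth a Region of its own (assumption: strict comparison, see lead-in).
    if split_keys and current_size + region_split_size < region_max_size:
        split_keys.pop()
    return split_keys


def make_entries(n: int) -> List[Tuple[str, int]]:
    """Hypothetical test data: n keys, each of size 1."""
    return [(f"k{i:03d}", 1) for i in range(n)]


# With batch_split_limit = 1 the sketch matches the "before" description:
print(batch_split_keys(make_entries(5), 4, 6, 1))    # [] (total size below region_max_size)
print(batch_split_keys(make_entries(10), 4, 6, 1))   # ['k003']
# With batch_split_limit = 3 one round yields several split keys:
print(batch_split_keys(make_entries(100), 4, 6, 3))  # ['k003', 'k007', 'k011']
```

The sizes here are unitless small integers so that the arithmetic is easy to follow. In a real deployment `region_split_size` and `region_max_size` are configured in bytes.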
