-
Notifications
You must be signed in to change notification settings - Fork 410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refine logical op in TiFlash #9523
Conversation
/test pull-integration-test |
for (size_t i = 0; i < n; i++) | ||
vec_res_not_null[i] |= Impl::resNotNull(data[i], false); | ||
} | ||
AssociativeOperationImpl<Impl, 6>::execute(not_null_uint8_columns, vec_res); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep previous comments to explain why uses 6 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
converted_columns.emplace_back(std::move(converted_column)); | ||
while (nullable_uint8_columns.size() > 1) | ||
{ | ||
NullableAssociativeOperationImpl<Impl, 6>::execute( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Columns converted_columns; | ||
|
||
for (size_t index = 0; index < in.size(); index++) | ||
void handleMultipleInputColumns( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better add one full explanation for all the parameters, and describe the differences for other two handleXXXX methods if any.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
} | ||
|
||
void convertAllInputToUInt8( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better add comments to decribe the relations among output columnPtrs(not_null_uint8_columns, nullable_uint8_columns, converted_columns)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: xufei <xufeixw@mail.ustc.edu.cn>
558749d
to
6846ce6
Compare
/test pull-integration-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
{ | ||
while (nullable_uint8_columns.size() > 1) | ||
{ | ||
// micro benchmark shows 4 << 6, 8 outperforms 6 slightly, and performance may decline when set to 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting results. Would you be willing to add some concrete numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is one of the benchmark results, block row size is 10000
Run on (72 X 3300 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x36)
L1 Instruction 32 KiB (x36)
L2 Unified 1024 KiB (x36)
L3 Unified 25344 KiB (x2)
Load Average: 14.19, 14.67, 14.33
LogicalOpBench/optLogicalMultiParam_And6_not_null10/iterations:1000 30615 ns 30481 ns 1000
LogicalOpBench/optLogicalMultiParam_And6_not_null8/iterations:1000 30045 ns 29925 ns 1000
LogicalOpBench/optLogicalMultiParam_And6_not_null6/iterations:1000 34859 ns 34705 ns 1000
LogicalOpBench/optLogicalMultiParam_And6_not_null4/iterations:1000 48457 ns 48274 ns 1000
LogicalOpBench/optLogicalMultiParam_And7_not_null10/iterations:1000 46847 ns 46678 ns 1000
LogicalOpBench/optLogicalMultiParam_And7_not_null8/iterations:1000 49235 ns 49053 ns 1000
LogicalOpBench/optLogicalMultiParam_And7_not_null6/iterations:1000 44633 ns 44456 ns 1000
LogicalOpBench/optLogicalMultiParam_And7_not_null4/iterations:1000 72414 ns 72134 ns 1000
LogicalOpBench/optLogicalMultiParam_And8_not_null10/iterations:1000 64318 ns 64091 ns 1000
LogicalOpBench/optLogicalMultiParam_And8_not_null8/iterations:1000 61097 ns 60847 ns 1000
LogicalOpBench/optLogicalMultiParam_And8_not_null6/iterations:1000 62362 ns 62137 ns 1000
LogicalOpBench/optLogicalMultiParam_And8_not_null4/iterations:1000 84139 ns 83801 ns 1000
LogicalOpBench/optLogicalMultiParam_And9_not_null10/iterations:1000 70640 ns 70362 ns 1000
LogicalOpBench/optLogicalMultiParam_And9_not_null8/iterations:1000 77622 ns 77322 ns 1000
LogicalOpBench/optLogicalMultiParam_And9_not_null6/iterations:1000 88216 ns 87870 ns 1000
LogicalOpBench/optLogicalMultiParam_And9_not_null4/iterations:1000 98208 ns 97835 ns 1000
LogicalOpBench/optLogicalMultiParam_And10_not_null10/iterations:1000 79753 ns 79401 ns 1000
LogicalOpBench/optLogicalMultiParam_And10_not_null8/iterations:1000 97168 ns 96756 ns 1000
LogicalOpBench/optLogicalMultiParam_And10_not_null6/iterations:1000 111726 ns 111300 ns 1000
LogicalOpBench/optLogicalMultiParam_And10_not_null4/iterations:1000 114488 ns 114067 ns 1000
LogicalOpBench/optLogicalMultiParam_And11_not_null10/iterations:1000 91365 ns 91036 ns 1000
LogicalOpBench/optLogicalMultiParam_And11_not_null8/iterations:1000 109042 ns 108619 ns 1000
LogicalOpBench/optLogicalMultiParam_And11_not_null6/iterations:1000 124121 ns 123599 ns 1000
LogicalOpBench/optLogicalMultiParam_And11_not_null4/iterations:1000 127939 ns 127453 ns 1000
LogicalOpBench/optLogicalMultiParam_And6_nullable10/iterations:1000 98956 ns 98600 ns 1000
LogicalOpBench/optLogicalMultiParam_And6_nullable8/iterations:1000 101464 ns 101110 ns 1000
LogicalOpBench/optLogicalMultiParam_And6_nullable6/iterations:1000 110970 ns 110589 ns 1000
LogicalOpBench/optLogicalMultiParam_And6_nullable4/iterations:1000 112874 ns 112427 ns 1000
LogicalOpBench/optLogicalMultiParam_And7_nullable10/iterations:1000 109919 ns 109538 ns 1000
LogicalOpBench/optLogicalMultiParam_And7_nullable8/iterations:1000 109301 ns 108919 ns 1000
LogicalOpBench/optLogicalMultiParam_And7_nullable6/iterations:1000 123537 ns 123085 ns 1000
LogicalOpBench/optLogicalMultiParam_And7_nullable4/iterations:1000 135217 ns 134737 ns 1000
LogicalOpBench/optLogicalMultiParam_And8_nullable10/iterations:1000 123250 ns 122759 ns 1000
LogicalOpBench/optLogicalMultiParam_And8_nullable8/iterations:1000 136376 ns 135838 ns 1000
LogicalOpBench/optLogicalMultiParam_And8_nullable6/iterations:1000 140589 ns 140092 ns 1000
LogicalOpBench/optLogicalMultiParam_And8_nullable4/iterations:1000 148977 ns 148454 ns 1000
LogicalOpBench/optLogicalMultiParam_And9_nullable10/iterations:1000 141426 ns 140945 ns 1000
LogicalOpBench/optLogicalMultiParam_And9_nullable8/iterations:1000 148438 ns 147214 ns 1000
LogicalOpBench/optLogicalMultiParam_And9_nullable6/iterations:1000 207442 ns 206634 ns 1000
LogicalOpBench/optLogicalMultiParam_And9_nullable4/iterations:1000 173037 ns 172412 ns 1000
LogicalOpBench/optLogicalMultiParam_And10_nullable10/iterations:1000 164605 ns 164018 ns 1000
LogicalOpBench/optLogicalMultiParam_And10_nullable8/iterations:1000 168543 ns 167935 ns 1000
LogicalOpBench/optLogicalMultiParam_And10_nullable6/iterations:1000 168199 ns 167608 ns 1000
LogicalOpBench/optLogicalMultiParam_And10_nullable4/iterations:1000 194473 ns 193743 ns 1000
LogicalOpBench/optLogicalMultiParam_And11_nullable10/iterations:1000 180895 ns 180236 ns 1000
LogicalOpBench/optLogicalMultiParam_And11_nullable8/iterations:1000 178976 ns 178328 ns 1000
LogicalOpBench/optLogicalMultiParam_And11_nullable6/iterations:1000 191045 ns 190356 ns 1000
LogicalOpBench/optLogicalMultiParam_And11_nullable4/iterations:1000 204776 ns 204063 ns 1000
optLogicalMultiParam_And6_not_null10 means 6 not null input columns, and template argument is 10
optLogicalMultiParam_And11_nullable10 means 11 nullable input columns, and template argument is 10
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gengliqi, yibin87 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@windtalker: Your PR was out of date, I have automatically updated it for you. At the same time I will also trigger all tests for you: /run-all-tests
If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
Co-authored-by: Liqi Geng <gengliqiii@gmail.com>
/run-all-tests |
What problem does this PR solve?
Issue Number: close #9146
Problem Summary:
According to #9146, the logical operator in TiFlash is kind of low efficient.
What is changed and how it works?
This pr refine logical operator using the following ways
NullableAssociativeOperationImpl
to handle nullable column specifically, so the originalAssociativeOperationImpl
don't need to consider null valueNullableAssociativeOperationImpl/AssociativeOperationImpl
for the case that input column is 2benchmark results
Run on (72 X 3300 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x36)
L1 Instruction 32 KiB (x36)
L2 Unified 1024 KiB (x36)
L3 Unified 25344 KiB (x2)
Load Average: 23.98, 20.10, 14.61
For binary logical operators
For multiple columns input logical operator speed by 2.x - 3.x times
For the same query in #9146
before this pr: 0.42s
after this pr: 0.26s
The original implementation of logical functions is renamed to
FunctionsLegacyLogical.h
for benchmark tests.Check List
Tests
Side effects
Documentation
Release note