Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added BITXOR to Doris parsing support (#31508) #33258

Merged
merged 2 commits into from
Oct 16, 2024

Conversation

danigiri
Copy link
Contributor

@danigiri danigiri commented Oct 15, 2024

As part of hacktoberfest, this is a POC for adding more parsing logic to Doris support. If this is successful, I will look at the other Doris SQL keywords in subsequent PRs. Starts to address #31508.

I understand that I cannot add the labels to the PR and maintainers have to do it.

Changes proposed in this pull request:

  • Made g4 changes, with a structure in preparation for adding more functions like unary bitwise
  • Includes test case additions to XML
  • Not sure if the change to getAllDatabaseTypes is really needed, can roll it back
  • BITXOR sits in identifierKeywordsUnambiguous and not customKeyword at the moment

Next:

  • Just realised I modified the BIT_XOR aggregation keyword instead of creating a new one. I can fix this quickly in a new commit in this PR (fixed in an additional commit in the PR)

Comments:

  • Doris requires AVX2 hardware which I do not have local access to (only AVX), it lets you compile without AVX2 support, but it's been 24h compilation time and not finished yet (or in an infinite loop).
  • I launched a DigitalOcean droplet ($) instead which let me test the SQL.
  • I am not sure if other backend changes are required once parsing support is added
  • Not sure which doc changes are needed or where

Before committing this PR, I'm sure that I have checked the following options:

  • [ ✅] My code follows the code of conduct of this project. As much as I understood it
  • [ ✅] I have self-reviewed the commit code.
  • [ ✅] I have (or in comment I request) added corresponding labels for the pull request.
  • [✅] I have passed maven check locally : ./mvnw clean install -B -T1C -Dmaven.javadoc.skip -Dmaven.jacoco.skip -e.
  • [] I have made corresponding changes to the documentation.
  • [✅] I have added corresponding unit tests for my changes.
  • [✅] mvn spotless:apply -Pcheck
  • [✅] mvn test -Dtest=InternalDorisParserIT in shardingsphere/parser/sql/dialect/doris
  • [✅] Executed test case in Doris v2.0.3 SELECT BITXOR(3,5)

Includes test case and change to `getAllDatabaseTypes`
Note currently BITXOR sits under `identifierKeywordsUnambiguous` and not `customKeyword`.
@iamhucong
Copy link
Contributor

LGTM.

@iamhucong iamhucong merged commit 0141cad into apache:master Oct 16, 2024
141 checks passed
@iamhucong iamhucong added this to the 5.5.2 milestone Oct 16, 2024
@strongduanmu
Copy link
Member

Hi @danigiri, can you update the release note for 5.5.2-SNAPSHOT?

@iamhucong
Copy link
Contributor

https://shardingsphere.apache.org/community/en/involved/conduct/code/#maintenance-conduct
@danigiri Could you please verify if MySQL supports BITXOR? You can refer to this document for further information.

@danigiri
Copy link
Contributor Author

Thanks for the review and comments, will do next:

  • update release notes
  • check MySQL BITXOR support

Also

@danigiri danigiri deleted the dev-doris branch October 16, 2024 09:59
danigiri added a commit to danigiri/shardingsphere that referenced this pull request Oct 16, 2024
@danigiri
Copy link
Contributor Author

looks like MySQL implements the same operation with ^ (https://dev.mysql.com/doc/refman/8.4/en/bit-functions.html)

SELECT 3 ^ 5;
+-------+
| 3 ^ 5    |
+-------+
|          6 |
+-------+
SELECT BITXOR(3,5);
ERROR 1305 (42000): FUNCTION mysql.BITXOR does not exist

which is already present in /dialect_mysql/src/main/antlr4/imports/mysql/BaseRule.g4

    | bitExpr CARET_ bitExpr

From what I understand, this means no further action is needed?

I will look at the INSTR in the meantime

@iamhucong
Copy link
Contributor

It looks like MySQL only supports BIT_XOR() and is slightly different from Doris. Maybe you can add a mark with the differential codes.

@danigiri
Copy link
Contributor Author

It looks like MySQL only supports BIT_XOR() and is slightly different from Doris. Maybe you can add a mark with the differential codes.

Not exactly, they are in fact relatively different, here's a breakdown of the situation:

MySQL BIT_XOR()

This is an aggregation function https://dev.mysql.com/doc/refman/8.4/en/aggregate-functions.html

SELECT BIT_XOR(value) FROM (VALUES ROW(3), ROW(5)) AS t(value);                                                                                                                                                                         
+----------------+
| BIT_XOR(value) |
+----------------+
|              6 |
+----------------+
1 row in set (0.01 sec)

^ in MySQL

This is a bitwise expression https://dev.mysql.com/doc/refman/8.4/en/bit-functions.html

SELECT 3^5;
+-----+
| 3^5 |
+-----+
|   6 |
+-----+
1 row in set (0.00 sec)

Doris BITXOR()

This is a bitwise function https://doris.apache.org/docs/sql-manual/sql-functions/bitwise-functions/bitxor/ which is equivalent to the MySQL ^ expression

SELECT BITXOR(3,5);
+---------+
|  3 ^ 5) |
+---------+
|       6 |
+---------+
1 row in set (0.05 sec)

Doris GROUP_BIT_XOR

This is an aggregation function https://doris.apache.org/docs/1.2/sql-manual/sql-functions/aggregate-functions/group_bit_xor/ equivalent to the MySQL bit_xor aggregation function

WITH values_cte AS (     
SELECT 3 AS value  UNION ALL  SELECT 5 AS value ) 
SELECT GROUP_BIT_XOR(value) FROM values_cte;
+----------------------+
| group_bit_xor(value) |
+----------------------+
|                    6 |
+----------------------+
1 row in set (0.30 sec)

In summary:

MySQL Doris
Aggr BIT_XOR GROUP_BIT_XOR
Expr/func ^ BITXOR

Given this, does it change the approach?

I am not sure what you mean by

add a mark with the differential codes.

if you have a commit / PR I could use as an example that would be super. Do you mean a comment in .g4 or something else?

@danigiri
Copy link
Contributor Author

cc @iamhucong (see my reply above, thx)

danigiri added a commit to danigiri/shardingsphere that referenced this pull request Oct 17, 2024
danigiri added a commit to danigiri/shardingsphere that referenced this pull request Oct 17, 2024
@iamhucong
Copy link
Contributor

Great, thanks for checking that out!

https://shardingsphere.apache.org/community/en/involved/conduct/code/#maintenance-conduct

Difference code markup syntax, when adding, replace {DatabaseType} with the uppercase name of the database type, for example: DORIS.

New syntax: //{DatabaseType} ADDED BEGIN and // {DatabaseType} ADDED END;
Modified syntax: //{DatabaseType} CHANGED BEGIN and // {DatabaseType} CHANGED END.

I mean this comment mark. Try searching for // DORIS ADDED BEGIN as a reference.

@danigiri
Copy link
Contributor Author

Cool, thx. I am going to give it a go in my upcoming PR, thanks for the clrifications.

danigiri added a commit to danigiri/shardingsphere that referenced this pull request Oct 17, 2024
danigiri added a commit to danigiri/shardingsphere that referenced this pull request Oct 17, 2024
danigiri added a commit to danigiri/shardingsphere that referenced this pull request Oct 17, 2024
strongduanmu pushed a commit that referenced this pull request Oct 20, 2024
* Added support for INSTR to Doris parsing

- related to #31508
- includes test
- `mvn spotless:apply -Pcheck` and `mvn test -Dtest=InternalDorisParserIT`
- will update release notes in next commit
- will check if MySQL supports INSTR in next commit
- verified query in Doris

* Updated release notes to reflect #33258

* Added comment markers for INSTR and BIT_XOR

Small additional changes:
- Also moved the java code to match the order in g4 more.
- Added missing import as well.

* Added release notes

Also added missing `// DORIS ADDED BEGIN|END` marker
@danigiri
Copy link
Contributor Author

Compiling the just released doris 3.0 and will look at the next keywords

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants