Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ucaps.h to contain proper case matchings #90726

Merged
merged 1 commit into from
Jan 29, 2025

Conversation

Chubercik
Copy link
Contributor

@Chubercik Chubercik commented Apr 16, 2024

@Chubercik Chubercik requested a review from a team as a code owner April 16, 2024 00:08
@Calinou Calinou added this to the 4.3 milestone Apr 16, 2024
@akien-mga akien-mga requested a review from bruvzg April 16, 2024 07:54
Copy link
Member

@bruvzg bruvzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally this should be generated from UCD (https://www.unicode.org/reports/tr21/tr21-5.html#UnicodeData, for these functions only singe character mappings are useful, TextServer already do the rest via ICU), not from python (which might use different UCD versions).

@Chubercik
Copy link
Contributor Author

Why version 3.2.0 specifically?

@bruvzg
Copy link
Member

bruvzg commented Apr 16, 2024

Why version 3.2.0 specifically?

It should be the latest data, not 3.2,0 specifically, but seems like it was incorporated into other parts of the standard. But we need only legacy handling here (without string length changes).

@Chubercik
Copy link
Contributor Author

Chubercik commented Dec 5, 2024

Ideally this should be generated from UCD (https://www.unicode.org/reports/tr21/tr21-5.html#UnicodeData, for these functions only singe character mappings are useful, TextServer already do the rest via ICU), not from python (which might use different UCD versions).

Python version UCD version
3.8 12.1.0 (2019)
3.9 (this PR) 13.0.0 (2020)
3.10 13.0.0 (2020)
3.11 14.0.0 (2021)
3.12 15.0.0 (2022)
3.13 15.1.0 (2023)
dev (3.14) 16.0.0 (2024)

Different UCD versions when using different Python versions is a problem only if we see it as such - if we want to have the latest case matchings, we can always run the script using the latest Python version; however, if this is a dealbreaker, I could write a script that parses UnicodeData.txt (UCD 16.0.0) (also see: UnicodeData.txt's Property Table, UnicodeData File Format).


Edit:

[...] I could write a script that parses UnicodeData.txt (UCD 16.0.0) (also see: UnicodeData.txt's Property Table, UnicodeData File Format).

Done, PR is ready for review :)

@Chubercik Chubercik marked this pull request as draft December 5, 2024 13:52
@Chubercik Chubercik changed the title Update ucaps.h to contain proper case matchings (Python parity) Update ucaps.h to contain proper case matchings Dec 5, 2024
@Chubercik Chubercik marked this pull request as ready for review December 5, 2024 22:38
@Chubercik Chubercik requested a review from a team as a code owner December 5, 2024 22:38
@Chubercik Chubercik requested a review from bruvzg December 13, 2024 14:58
@Ivorforce
Copy link
Member

There are also caps tables in char_range.inc. I don't know where those are from, but it would make sense to have them be up-to-date, and if they were created automatically, do both in the same script.

@Chubercik
Copy link
Contributor Author

Chubercik commented Jan 21, 2025

Will look into that later today, thanks for the heads-up.


Edit:

Copy link
Member

@bruvzg bruvzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I would apply the same changes proposed in #101878 (review) (header generation) to this script as well.

@Chubercik
Copy link
Contributor Author

Chubercik commented Jan 23, 2025

Sure thing, I'll look into it later today.


Edit:

Done, though I couldn't place the // This file was generated using the `ucaps_fetch.py` script. notice above the include guard; it might have something to do with Godot's code style enforced by pre-commit, idk.

@Chubercik Chubercik force-pushed the ucaps-exorcism branch 2 times, most recently from 1048576 to 1048576 Compare January 23, 2025 11:43
@Repiteo Repiteo merged commit 4244878 into godotengine:master Jan 29, 2025
19 checks passed
@Repiteo
Copy link
Contributor

Repiteo commented Jan 29, 2025

Thanks!

@Chubercik Chubercik deleted the ucaps-exorcism branch January 29, 2025 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants