Update language list, add new locale selection dialog and support for custom codes. #103498

Draft: bruvzg wants to merge 1 commit into master from ex_locs

bruvzg
Member

@bruvzg bruvzg commented Mar 3, 2025

Fixes godotengine/godot-proposals#11755
Supersedes #102787

  • Updates the language list to include all ISO 639-3 codes (removed for now).
  • Adds support for per-project custom codes.
  • Replaces the old locale selection dialog. The huge and barely usable full lists are replaced with:
    • Current project locales.
    • Favorites (saved in the editor settings).
    • Search results.
    • List of custom codes (per project, editable).
[Image: Screenshot 2025-03-03 at 09 42 02]

@bruvzg bruvzg added this to the 4.5 milestone Mar 3, 2025
@bruvzg bruvzg force-pushed the ex_locs branch 2 times, most recently from 30aadf4 to 4f09adc on March 3, 2025 08:13
Member

@Calinou Calinou left a comment

Tested locally, it works as expected.

Unfortunately, the larger bundled language list increases binary size quite a bit. This is the size of a stripped Linux x86_64 release export template with production=yes lto=full:

72,081,176 -> 72,371,992 bytes (+290 KB)
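
As a quick sanity check on the figures above, the delta can be recomputed in a few lines of Python. This only redoes the arithmetic on the two byte counts from the comment; the variable names are made up for illustration.

```python
# Byte counts quoted in the comment above.
size_before = 72_081_176
size_after = 72_371_992

delta_bytes = size_after - size_before
delta_kb = delta_bytes / 1000   # decimal kilobytes
delta_kib = delta_bytes / 1024  # binary kibibytes

print(f"{delta_bytes} bytes = {delta_kb:.1f} KB = {delta_kib:.1f} KiB")
```

The difference is 290,816 bytes, so the quoted +290 KB is in decimal units; in binary units it is 284 KiB.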

@bruvzg
Member Author

bruvzg commented Mar 7, 2025

> Unfortunately, the larger bundled language list increases binary size quite a bit. This is the size of a stripped Linux x86_64 release export template with production=yes lto=full

I guess we can try storing it compressed, like we do with editor docs. I will try it later.
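
To make the idea concrete, here is a minimal Python sketch of storing a lookup table compressed and unpacking it on first use. It is not how Godot stores editor docs or locale data; the table layout, `LOCALE_TABLE`, `COMPRESSED_TABLE`, and `get_language_name` are all assumptions made for illustration, and `zlib` simply stands in for whatever compressor the engine would use.

```python
import zlib

# Hypothetical locale table: one "code<TAB>name" record per line.
LOCALE_TABLE = "\n".join([
    "en\tEnglish",
    "fr\tFrench",
    "de\tGerman",
    "tok\tToki Pona",
]).encode("utf-8")

# Done once at build time: only the compressed blob would be embedded.
COMPRESSED_TABLE = zlib.compress(LOCALE_TABLE, level=9)

_names = None  # filled on first lookup


def get_language_name(code):
    """Return the name for a code, decompressing the table lazily."""
    global _names
    if _names is None:
        raw = zlib.decompress(COMPRESSED_TABLE).decode("utf-8")
        _names = dict(line.split("\t", 1) for line in raw.split("\n"))
    return _names.get(code, "")
```

With a table this small the compressed blob is not necessarily smaller than the input; any saving would only show up on a large, repetitive table. The lazy decompression means projects that never query names pay only for the embedded blob.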

@akien-mga
Member

akien-mga commented Mar 7, 2025

I wonder whether we need all these languages in core, or if we could make it editor only. Especially names of languages and countries, etc.

This also brings up the question of localizing all these strings, and honestly I wouldn't want to add thousands of languages and country names to our docs to localize. It's probably best left to users to do on a case by case basis, as the vast majority of projects would support max 10 languages, so they can put those strings in their own UI without relying on the ones we hardcode.

For core, we could possibly just allow the use of any arbitrary language code and fail if it doesn't exist? The editor tooling would then help users select language codes and countries that are actually recognized.

@bruvzg
Member Author

bruvzg commented Mar 7, 2025

> I wonder whether we need all these languages in core, or if we could make it editor only. Especially names of languages and countries, etc.

Functions for getting names from codes are part of the public TranslationServer API, so removing them would be a breaking change.

> This also brings up the question of localizing all these strings, and honestly I wouldn't want to add thousands of languages and country names to our docs to localize. It's probably best left to users to do on a case by case basis, as the vast majority of projects would support max 10 languages, so they can put those strings in their own UI without relying on the ones we hardcode.

If we want to have it localized, it would be more reasonable to bring in the full ICU localization database (which is 35 MB).

@akien-mga
Member

akien-mga commented Mar 7, 2025

> > I wonder whether we need all these languages in core, or if we could make it editor only. Especially names of languages and countries, etc.
>
> Functions for getting names from codes are part of the public TranslationServer API, so removing them would be a breaking change.

We could maybe leave that one unchanged (not add more languages to it, unless explicitly requested), and have another version that's complete and editor-only.

We could maybe also provide a way for users to add their own mappings, so if they want to support a language or country code that's not part of our database, they can add it. That way we don't need to play catch-up with standards (and possibly geopolitics), and users can decide for themselves what they want to support.

Overall, my main concern is the significant increase in binary size for the export template (which affects everyone) for a use case that's so far highly theoretical and only affects one user who wants to use Toki Pona, which is not in the list.

@bruvzg
Member Author

bruvzg commented Mar 7, 2025

With compression, the difference is about 125 KiB.

> We could maybe also provide a way for users to add their own mappings, so if they want to support a language or country code that's not part of our database, they can add it. That way we don't need to play catch-up with standards (and possibly geopolitics), and users can decide for themselves what they want to support.

This is already part of this PR: custom codes can be added in the project settings (and via the locale selection dialog).
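
A rough Python sketch of how a built-in list and per-project custom codes could be combined at lookup time is shown below. It does not reflect the PR's actual implementation or the TranslationServer API; `BUILTIN_LANGUAGES`, `LocaleRegistry`, and its methods are hypothetical names, and letting custom codes take precedence over built-in ones is an assumption.

```python
# Hypothetical built-in table (a tiny subset, for illustration only).
BUILTIN_LANGUAGES = {"en": "English", "fr": "French", "de": "German"}


class LocaleRegistry:
    """Resolves language codes from built-in data plus per-project additions."""

    def __init__(self, custom_languages=None):
        # Per-project codes, e.g. loaded from a project settings file.
        self.custom_languages = dict(custom_languages or {})

    def add_custom_language(self, code, name):
        self.custom_languages[code] = name

    def get_language_name(self, code):
        # Assumption: project-defined codes override the built-in list.
        if code in self.custom_languages:
            return self.custom_languages[code]
        return BUILTIN_LANGUAGES.get(code, "")

    def is_known(self, code):
        return code in self.custom_languages or code in BUILTIN_LANGUAGES


registry = LocaleRegistry()
registry.add_custom_language("tok", "Toki Pona")
print(registry.get_language_name("tok"))
```

Keeping the custom codes in a separate layer means the built-in list can stay small while each project still resolves the codes it cares about.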

> We could maybe leave that one unchanged (not add more languages to it, unless explicitly requested), and have another version that's complete and editor-only.

Maintaining two lists would be a mess, so I'm not sure about it. But it's probably reasonable to remove all 639-3 codes and keep only 639-2 by default. Since any code can be added for a specific project, it won't be an issue.

@bruvzg bruvzg force-pushed the ex_locs branch 2 times, most recently from 937cc93 to f95e5fc on March 7, 2025 13:07
@bruvzg
Member Author

bruvzg commented Mar 7, 2025

Let's split it for now. I have reverted the language list to the previous set (plus "tok", with compression, and with both alpha-2 and alpha-3 codes stored), while keeping the new UI and the ability to add custom codes.
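
To illustrate what storing both alpha-2 and alpha-3 codes allows, here is a small Python sketch that indexes each language under every code it has. The record layout, `LANGUAGES`, `BY_CODE`, and `canonical_code` are hypothetical, and preferring the two-letter code as the canonical form is an assumption rather than something stated in this PR.

```python
# (alpha-2, alpha-3, name); alpha-2 is None when no two-letter code exists.
LANGUAGES = [
    ("en", "eng", "English"),
    ("fr", "fra", "French"),
    ("de", "deu", "German"),
    (None, "tok", "Toki Pona"),
]

# Index every record under each code it has.
BY_CODE = {}
for alpha2, alpha3, name in LANGUAGES:
    for code in (alpha2, alpha3):
        if code is not None:
            BY_CODE[code] = (alpha2, alpha3, name)


def canonical_code(code):
    """Prefer the two-letter code when one exists, else the three-letter one."""
    record = BY_CODE.get(code.lower())
    if record is None:
        return None
    alpha2, alpha3, _ = record
    return alpha2 or alpha3
```

With this index, "eng" and "en" resolve to the same record, while "tok" resolves through its three-letter code alone.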

Successfully merging this pull request may close these issues.

Add ISO 639-3 Code "tok" (Toki Pona) to engine for Translation support.
3 participants