Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flattens .yml files for i18n #10503

Merged
merged 50 commits into from
May 9, 2024
Merged

Flattens .yml files for i18n #10503

merged 50 commits into from
May 9, 2024

Conversation

zachmargolis
Copy link
Contributor

@zachmargolis zachmargolis commented Apr 24, 2024

What's in the Branch

Based on the challenges of merging multi-line, nested YML files for translation, this PR proposes flattening them to files with flat keys, no indentation needed.

  • scripts/yml_to_flat_yml takes a batch of yml files and converts them to txt
  • I18nFlatYmlBackend implements loading these files for the I18n gem

Next Steps

If we wanted to land this, we'd need to:

  • update our normalize-yaml JS script to have an option to parse/fix these files as well
  • update our use of i18n-tasks (need to make a compatible parser for it, it doesn't reuse i18n ones 🙄 )

Example

Example of how to generate one of these .txt files:

find config/locales -type f -name '*es.yml' | xargs ./scripts/yml_to_txt > config/locales/es.txt

Format specification

Uses JSON to encode one-line forms of strings. If we need newlines inside strings, we'll use \n

string.key.with.parts: "JSON string version of value"

To support arrays, I went with "if all keys of a hash are numeric, it should be an array" (we have one case of a hash with mixed numbers and strings)

Alternative to handle arrays

Another approach could be a special symbol for arrays like:

a.b.c.#0: "first item"
a.b.c.#1: "second item"

- scripts/yml_to_txt takes a batch of yml files and converts them to txt
- I18nTxtBackend implements loading these files for the I18n gem
@zachmargolis zachmargolis requested a review from a team April 24, 2024 22:15
# @param filename [String] filename, assumed to have the locale slug in the filename such as "en.txt"
# @return [Array(Hash, Boolean)] tuple of a hash and keys_symbolized
def load_txt(filename)
locale = File.basename(filename, '.txt')
Copy link
Contributor Author

@zachmargolis zachmargolis Apr 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The locale is guessed from the filename because it lets us drop the locale from the key in the file. Without this, the files would be like:

en.foo.bar.baz: "test"

and that would mean keys would be harder to copy-paste across files.

@aduth
Copy link
Contributor

aduth commented Apr 25, 2024

I expect the identity-rails-i18n-webpack-plugin JavaScript package will need some updates here as well, since it's currently implemented to read the YAML files.

@zachmargolis zachmargolis changed the title Proof of Concept: Adds .txt for i18n files Proof of Concept: Flattens .yml for i18n Apr 25, 2024
@matthinz
Copy link
Contributor

It seems like we don't have that many strings that include newlines. I wonder if the "values" here could just be raw text to avoid having to deal with JSON encoding and make them easier to edit by hand. This would encourage splitting large blocks up into multiple strings and handling formatting concerns in the views instead.

Alternately, raw text with \n supported for a newline would probably be fine, since it's unlikely the actual text would ever contain that? You could even allow a literal \n via \\n or something. What I'm saying is we probably don't have to be perfect here and that maybe the ergonomics of a simpler editing format outweigh the need to allow strings that include the literal text "\n".

@zachmargolis
Copy link
Contributor Author

It seems like we don't have that many strings that include newlines. I wonder if the "values" here could just be raw text to avoid having to deal with JSON encoding and make them easier to edit by hand. This would encourage splitting large blocks up into multiple strings and handling formatting concerns in the views instead.

Alternately, raw text with \n supported for a newline would probably be fine, since it's unlikely the actual text would ever contain that? You could even allow a literal \n via \\n or something. What I'm saying is we probably don't have to be perfect here and that maybe the ergonomics of a simpler editing format outweigh the need to allow strings that include the literal text "\n".

After some messing around, it turns out that this format is valid YAML, so I switched it around and tools like prettier can work with it. I pulled the telephony files back out into their own separate YMLs since those do have explicit newlines and are a bit easier

changelog: Internal, Source code, Reformat i18n files to simplify merges
@zachmargolis zachmargolis changed the title Proof of Concept: Flattens .yml for i18n Flattens .yml files for i18n May 6, 2024
@zachmargolis zachmargolis marked this pull request as ready for review May 6, 2024 20:56
@aduth
Copy link
Contributor

aduth commented May 7, 2024

i18n_spec.rb includes a flatten_hash helper which I assume will be unnecessary after these changes. Could save for a follow-on pull request to reduce the scope here.

@@ -1,5 +1,6 @@
{
"singleQuote": true,
"trailingComma": "all",
"printWidth": 100
"printWidth": 100,
"proseWrap": "never"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will affect all Markdown formatted by Prettier. I don't have a strong feeling one way or the other, just noting it has a more widespread impact than just i18n strings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep these .yml as un-wrapped as possible, so I could back this change out but add it to normalize-yaml script as an override?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine if we acknowledge this as a project-wide default, which seems reasonable enough?

But yeah, if needed, I think this could also go back to being able to configure normalize-yaml with a specific Prettier configuration if we wanted to behave differently specific for normalize-yaml / specific for specific sets of files normalized by normalize-yaml.

Co-authored-by: Andrew Duthie <1779930+aduth@users.noreply.github.com>

# Custom i18n backend that parse our "flat_yml" files into the nested
# hash structure that i18n works with
class I18nFlatYmlBackend < I18n::Backend::Simple
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to keep the locale at the top of the file and avoid the separate backend?

en:
  key: value

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be in favor of keeping the backend if it saves a huge amount of unnecessary indentation in those files.

Also based on comment in #10503 (comment), I think we could take this even further to customize the backend to consume the keys more directly (avoid backend structured as a nested hash)

@zachmargolis zachmargolis merged commit a79bba6 into main May 9, 2024
2 checks passed
@zachmargolis zachmargolis deleted the margolis-i18n-txt-backend branch May 9, 2024 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants