Reports in scripts like Chinese and Japanese should support wrapping by characters instead of words #433

carlhiggs · 2024-05-18T12:34:15Z

The PDF templating software we use, FPDF2, has functionality for wrapping text both by word and by character. However, to date only wrapping by word has been implemented. As a result, even though our Chinese translations have been validated, word-based wrapping makes the formatting look a bit odd. Lines often wrap wherever an English-language number or phrase happens to appear. For example, in the following section on the 1000 Cities Challenge, line breaks were incorrectly introduced after '2022' and '25' because text was wrapped by word instead of by character:

This issue was reported by our colleagues Kimihiro Hino and @shiqin-liu for the Japanese and Chinese translations, respectively, that they were validating (thank you for bringing this to my attention!).

I lodged an issue on the FPDF2 software to request this feature be added: py-pdf/fpdf2#1159

The maintainers provided guidance on contributing this change myself, and I have just lodged a pull request: py-pdf/fpdf2#1176

I believe I have correctly implemented the functionality in the templates. Preliminary checks (using English text, but the principle is the same) suggest it should work.

Fingers crossed the maintainers review the pull request positively and it can be incorporated. We can then update our own Docker image with a newer version of fpdf2 once it is available on conda-forge. Once we've confirmed it does what we want in our own project, we can close this issue!

carlhiggs · 2024-05-27T22:06:34Z

I made upstream changes to fpdf2 to support this, as per the pull request linked above. These have now been merged, although it will take some time for them to be included in a formal release and then made available on conda-forge, where we currently source fpdf2 for our Docker image.

One option would be to install fpdf2 from the main branch of its GitHub repository until the release is available on conda-forge. Not ideal, but it could fast-track resolution of this issue for us.

carlhiggs · 2024-06-03T03:31:02Z

As per above, I updated the Docker image to use the main FPDF2 branch, which contains the fixes I added to optionally wrap text on characters instead of words. This suits languages written with ideographic characters (e.g. Chinese and Japanese) rather than space-separated words (as in English).
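
To illustrate why the two modes behave so differently on mixed text, here is a minimal pure-Python sketch. It is not fpdf2's implementation: the function names `wrap_by_word` and `wrap_by_char` are hypothetical, the width is counted in characters rather than rendered glyph widths, and the sample string is made up for illustration only.

```python
from __future__ import annotations


def wrap_by_word(text: str, width: int) -> list[str]:
    """Greedy wrap that may only break at spaces (word mode)."""
    lines: list[str] = []
    current = ""
    for word in text.split(" "):
        candidate = word if not current else f"{current} {word}"
        if len(candidate) <= width or not current:
            # Either it fits, or it is a single over-long token
            # that this simplified sketch lets overflow the line.
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


def wrap_by_char(text: str, width: int) -> list[str]:
    """Wrap that may break after any character (character mode)."""
    return [text[i:i + width] for i in range(0, len(text), width)]


if __name__ == "__main__":
    # Made-up sample: the only spaces are around the embedded numbers,
    # so word mode can only break next to '2022' or '25'.
    sample = "到 2022 年已有 25 个城市参与了这项挑战"
    for name, wrap in (("word", wrap_by_word), ("char", wrap_by_char)):
        print(f"--- {name} ---")
        for line in wrap(sample, 8):
            print(line)
```

In word mode the only available break points are the spaces around the embedded numbers, so line lengths are uneven; in character mode every line except the last is filled to the full width.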

The result is that, instead of the incorrect wrapping that occurs when a line break is induced at an English term surrounded by spaces, as in the following Chinese Simplified rendition of the example report,

text will instead wrap on characters, giving the paragraph a more justified appearance without unintended gaps where spaces happen to occur:

Once this is merged into main, this issue may be closed.

carlhiggs added a commit that referenced this issue Jun 3, 2024

updated code to use newly implemented fpdf2 features to support character wrapping in templates, addressing #433

5078592

carlhiggs mentioned this issue Jun 3, 2024

Enhancements #440

Merged

carlhiggs closed this as completed Jun 3, 2024
