Reports in scripts like Chinese and Japanese should support wrapping by characters instead of words #433

Closed
carlhiggs opened this issue May 18, 2024 · 2 comments

Comments

@carlhiggs
Collaborator

The PDF templating software we use, fpdf2, has functionality for wrapping text both by word and by character. However, to date only wrapping by word has been implemented. This means that even though our Chinese language translations have been validated, the formatting looks a bit odd: lines often wrap where an English-language number or phrase happens to be, because the spaces around it are where a word-based wrapper looks for break points. For example, in the following section on the 1000 Cities Challenge, line breaks have been incorrectly induced after '2022' or '25' because the text is wrapped by word instead of by character:

image
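
To make the difference concrete, here is a minimal sketch using only the Python standard library. It is not how fpdf2 does it internally: a PDF library measures rendered string width, whereas this sketch simply counts characters. The sample sentence and the helper names `wrap_by_word` and `wrap_by_char` are made up for illustration.

```python
import textwrap

# Hypothetical sample: Chinese text with embedded numbers set off by spaces.
text = "健康城市挑战于 2022 年启动，已有 25 个城市参与其中并发布了报告。"


def wrap_by_word(s, width):
    """Break only at whitespace, leaving unspaced runs intact."""
    return textwrap.wrap(
        s, width=width, break_long_words=False, break_on_hyphens=False
    )


def wrap_by_char(s, width):
    """Break after every `width` characters, regardless of whitespace."""
    return [s[i:i + width] for i in range(0, len(s), width)]


word_lines = wrap_by_word(text, 10)
char_lines = wrap_by_char(text, 10)

# Word wrapping gives ragged lines because spaces are the only break points.
print([len(line) for line in word_lines])
# Character wrapping fills every line except possibly the last.
print([len(line) for line in char_lines])
```

With word wrapping, the only places a line can end are the spaces around '2022' and '25', so some lines come out very short and others overflow the target width. Character wrapping fills each line evenly, which is the behaviour we want for these scripts.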

This issue was reported by our colleagues Kimihiro Hino and @shiqin-liu for the Japanese and Chinese translations, respectively, that they were validating (thank you for bringing this to my attention!).

I lodged an issue with the fpdf2 project to request that this feature be added: py-pdf/fpdf2#1159

The maintainers provided guidance on contributing this change myself, and I have just lodged a pull request: py-pdf/fpdf2#1176

I believe I have correctly implemented the functionality in the templates; preliminary checks (using English text, but the principle is the same) suggest it should work.

Fingers crossed the maintainers review the pull request positively and it can be incorporated; then we can update our own Docker image with a newer version of fpdf2 once it is available on conda-forge. Once we've confirmed it does what we want in our own project, we can close this issue!

@carlhiggs
Collaborator Author

I made upstream changes to fpdf2 to support this, as per the pull request linked above; these have now been merged, although it will take some time for them to be included in a formal release and then published on conda-forge, which is where our Docker image currently gets fpdf2 from.

One option could be to install from the main fpdf2 GitHub branch until the release is available on conda-forge. Not ideal, but it could fast-track resolution of this issue for us.

carlhiggs added a commit that referenced this issue Jun 3, 2024
@carlhiggs
Collaborator Author

As per above, I updated the Docker image to use the main fpdf2 branch, which contains the fixes I added to optionally wrap text on characters instead of words, for languages that use ideogram characters (e.g. Chinese and Japanese) rather than space-separated words (as in English).

Previously, incorrect wrapping was induced wherever an English term with spaces occurred, as in the following Chinese Simplified rendition of the example report:
image
Text now wraps on characters instead, giving a more justified paragraph appearance without unintended gaps where spaces happen to occur:
image

Once this is merged into main, this issue may be closed.
