update_page_form_field_values wrong encoding #2035

Closed
coolbombom opened this issue Jul 28, 2023 · 1 comment · Fixed by #2047
Labels
workflow-forms: From a user's perspective, forms is the affected feature/workflow

Comments


coolbombom commented Jul 28, 2023

Hi

When using update_page_form_field_values, special characters (æ, ø, å in Danish) are not encoded correctly: ø becomes ø

I took the function update_page_form_field_values from PyPDF2 version 3.0.0 (https://pypdf2.readthedocs.io/en/3.0.0/_modules/PyPDF2/_writer.html#PdfWriter.update_page_form_field_values) and used it in place of the one in pypdf 3.13.0. With the old function, æ, ø, å worked fine, so the problem lies in the new update_page_form_field_values function.

The old update_page_form_field_values function has other flaws, though, such as not updating form fields that are already filled. See my issue here: #2034
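
For readers unfamiliar with this kind of garbling, here is a minimal sketch of one common way it arises: text is written as UTF-8 bytes and then read back with a single-byte encoding. This is only an illustration using the Python standard library. The helper name misdecode and the choice of Latin-1 as the reading codec are assumptions, and nothing in this issue confirms that this is what pypdf does internally.

```python
# Illustration only: one common way non-ASCII text gets garbled.
# Assumption: bytes are written as UTF-8 but read back as Latin-1.
# The issue does not confirm that pypdf does this.

def misdecode(text: str, written_as: str = "utf-8", read_as: str = "latin-1") -> str:
    """Encode text with one codec and decode the resulting bytes with another."""
    return text.encode(written_as).decode(read_as)

original = "test æ ø å"
garbled = misdecode(original)

# Each Danish letter takes two bytes in UTF-8, so it shows up as two characters.
print(len(original), len(garbled))

# Reversing the two steps recovers the original text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)
```

If a filled field shows two odd characters for every æ, ø or å, a mismatch of this kind is a plausible first thing to check.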

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-6.2.12-arch1-1-x86_64-with-glibc2.37

$ python -c "import pypdf;print(pypdf.__version__)"
3.13.0

Code

from pypdf import PdfWriter  # the report targets pypdf 3.13.0, not PyPDF2

dst_file = "test_output.pdf"
writer = PdfWriter()
writer.append("test.pdf")
# Field value containing the Danish characters that come out garbled
form_fields = {"Text Box 1": "test æ ø å"}
for page in writer.pages:
    writer.update_page_form_field_values(page, form_fields)
with open(dst_file, "wb") as output_stream:
    writer.write(output_stream)
writer.close()

test.pdf

Traceback

no traceback

@pubpub-zz
Collaborator

@coolbombom
The old code did not generate text appearances; it was the PDF viewer that generated them.
However, you are right that an improvement is needed to handle your case.

pubpub-zz added a commit to pubpub-zz/pypdf that referenced this issue Jul 30, 2023
coolbombom added a commit to coolbombom/pypdf that referenced this issue Jul 31, 2023
Added an encoding option to update_page_form_field_values so that the function uses that encoding to populate form fields.

Fixes the issue I had:
py-pdf#2035
MartinThoma added the workflow-forms label Aug 13, 2023