Feature request: allow CCITT image encoding #691

Closed
eroux opened this issue Feb 15, 2023 · 4 comments · Fixed by #695

Comments

eroux commented Feb 15, 2023

I would like to be able to use CCITT (TIFF) encoding for bitonal images. Currently these seem to be converted internally to grayscale PNGs, which produces larger output and requires more processing time. For one or two files that is fine, but I'm in the process of creating PDFs for a few million images and the difference becomes really noticeable.

Here's an example:

from fpdf import FPDF

pdf = FPDF()
pdf.set_margin(0)
pdf.add_page(format=(2142,507))
pdf.image("020610001.tiff", x=0, y=0, w=2142, h=507)
pdf.output("test-tiff.pdf")

producing the following pdf (10,985 bytes):

test-tiff.pdf

while I would expect the following one (produced with itext, 7,126 bytes, 35% smaller):

test-tiff-expected.pdf
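
As a quick check of the figures quoted above, the relative saving can be recomputed from the two file sizes. The helper name `relative_saving` is purely illustrative and not part of fpdf2 or iText.

```python
def relative_saving(before: int, after: int) -> float:
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


# Sizes in bytes, as reported in this issue.
fpdf2_size = 10_985
itext_size = 7_126

print(f"{relative_saving(fpdf2_size, itext_size):.1%}")  # prints 35.1%
```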

The conversion to PNG can be seen by running pdfimages -all test-tiff.pdf test-tiff, which produces the following test-tiff-000.png:

test-tiff-000

eroux commented Feb 15, 2023

Lucas-C commented Feb 16, 2023

Hi @eroux!

Have you tried configuring image compression?
cf. https://pyfpdf.github.io/fpdf2/Images.html#image-compression

If that does not fit your need, the feature you suggest would make for a great addition to fpdf2 😊
Would you like to contribute a PR implementing this?

Based on the PDF 1.7 spec, this would mean implementing support for the LZWDecode filter here: https://github.com/PyFPDF/fpdf2/blob/master/fpdf/image_parsing.py#L21

LZW (Lempel-Ziv-Welch) is a variable-length, adaptive compression method that has been adopted as one
of the standard compression methods in the Tag Image File Format (TIFF) standard. For details on LZW
encoding see 7.4.4.2, "Details of LZW Encoding."
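
To illustrate the "adaptive" part of the description quoted above, here is a minimal sketch of textbook LZW over bytes. It is only an illustration under simplifying assumptions: it emits plain integer codes, whereas the PDF LZWDecode filter packs variable-width codes (9 to 12 bits) and reserves code 256 for clear-table and 257 for end-of-data. The function names `lzw_encode` and `lzw_decode` are hypothetical and are not fpdf2 APIs.

```python
def lzw_encode(data: bytes) -> list[int]:
    """Textbook LZW: emit one integer code per longest known byte sequence."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate  # keep extending the match
        else:
            codes.append(table[current])
            table[candidate] = next_code  # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decode(codes: list[int]) -> bytes:
    """Inverse of lzw_encode: rebuild the same table while reading codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


sample = b"\x00" * 64 + b"\xff" * 64  # long uniform runs, as in a bitonal scan
encoded = lzw_encode(sample)
print(len(sample), "bytes ->", len(encoded), "codes")
```

Long uniform runs compress well here because each repeated pass through the data matches a longer table entry than the previous one.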

eroux commented Feb 16, 2023

In the PDF spec, the encoding I'm thinking of is CCITTFaxDecode, which is clearly the best for bitonal pictures. I might create a PR; let's see.
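
For context on why CCITT coding suits bitonal images: its one-dimensional mode (Group 3) describes each scan line as alternating white and black run lengths, starting with a white run that may have length zero. The sketch below shows only that run-extraction step, assuming 0 means white and 1 means black; it does not implement the code tables or the two-dimensional Group 4 mode, and the names `row_to_runs` and `runs_to_row` are hypothetical.

```python
def row_to_runs(row: list[int]) -> list[int]:
    """Alternating white/black run lengths for one scan line, white first."""
    runs = []
    expected = 0  # assumption: 0 = white, 1 = black
    count = 0
    for pixel in row:
        if pixel == expected:
            count += 1
        else:
            runs.append(count)  # close the current run (may be 0 at line start)
            expected = 1 - expected
            count = 1
    runs.append(count)
    return runs


def runs_to_row(runs: list[int]) -> list[int]:
    """Inverse of row_to_runs: expand run lengths back into pixels."""
    row = []
    colour = 0
    for length in runs:
        row.extend([colour] * length)
        colour = 1 - colour
    return row


line = [0] * 30 + [1] * 4 + [0] * 30
print(row_to_runs(line))  # prints [30, 4, 30]
```

A mostly white line of 64 pixels collapses to three numbers, which is the property the entropy-coding stage then exploits.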

@Lucas-C Lucas-C added the image label Feb 16, 2023
@Lucas-C Lucas-C changed the title allow CCITT image encoding Feature request: allow CCITT image encoding Feb 16, 2023
eroux commented Feb 16, 2023

PR on #695

Lucas-C added a commit that referenced this issue Feb 17, 2023