Feature request: allow CCITT image encoding #691

eroux · 2023-02-15T13:51:47Z

I would like to be able to use the CCITT (TIFF) encoding for bitonal images. Currently these seem to be converted internally to grayscale PNGs, which produces bigger output and requires more processing time. For one or two files that is fine, but I'm in the process of creating PDFs for a few million images and the difference becomes really noticeable.

Here's an example:

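from fpdf import FPDF

pdf = FPDF()
pdf.set_margin(0)  # no margins: the image fills the whole page
# Page size matches the image dimensions (expressed in the document unit)
pdf.add_page(format=(2142, 507))
pdf.image("020610001.tiff", x=0, y=0, w=2142, h=507)
pdf.output("test-tiff.pdf")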

producing the following pdf (10,985 bytes):

test-tiff.pdf

while I would expect the following one (produced with itext, 7,126 bytes, 35% smaller):

test-tiff-expected.pdf

The conversion to PNG can be seen by running pdfimages -all test-tiff.pdf test-tiff, which extracts the embedded image as test-tiff-000.png.


eroux · 2023-02-15T14:03:14Z

For example, img2pdf seems to do this: https://gitlab.mister-muffin.de/josch/img2pdf/src/branch/main/src/img2pdf.py#L804

Lucas-C · 2023-02-16T00:05:30Z

Hi @eroux!

Have you tried configuring image compression?
cf. https://pyfpdf.github.io/fpdf2/Images.html#image-compression

If that does not fit your need, the feature you suggest would make for a great addition to fpdf2 😊
Would you like to contribute a PR implementing this?

Based on the PDF 1.7 spec, this would mean implementing support for the LZWDecode filter here: https://github.com/PyFPDF/fpdf2/blob/master/fpdf/image_parsing.py#L21

LZW (Lempel-Ziv-Welch) is a variable-length, adaptive compression method that has been adopted as one
of the standard compression methods in the Tag Image File Format (TIFF) standard. For details on LZW
encoding see 7.4.4.2, "Details of LZW Encoding."
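
To make the quoted description concrete, here is a minimal sketch of the code-generation step of LZW, assuming the PDF conventions of code 256 for clear-table and 257 for end-of-data. The function name `lzw_codes` is hypothetical and this is not fpdf2 code. It is deliberately simplified: it returns the codes as a list instead of packing them into 9- to 12-bit fields, and it does not reset the table when it fills up, so it only illustrates short inputs.

```python
CLEAR_TABLE = 256  # code that tells the decoder to reset its table
EOD = 257          # end-of-data code


def lzw_codes(data: bytes) -> list[int]:
    """Return the LZW code sequence for `data` (hypothetical helper).

    Simplifications: no variable-length bit packing, and no table
    reset once the table would exceed 12-bit codes.
    """
    # The table starts with one entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 258  # first free code after CLEAR_TABLE and EOD
    codes = [CLEAR_TABLE]
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the longest sequence already in the table.
            current = candidate
        else:
            # Emit the code of the longest match, then learn the new sequence.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    codes.append(EOD)
    return codes


print(lzw_codes(b"ABABAB"))
```

Repeated sequences such as "AB" are replaced by a single table code once they have been seen, which is where the compression comes from.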

eroux · 2023-02-16T07:10:31Z

In the PDF spec the encoding I'm thinking of is CCITTFaxDecode, which is clearly the best for bitonal pictures. I might create a PR, let's see
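
For illustration, here is a rough sketch of what an image XObject dictionary using CCITTFaxDecode could look like. The helper name `ccitt_image_dict` and its parameters are hypothetical, and this is not how fpdf2 builds its objects. It assumes the image data is already Group 4 (K = -1) or Group 3 one-dimensional (K = 0) encoded, and it leaves BlackIs1 at its default, which may need changing depending on the polarity of the source data.

```python
def ccitt_image_dict(width: int, height: int, data_length: int,
                     group4: bool = True) -> str:
    """Build a PDF image dictionary string for CCITT-encoded bitonal data.

    Hypothetical helper for illustration only.
    """
    # In the PDF spec, a negative K selects pure two-dimensional
    # (Group 4) encoding and K = 0 selects one-dimensional Group 3.
    k = -1 if group4 else 0
    return (
        "<< /Type /XObject /Subtype /Image "
        f"/Width {width} /Height {height} "
        "/ColorSpace /DeviceGray /BitsPerComponent 1 "
        "/Filter /CCITTFaxDecode "
        f"/DecodeParms << /K {k} /Columns {width} /Rows {height} >> "
        f"/Length {data_length} >>"
    )


# Dimensions taken from the example image above; the length is a placeholder.
print(ccitt_image_dict(2142, 507, 7000))
```

With this filter the compressed fax data from the TIFF can, in principle, be embedded as the stream body without re-encoding the pixels.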

eroux · 2023-02-16T15:45:45Z

PR on #695


eroux added the enhancement label Feb 15, 2023

Lucas-C mentioned this issue Feb 16, 2023

Feature request: avoid altering images passed to FPDF.image() if there is no need to #693

Closed

Lucas-C added the image label Feb 16, 2023

Lucas-C changed the title ~~allow CCITT image encoding~~ Feature request: allow CCITT image encoding Feb 16, 2023

Lucas-C added the up-for-grabs label Feb 16, 2023

eroux mentioned this issue Feb 16, 2023

Bitonal images are now encoded using CCITTFaxDecode - fix #691 #695

Merged


Lucas-C closed this as completed in #695 Feb 17, 2023

Lucas-C added a commit that referenced this issue Feb 17, 2023

fix for #691 (#695)

b113acb

Co-authored-by: Lucas Cimon <925560+Lucas-C@users.noreply.github.com>

MartinThoma mentioned this issue Sep 29, 2024

TST: Add LzwCodec for encoding py-pdf/pypdf#2883

Merged
