DEV: Unify code between PdfReader and PdfWriter #2497

Merged
merged 54 commits into py-pdf:main
Mar 25, 2024

pubpub-zz
Collaborator

@pubpub-zz pubpub-zz commented Mar 2, 2024

requires #2495
closes #2531

Move the functions shared by PdfReader and PdfWriter into common code.
Caution: pdf_header had to change in PdfWriter.
@pubpub-zz
Collaborator Author

@MartinThoma / @stefan6419846
I've started preparing the unification, but I've run into a discrepancy between the return types of pdf_header in PdfReader and PdfWriter.
I consider it acceptable to align PdfWriter with PdfReader. What are your opinions?
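
For readers following along, one way calling code could stay tolerant of a return-type difference like this is to normalize the value before using it. The sketch below is illustrative only and does not call pypdf: `normalize_header` and `header_version` are hypothetical helpers, and it assumes the discrepancy is between `str` and `bytes`, which is not stated in this thread.

```python
from typing import Union


def normalize_header(header: Union[str, bytes]) -> str:
    """Return a PDF header such as '%PDF-1.7' as text (hypothetical helper)."""
    if isinstance(header, bytes):
        # Header lines are typically ASCII; latin-1 decoding never raises
        # on unexpected byte values.
        return header.decode("latin-1")
    return header


def header_version(header: Union[str, bytes]) -> str:
    """Extract the version part, e.g. '1.7', from either representation."""
    text = normalize_header(header).strip()
    prefix = "%PDF-"
    if not text.startswith(prefix):
        raise ValueError(f"not a PDF header: {text!r}")
    return text[len(prefix):]
```

With helpers like these, code that compares or parses the header behaves the same whichever type it receives, so aligning the two classes would not require changes on the calling side.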

codecov bot commented Mar 2, 2024

Codecov Report

Attention: Patch coverage is 94.46023%, with 39 lines in your changes missing coverage. Please review.

Project coverage is 94.67%. Comparing base (f8edf3c) to head (7c8c168).

❗ Current head 7c8c168 differs from pull request most recent head cd6d07b. Consider uploading reports for the commit cd6d07b to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| pypdf/_doc_common.py | 94.83% | 12 Missing and 19 partials ⚠️ |
| pypdf/_writer.py | 85.71% | 6 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2497      +/-   ##
==========================================
+ Coverage   94.48%   94.67%   +0.19%     
==========================================
  Files          49       50       +1     
  Lines        8181     8230      +49     
  Branches     1660     1646      -14     
==========================================
+ Hits         7730     7792      +62     
+ Misses        280      268      -12     
+ Partials      171      170       -1     

@stefan6419846
Collaborator

> I consider it acceptable to align PdfWriter with PdfReader. What are your opinions?

How likely do you consider this to be an issue on the user side, that is, how likely is it that this change will break user code? We do not have a real policy for these cases (where a deprecation process is not really feasible) at the moment, and I do not see an easy solution for it, but I do not like knowingly breaking user code without a major release either, especially when we have a deprecation policy.

@pubpub-zz
Collaborator Author

@stefan6419846 / @MartinThoma
I'm still making progress on my development and facing new questions 😉
Although we now have a good test bank, I'm wondering whether we should offer a first beta release. Do you have any idea how we could do that?

@stefan6419846
Collaborator

> Although we now have a good test bank, I'm wondering whether we should offer a first beta release. Do you have any idea how we could do that?

In theory there are pre-releases on GitHub itself and on https://test.pypi.org, but this has been uncommon for pypdf until now, and I suspect that general attention/adoption will be rather limited.

@MartinThoma
Member

MartinThoma commented Mar 9, 2024

I haven't seen test.pypi.org being used.

I did see pre-releases on PyPI itself (e.g. https://pypi.org/project/eth-tester/0.1.0b7 or https://pypi.org/project/pydantic/2.6.0b1/).

However, I'm uncertain if they are worth the effort.

@pubpub-zz pubpub-zz marked this pull request as draft March 23, 2024 10:49
@pubpub-zz pubpub-zz marked this pull request as ready for review March 24, 2024 12:02
@pubpub-zz
Collaborator Author

@stefan6419846
I'm proposing the PR with the current coverage. Improvements can be identified later.

@pubpub-zz
Collaborator Author

@stefan6419846
Are you expecting any further work before merging?

@stefan6419846
Collaborator

@pubpub-zz No, I just did not have enough time this morning to finish incorporating the latest changes from main.

@stefan6419846 stefan6419846 merged commit 24709a3 into py-pdf:main Mar 25, 2024
13 checks passed
stefan6419846 added a commit that referenced this pull request Apr 7, 2024
REL: 4.2.0

## What's new

### New Features (ENH)
- Allow multiple charsets for NameObject.read_from_stream (#2585) by @pubpub-zz
- Add support for /Kids in page labels (#2562) by @stefan6419846
- Allow to update fields on many pages (#2571) by @pubpub-zz
- Tolerate PDF with invalid xref pointed objects (#2335) by @pubpub-zz
- Add Enforce from PDF2.0 in viewer_preferences (#2511) by @pubpub-zz
- Add += and -= operators to ArrayObject (#2510) by @pubpub-zz

### Bug Fixes (BUG)
- Fix merge_page sometimes generating unknown operator 'QQ' (#2588) by @rfotino
- Fix fields update where annotations are kids of field (#2570) by @pubpub-zz
- Process CMYK images without a filter correctly (#2557) by @pubpub-zz
- Extract text in layout mode without finding resources (#2555) by @pubpub-zz
- Prevent recursive loop in some PDF files (#2505) by @pubpub-zz

### Robustness (ROB)
- Tolerate "truncated" xref (#2580) by @pubpub-zz
- Replace error by warning for EOD in RunLengthDecode/ASCIIHexDecode (#2334) by @pubpub-zz
- Rebuild xref table if one entry is invalid (#2528) by @pubpub-zz
- Robustify stream extraction (#2526) by @pubpub-zz

### Documentation (DOC)
- Update release process for latest changes (#2564) by @stefan6419846
- Encryption/decryption: Clone document instead of copying all pages (#2546) by @redfast00
- Minor improvements (#2542) by @j-t-1
- Update annotation list (#2534) by @j-t-1
- Update references and formatting (#2529) by @j-t-1
- Correct threads reference, plus minor changes (#2521) by @j-t-1
- Minor readability increases (#2515) by @j-t-1
- Simplify PaperSize examples (#2504) by @j-t-1
- Minor improvements (#2501) by @j-t-1

### Developer Experience (DEV)
- Remove unused dependencies (#2572) by @stefan6419846
- Remove page labels PR link from message (#2561) by @stefan6419846
- Fix changelog generator regarding whitespace and handling of "Other" group (#2492) by @stefan6419846
- Add REL to known PR prefixes (#2554) by @stefan6419846
- Release using the REL commit instead of git tag (#2500) by @MartinThoma
- Unify code between PdfReader and PdfWriter (#2497) by @pubpub-zz
- Bump softprops/action-gh-release from 1 to 2 (#2514) by @dependabot[bot]

### Maintenance (MAINT)
- Ressources → Resources (and internal name childs) (#2550) by @pubpub-zz
- Fix typos found by codespell (#2549) by @stefan6419846
- Update Read the Docs configuration (#2538) by @j-t-1
- Add root_object, _info and _ID to PdfReader (#2495) by @pubpub-zz

### Testing (TST)
- Allow loading truncated images if required (#2586) by @stefan6419846
- Fix download issues from #2562 (#2578) by @pubpub-zz
- Improve test_get_contents_from_nullobject to show real use-case (#2524) by @stefan6419846
- Add missing test annotations (#2507) by @stefan6419846

[Full Changelog](4.1.0...4.2.0)