Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add capability to remove /Info from pypdf #2820

Merged
merged 50 commits into from
Sep 14, 2024

Conversation

pubpub-zz
Copy link
Collaborator

to be merged after #2811

Copy link

codecov bot commented Aug 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.99%. Comparing base (c4e95bd) to head (be48872).
Report is 95 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2820      +/-   ##
==========================================
+ Coverage   95.94%   95.99%   +0.04%     
==========================================
  Files          51       51              
  Lines        8758     8780      +22     
  Branches     1748     1753       +5     
==========================================
+ Hits         8403     8428      +25     
  Misses        209      209              
+ Partials      146      143       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pubpub-zz
Copy link
Collaborator Author

This PR is ready for review (after #2811)
The uncoverd line will be addressed in another PR

@stefan6419846 stefan6419846 merged commit a790532 into py-pdf:main Sep 14, 2024
13 checks passed
@pubpub-zz pubpub-zz mentioned this pull request Sep 15, 2024
pubpub-zz added a commit that referenced this pull request Sep 17, 2024
## Version 5.0.0, 2024-09-15

This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).


### Deprecations (DEP)
- Remove the deprecated PfdMerger and AnnotationBuilder classes and other deprecations cleanup (#2813)
- Drop Python 3.7 support (#2793)

### New Features (ENH)
- Add capability to remove /Info from PDF (#2820)
- Add incremental capability to PdfWriter (#2811)
- Add UniGB-UTF16 encodings (#2819)
- Accept utf strings for metadata (#2802)
- Report PdfReadError instead of RecursionError (#2800)
- Compress PDF files merging identical objects (#2795)

### Bug Fixes (BUG)
- Fix sheared image (#2801)

### Robustness (ROB)
- Robustify .set_data() (#2821)
- Raise PdfReadError when missing /Root in trailer (#2808)
- Fix extract_text() issues on damaged PDFs (#2760)
- Handle images with empty data when processing an image from bytes (#2786)

### Developer Experience (DEV)
- Fix coverage uploads (#2832)
- Test against Python 3.13 (#2776)


[Full Changelog](4.3.1...5.0.0)
xmo-odoo added a commit to odoo-dev/odoo that referenced this pull request Jan 9, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes odoo#185673

Co-Authored-By: Junqi <juwu@odoo.com>
robodoo pushed a commit to odoo/odoo that referenced this pull request Jan 9, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes #185673

closes #193086

Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
fw-bot pushed a commit to odoo-dev/odoo that referenced this pull request Jan 9, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes odoo#185673

X-original-commit: 1fb35ad
Co-authored-by: Junqi <juwu@odoo.com>
fw-bot pushed a commit to odoo-dev/odoo that referenced this pull request Jan 9, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes odoo#185673

X-original-commit: 1fb35ad
Co-authored-by: Junqi <juwu@odoo.com>
fw-bot pushed a commit to odoo-dev/odoo that referenced this pull request Jan 9, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes odoo#185673

X-original-commit: 1fb35ad
Co-authored-by: Junqi <juwu@odoo.com>
fw-bot pushed a commit to odoo-dev/odoo that referenced this pull request Jan 9, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes odoo#185673

X-original-commit: 1fb35ad
Co-authored-by: Junqi <juwu@odoo.com>
fw-bot pushed a commit to odoo-dev/odoo that referenced this pull request Jan 9, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes odoo#185673

X-original-commit: 1fb35ad
Co-authored-by: Junqi <juwu@odoo.com>
robodoo pushed a commit to odoo/odoo that referenced this pull request Jan 10, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes #185673

closes #193129

X-original-commit: 1fb35ad
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
robodoo pushed a commit to odoo/odoo that referenced this pull request Jan 10, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes #185673

closes #193155

X-original-commit: 1fb35ad
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
robodoo pushed a commit to odoo/odoo that referenced this pull request Jan 10, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes #185673

closes #193158

X-original-commit: 1fb35ad
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
robodoo pushed a commit to odoo/odoo that referenced this pull request Jan 10, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes #185673

closes #193159

X-original-commit: 1fb35ad
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
robodoo pushed a commit to odoo/odoo that referenced this pull request Jan 10, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes #185673

closes #193143

X-original-commit: 1fb35ad
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
robodoo pushed a commit to odoo/odoo that referenced this pull request Jan 10, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes #185673

closes #193155

X-original-commit: 1fb35ad
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
gamarino pushed a commit to numaes/numa-public-odoo that referenced this pull request Jan 24, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes #185673

closes odoo/odoo#193155

X-original-commit: 1fb35add8090b5fc9e706aad67f6bb38432273e0
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
BurkhalterY pushed a commit to Burkhalter-IT/odoo that referenced this pull request Jan 28, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes odoo#185673

closes odoo#193086

Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
StephaneMangin pushed a commit to camptocamp/odoo that referenced this pull request Feb 6, 2025
Even if 5.1 is not technically supported, debuntu might switch to it
any moment and we don't know if / when upstream will fix it, so it's
not worth the time bomb.

While technically it was always typed such, since py-pdf/pypdf#2820 it
looks like `_info` is a lot more likely to be `None` as
e.g. `clone_reader_document_root` now starts with unsetting
`_info_obj` and never re-sets it.

Except `add_metadata` was not updated to handle this case, likely
because mypy interprets `assert isinstance(self._info,
DictionaryObject)` as a type narrowing and trusts the developer, thus
does not report the type mismatch... and the assertion ends up blowing
in the user's face at runtime with a simple

    >>> r = pypdf.PdfReader(some_pdf_document)
    >>> w = pypdf.PdfWriter()
    >>> w.clone_reader_document_root(r)
    >>> w.add_metadata({"/foo": "bar"})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "lib/python3.12/site-packages/pypdf/_writer.py", line 1622, in add_metadata
        assert isinstance(self._info, DictionaryObject)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

This is rather inconsiderate, so monkeypatch `add_metadata` to handle
the case where `_info` exists and is `None`.

Most of the credit goes to juwu for uncovering the issue.

opw-4372052
opw-4426881
Fixes odoo#185673

closes odoo#193155

X-original-commit: 1fb35ad
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
Co-authored-by: Junqi <juwu@odoo.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants