Commit
refactor: Major refactoring of functions to improve readability, efficiency and follow standard practices. (#139)

* refactor: Reorganize hocr functions
  - Use more jinja templating instead of hardcoding strings
  - Simplified bounding box function
  - Changed parameter name for `_get_hocr_bounding_box` to `page_dimension` for more clarity.
* samples: Added sample for convert to hocr
* refactor: Reordering of classes in page.py
* refactor: Re-added refactoring to remove extra `get_*()` methods in page.py
  - Added in #110 Lost in Merge
* fix: Moved `templates` directory into package.
  - Required for template to work in installed library
* chore: Ran isort and black
* chore: Ran no-implicit-optional
* refactor: Refactored document.py
  - improve readability, follow python conventions, and improve efficiency
  - Also, fixed a previously unknown bug where `Document.search_pages()` returned inaccurate results because it only searched paragraph.text, not page.text
* refactor: Refactor gcs_utilities for readability/pythonic style
* refactor: Refactor page.py to improve efficiency, readability and follow python conventions
* refactor: Rename `Entity.documentai_entity` to `Entity.documentai_object` to match the page.py file
* refactor: Move bounding box extraction to `docai_utilities.py`
* refactor: Major Refactoring of converter_helpers.py to simplify/organize functions, reduce complexity, and increase readability
* fix: Fixed refactor of export_images in document.py
* refactor: Cleanup of blocks.py using `getattr()`
* refactor: Refactoring of bbox_conversion.py to improve readability and efficiency
* fix: Change _get_files() to send full gcs uri to _get_bytes()
  - Also reduce wait_time in tests
* refactor: Move `converter_helpers.py` functions into `converter.py`
  - `converter.py` only had one external facing function that called an internal function with the same parameters.
  - Not sure if there was a specific reason for this setup, can be undone if needed.
* chore(deps): update dependency google-cloud-documentai to v2.16.1 (#138)
* fix: Change _get_files() to send full gcs uri to _get_bytes()
  - Also reduce wait_time in tests
* refactor: Move `converter_helpers.py` functions into `converter.py`
  - `converter.py` only had one external facing function that called an internal function with the same parameters.
  - Not sure if there was a specific reason for this setup, can be undone if needed.
* chore: Reran black formatting after merge conflict
* refactor: Minor refactoring of test_bbox_conversion.py to improve readability
* refactor: Changed blocks.py to block.py for consistency.
  - Changed how `Block` is initialized.
  - Changed `load_blocks_from_schema` into a `@classmethod` to simplify imports.
* fix: Added Missing type annotations to `document.py`
* fix: Add new filename for block.py into test_bbox_conversion.py
* 🦉 Updates from OwlBot post-processor
  See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md
* fix: Fix failing tests for Block class. Changed all fields to have types
* fix: Changed `converter._get_bytes` to return a Tuple
* chore: Addressed Code Review Comments
  - Removed FILES_TO_IGNORE
  - Simplification of logic in `_get_multiplier` `convert_bbox_to_docproto_bbox`
  - Addressed other lint errors
  - Adjusted function names to indicate not protected members.
* fix: Remove extra reference to metadata_blob
* fix: Change expected test output and remove references to `geometry`

---------

Co-authored-by: Mend Renovate <bot@renovateapp.com>
Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
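
The `Document.search_pages()` fix noted in the message above (searching `page.text` instead of only `paragraph.text`) can be illustrated with a small sketch. The commit message does not say why the paragraph-only search was inaccurate; one plausible cause, assumed here, is a target phrase that spans two paragraphs. The `Paragraph` and `Page` classes and both search functions below are hypothetical stand-ins, not the library's actual classes or signatures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Paragraph:
    # Hypothetical stand-in for a wrapped paragraph.
    text: str


@dataclass
class Page:
    # Hypothetical stand-in for a wrapped page: the full page text
    # plus the paragraphs detected on that page.
    text: str
    paragraphs: List[Paragraph] = field(default_factory=list)


def search_pages_by_paragraph(pages: List[Page], target: str) -> List[Page]:
    """Assumed old behavior: a page matches only if one paragraph contains the target."""
    return [
        page
        for page in pages
        if any(target in paragraph.text for paragraph in page.paragraphs)
    ]


def search_pages_by_page_text(pages: List[Page], target: str) -> List[Page]:
    """Assumed fixed behavior: a page matches if its full text contains the target."""
    return [page for page in pages if target in page.text]


pages = [
    Page(
        text="Invoice total: 42 USD",
        paragraphs=[Paragraph("Invoice total:"), Paragraph("42 USD")],
    ),
    Page(text="Thank you", paragraphs=[Paragraph("Thank you")]),
]

# The phrase straddles two paragraphs on the first page, so only the
# page-level search finds it.
missed = search_pages_by_paragraph(pages, "total: 42")
found = search_pages_by_page_text(pages, "total: 42")
```

In this sketch the paragraph-level search returns no pages while the page-level search returns the first page, which is the kind of discrepancy the commit message describes.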
1 parent fcf5dbd, commit 82ac823
Showing 22 changed files with 1,596 additions and 1,770 deletions.