Tags · skyeyester/pypdf

5.1.0

REL: 5.1.0

## What's new

### New Features (ENH)
- Add `layout_mode_font_height_weight` argument to `PageObject.extract_text()` (py-pdf#2920) by @hpierre001

### Bug Fixes (BUG)
- Fix font specifier for FreeText annotation (py-pdf#2893) by @ssjkamei
- Line breaks are not generated due to incorrect calculation of text leading (py-pdf#2890) by @ssjkamei
- Improve handling of spaces in text extraction (py-pdf#2882) by @ssjkamei

### Robustness (ROB)
- Soft failure for flate encode image mode 1 with wrong LUT size (py-pdf#2900) by @stefan6419846

### Documentation (DOC)
- Use latest package versions (py-pdf#2907) by @stefan6419846
- Correct example of reading FileAttachment annotation (py-pdf#2906) by @j-t-1

### Developer Experience (DEV)
- Update pinned requirements (py-pdf#2918) by @stefan6419846
- Make make_release.py compatible with Windows environment (py-pdf#2894) by @pubpub-zz

### Maintenance (MAINT)
- Remove references to outdated Python versions (py-pdf#2919) by @stefan6419846
- Generalize the method of obtaining space_code (py-pdf#2891) by @ssjkamei
- Unnecessary character mapping process (py-pdf#2888) by @ssjkamei
- New LZW decoding implementation (py-pdf#2887) by @MartinThoma

### Testing (TST)
- Add LzwCodec for encoding (py-pdf#2883) by @MartinThoma

### Code Style (STY)
- Capitalize error messages (py-pdf#2903) by @j-t-1
- Modify error messages in PdfWriter (py-pdf#2902) by @j-t-1

[Full Changelog](py-pdf/pypdf@5.0.1...5.1.0)

Oct 27, 2024
9f647e6

5.0.1

REL: 5.0.1 (py-pdf#2884)

## Version 5.0.1, 2024-09-29

### New Features (ENH)
- Add `full` parameter to PdfWriter constructor (py-pdf#2865)

### Bug Fixes (BUG)
- Update pyproject.toml with minimum Python version of 3.8 (py-pdf#2859)
- Cope with unbalanced delimiters in dictionary object (py-pdf#2878)
- Cope with encoding with too many differences (py-pdf#2873)
- Missing spaces in extract_text() method (py-pdf#1328) (py-pdf#2868)
- Tolerate truncated files and no warning when jumping startxref (py-pdf#2855)

### Robustness (ROB)
- Repair PDF with invalid Root object (py-pdf#2880)
- Continue parsing dictionary object when error is detected (py-pdf#2872)
- Merge documents with invalid pages in named destinations (py-pdf#2857)
- Tolerate comments in arrays (py-pdf#2856)

### Developer Experience (DEV)
- Use latest Python version for benchmarking (py-pdf#2879)

### Maintenance (MAINT)
- Add tests to source distributions (py-pdf#2874)
- Refactor _update_field_annotation (py-pdf#2862)

[Full Changelog](py-pdf/pypdf@5.0.0...5.0.1)

Sep 29, 2024
ab21802

5.0.0

REL: 5.0.0 (py-pdf#2851)

## Version 5.0.0, 2024-09-15

This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).


### Deprecations (DEP)
- Remove the deprecated PdfMerger and AnnotationBuilder classes and other deprecations cleanup (py-pdf#2813)
- Drop Python 3.7 support (py-pdf#2793)

### New Features (ENH)
- Add capability to remove /Info from PDF (py-pdf#2820)
- Add incremental capability to PdfWriter (py-pdf#2811)
- Add UniGB-UTF16 encodings (py-pdf#2819)
- Accept utf strings for metadata (py-pdf#2802)
- Report PdfReadError instead of RecursionError (py-pdf#2800)
- Compress PDF files merging identical objects (py-pdf#2795)

### Bug Fixes (BUG)
- Fix sheared image (py-pdf#2801)

### Robustness (ROB)
- Robustify .set_data() (py-pdf#2821)
- Raise PdfReadError when missing /Root in trailer (py-pdf#2808)
- Fix extract_text() issues on damaged PDFs (py-pdf#2760)
- Handle images with empty data when processing an image from bytes (py-pdf#2786)

### Developer Experience (DEV)
- Fix coverage uploads (py-pdf#2832)
- Test against Python 3.13 (py-pdf#2776)


[Full Changelog](py-pdf/pypdf@4.3.1...5.0.0)

Sep 17, 2024
637bc44

4.3.1

## Version 4.3.1, 2024-07-21

### Bug Fixes (BUG)
- Cope with Matrix entry in field annotations (py-pdf#2736)

### Robustness (ROB)
- Cope with fields with upside down box/rectangle (py-pdf#2729)

### Maintenance (MAINT)
- Add deprecate_with_replacement to StreamObject.initializeFromD… (py-pdf#2728)
- Deal with cryptography>=43 moving ARC4 (py-pdf#2765)

[Full Changelog](py-pdf/pypdf@4.3.0...4.3.1)

Jul 21, 2024
8f62120

4.3.0

REL: 4.3.0

## What's new

### New Features (ENH)
- Accept ETen-B5 and UniCNS-UTF16 encodings (py-pdf#2721) by @pubpub-zz
- Add decode_as_image() to ContentStreams (py-pdf#2615) by @pubpub-zz
- Add context manager for PdfReader (py-pdf#2666) by @tibor-reiss
- Add capability to set font and size in fields (py-pdf#2636) by @pubpub-zz
- Allow to pass input file without named argument (py-pdf#2576) by @pubpub-zz

### Bug Fixes (BUG)
- Fix deprecation for Ressources when using old constants (py-pdf#2705) by @stefan6419846
- Fix images issue 4 bits encoding and LUT starting with UTF16_BOM (py-pdf#2675) by @pubpub-zz
- Reading large compressed images takes huge time to process (py-pdf#2644) by @snanda85
- Highlighted Text Cannot Be Printed (py-pdf#2604) by @Nifury
- Fix UnboundLocalError on malformed pdf (py-pdf#2619) by @farjasju

### Documentation (DOC)
- Various improvements on docstrings and examples by @j-t-1

### Robustness (ROB)
- Cope with missing Standard 14 fonts in fields (py-pdf#2677) by @pubpub-zz
- Improve inline image extraction (py-pdf#2622) by @pubpub-zz
- Cope with loops in Fields tree (py-pdf#2656) by @pubpub-zz
- Discard /I in choice fields for compatibility with Acrobat (py-pdf#2614) by @pubpub-zz
- Cope with some issues in pillow (py-pdf#2595) by @pubpub-zz
- Cope with some image extraction issues (py-pdf#2591) by @pubpub-zz

### Maintenance (MAINT)
- Deprecate interiour_color with replacement interior_color (py-pdf#2706) by @j-t-1
- Add deprecate_with_replacement to PdfWriter.find_bookmark (py-pdf#2674) by @j-t-1

### Code Style (STY)
- Change Link to be a non-markup annotation (py-pdf#2714) by @j-t-1

[Full Changelog](py-pdf/pypdf@4.2.0...4.3.0)

Jul 14, 2024
d3ef5e5

4.2.0

Version 4.2.0, 2024-04-07

## What's new

### New Features (ENH)
- Allow multiple charsets for NameObject.read_from_stream (py-pdf#2585)
- Add support for /Kids in page labels (py-pdf#2562)
- Allow to update fields on many pages (py-pdf#2571)
- Tolerate PDF with invalid xref pointed objects (py-pdf#2335)
- Add Enforce from PDF2.0 in viewer_preferences (py-pdf#2511)
- Add += and -= operators to ArrayObject (py-pdf#2510)

### Bug Fixes (BUG)
- Fix merge_page sometimes generating unknown operator 'QQ' (py-pdf#2588)
- Fix fields update where annotations are kids of field (py-pdf#2570)
- Process CMYK images without a filter correctly (py-pdf#2557)
- Extract text in layout mode without finding resources (py-pdf#2555)
- Prevent recursive loop in some PDF files (py-pdf#2505)

### Robustness (ROB)
- Tolerate "truncated" xref (py-pdf#2580)
- Replace error by warning for EOD in RunLengthDecode/ASCIIHexDecode (py-pdf#2334)
- Rebuild xref table if one entry is invalid (py-pdf#2528)
- Robustify stream extraction (py-pdf#2526)

### Documentation (DOC)
- Update release process for latest changes (py-pdf#2564)
- Encryption/decryption: Clone document instead of copying all pages (py-pdf#2546)
- Minor improvements (py-pdf#2542)
- Update annotation list (py-pdf#2534)
- Update references and formatting (py-pdf#2529)
- Correct threads reference, plus minor changes (py-pdf#2521)
- Minor readability increases (py-pdf#2515)
- Simplify PaperSize examples (py-pdf#2504)
- Minor improvements (py-pdf#2501)

### Developer Experience (DEV)
- Remove unused dependencies (py-pdf#2572)
- Remove page labels PR link from message (py-pdf#2561)
- Fix changelog generator regarding whitespace and handling of "Other" group (py-pdf#2492)
- Add REL to known PR prefixes (py-pdf#2554)
- Release using the REL commit instead of git tag (py-pdf#2500)
- Unify code between PdfReader and PdfWriter (py-pdf#2497)
- Bump softprops/action-gh-release from 1 to 2 (py-pdf#2514)

### Maintenance (MAINT)
- Ressources → Resources (and internal name childs) (py-pdf#2550)
- Fix typos found by codespell (py-pdf#2549)
- Update Read the Docs configuration (py-pdf#2538)
- Add root_object, _info and _ID to PdfReader (py-pdf#2495)

### Testing (TST)
- Allow loading truncated images if required (py-pdf#2586)
- Fix download issues from py-pdf#2562 (py-pdf#2578)
- Improve test_get_contents_from_nullobject to show real use-case (py-pdf#2524)
- Add missing test annotations (py-pdf#2507)

[Full Changelog](py-pdf/pypdf@4.1.0...4.2.0)

Apr 7, 2024
2ac88e6

4.1.0

Version 4.1.0, 2024-03-03

## What's new

### New Features (ENH)
-  Add get_pages_from_field (py-pdf#2494) by @pubpub-zz
-  Add reattach_fields function (py-pdf#2480) by @pubpub-zz
-  Automatic access to pointed object for IndirectObject (py-pdf#2464) by @pubpub-zz

### Bug Fixes (BUG)
-  Missing error on name without leading / (py-pdf#2387) by @Rak424
-  encode_pdfdocencoding() always returns bytes (py-pdf#2440) by @sbourlon
-  BI in text content identified as image tag (py-pdf#2459) by @pubpub-zz

### Robustness (ROB)
-  Missing basefont entry in type 3 font (py-pdf#2469) by @pubpub-zz

### Documentation (DOC)
-  Amend robustness documentation (py-pdf#2479) by @j-t-1

### Developer Experience (DEV)
-  Fix changelog for UTF-8 characters (py-pdf#2462) by @stefan6419846

### Maintenance (MAINT)
-  Add _get_page_number_from_indirect in writer (py-pdf#2493) by @pubpub-zz
-  Remove user assignment for feature requests (py-pdf#2483) by @stefan6419846
-  Remove reference to old 2.0.0 branch (py-pdf#2482) by @stefan6419846

### Testing (TST)
-  Fix benchmark failures (py-pdf#2481) by @stefan6419846
-  Resolve file naming conflict in test_iss1767 (py-pdf#2445) by @sbourlon

[Full Changelog](py-pdf/pypdf@4.0.2...4.1.0)

Mar 3, 2024
6cf47c5

4.0.2

Version 4.0.2, 2024-02-18

## What's new

### Bug Fixes (BUG)
-  Use NumberObject for /Border elements of annotations (py-pdf#2451) by @rsinger417

### Documentation (DOC)
-  Document easier way to update metadata (py-pdf#2454) by @stefan6419846
-  Typo `Polyline` → `PolyLine` in adding-pdf-annotations.md (py-pdf#2426) by @CWKSC

### Developer Experience (DEV)
-  Bump codecov/codecov-action from 3 to 4 (py-pdf#2430) by @dependabot[bot]

### Testing (TST)
-  Avoid catching not emitted warnings (py-pdf#2429) by @stefan6419846

[Full Changelog](py-pdf/pypdf@4.0.1...4.0.2)

Feb 18, 2024
cc306ad

4.0.1

Version 4.0.1, 2024-01-28

## What's new

### Bug Fixes (BUG)
-  Fix ZeroDivisionError in layout mode text extraction (py-pdf#2417) by @shartzog

### Testing (TST)
-  Skip tests using fpdf2 if it's not installed (py-pdf#2419) by @MartinThoma

[Full Changelog](py-pdf/pypdf@4.0.0...4.0.1)

Jan 28, 2024
7579329
zip
tar.gz

4.0.0

Version 4.0.0, 2024-01-19

## What's new

pypdf==4.0.0 is a big milestone:

* We finally have a layout-mode text extraction.
    This enables users who want to detect / extract tables
    with heuristics to give it a try.
* We deprecated a lot of the old PyPDF2 API that was either
    not following PEP8 naming styles or was not using a
    property. Users coming from PyPDF2 might want to switch
    first to pypdf<4.0.0 to get helpful error messages
    that show the new API in their specific cases.

A big 'Thank you!' to the whole pypdf community for your
work. Thanks to you, pypdf is better than ever.

Kudos to @shartzog who added the layout-mode with his first
contribution!

### Deprecations (DEP)
-  Drop Python 3.6 support (py-pdf#2369) by @MartinThoma
-  Remove deprecated code (py-pdf#2367) by @MartinThoma
-  Remove deprecated XMP properties (py-pdf#2386) by @stefan6419846

### New Features (ENH)
-  Add "layout" mode for text extraction (py-pdf#2388) by @shartzog
-  Add Jupyter Notebook integration for PdfReader (py-pdf#2375) by @MartinThoma
-  Improve/rewrite PDF permission retrieval (py-pdf#2400) by @stefan6419846

### Bug Fixes (BUG)
-  PdfWriter.add_uri was setting the wrong type (py-pdf#2406) by @pmiller66
-  Add support for GBK2K cmaps (py-pdf#2385) by @stefan6419846

### Documentation (DOC)
-  Add pmiller66 for py-pdf#2406 as a contributor by @MartinThoma
-  Add missing expand parameter (py-pdf#2393) by @Atomnp
-  Resolve build warnings (py-pdf#2380) by @stefan6419846
-  Fix testing prerequisites (py-pdf#2381) by @stefan6419846
-  Improve formatting of contributors page (py-pdf#2383) by @stefan6419846
-  Add Tobeabellwether as a contributor for py-pdf#2341 by @MartinThoma

### Developer Experience (DEV)
-  Make dependabot aware of our PR prefixes (py-pdf#2415) by @stefan6419846
-  Fail on Sphinx issues (py-pdf#2405) by @stefan6419846
-  Move title check to own workflow (py-pdf#2384) by @MasterOdin
-  Write to temporary files instead of the working directory (py-pdf#2379) by @stefan6419846
-  Ensure that the PR titles have the correct format (py-pdf#2378) by @stefan6419846
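The title format referred to above can be illustrated with a short check. This is only a sketch and not the project's actual workflow: the prefix list is read off the section headings of these release notes (with `REL`, which the 4.2.0 notes mention as a later addition), and `KNOWN_PREFIXES`, `TITLE_PATTERN`, and `has_valid_prefix` are hypothetical names.

```python
import re

# Prefixes as they appear in the section headings of these release notes.
KNOWN_PREFIXES = (
    "DEP", "ENH", "BUG", "ROB", "DOC", "DEV", "MAINT", "TST", "STY", "REL",
)

# Assumed shape: an upper-case prefix, a colon, a space, then a summary.
TITLE_PATTERN = re.compile(r"^(?P<prefix>[A-Z]+): (?P<summary>\S.*)$")


def has_valid_prefix(title: str) -> bool:
    """Return True if the title starts with a known prefix and a summary."""
    match = TITLE_PATTERN.match(title)
    return match is not None and match.group("prefix") in KNOWN_PREFIXES


checked = {
    title: has_valid_prefix(title)
    for title in ("REL: 5.1.0", "BUG: Fix sheared image", "Fix sheared image")
}
```

Here the first two titles pass and the last one is rejected because it has no prefix.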

### Maintenance (MAINT)
-  Return None instead of -1 when page is not attached (py-pdf#2376) by @MartinThoma
-  Complete FileSpecificationDictionaryEntries constants (py-pdf#2416) by @MartinThoma
-  Replace warning with logging.error (py-pdf#2377) by @MartinThoma

### Testing (TST)
-  Add missing pytest.mark.samples annotations (py-pdf#2412) by @kitterma
-  Correctly close temporary files (py-pdf#2396) by @stefan6419846
-  Fix side effect of py-pdf#2379 (py-pdf#2395) by @pubpub-zz
-  Add test for layout extraction mode (py-pdf#2390) by @MartinThoma

### Code Style (STY)
-  Use the UserAccessPermissions enum (py-pdf#2398) by @MartinThoma
-  Run black (py-pdf#2370) by @MartinThoma

[Full Changelog](py-pdf/pypdf@3.17.4...4.0.0)

Jan 19, 2024
26b9a97
