From nobody Mon Nov 17 07:27:11 2025
Date: Mon, 17 Nov 2025 07:27:11 GMT
Message-Id: <202511170727.5AH7RBIu073335@gitrepo.freebsd.org>
To: ports-committers@FreeBSD.org, dev-commits-ports-all@FreeBSD.org, dev-commits-ports-main@FreeBSD.org
From: Kai Knoblich
Subject: git: 103146dda2ac - main - textproc/py-ocrmypdf: Update to 16.11.1
List-Id: Commits to the main branch of the FreeBSD ports repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-ports-main
X-BeenThere: dev-commits-ports-main@freebsd.org
Sender:
 owner-dev-commits-ports-main@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: kai
X-Git-Repository: ports
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 103146dda2acf8f8d0882baedccfd0124b5be6e1
Auto-Submitted: auto-generated

The branch main has been updated by kai:

URL: https://cgit.FreeBSD.org/ports/commit/?id=103146dda2acf8f8d0882baedccfd0124b5be6e1

commit 103146dda2acf8f8d0882baedccfd0124b5be6e1
Author:     Kai Knoblich
AuthorDate: 2025-11-17 07:21:22 +0000
Commit:     Kai Knoblich
CommitDate: 2025-11-17 07:25:03 +0000

    textproc/py-ocrmypdf: Update to 16.11.1

    Backport a workaround for JPEG encoding issues with Ghostscript 10.6.0.

    There's already release 16.12.0, but it requires py-pikepdf 10.0.1 as a
    minimum which isn't present in the ports tree, yet.

    Changelog:
    https://github.com/ocrmypdf/OCRmyPDF/blob/v16.11.1/docs/release_notes.md

    MFH:            2025Q4
---
 textproc/py-ocrmypdf/Makefile                      |  5 +-
 textproc/py-ocrmypdf/distinfo                      |  6 +-
 .../files/patch-src_ocrmypdf_optimize.py           | 66 ++++++++++++++++++++++
 3 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/textproc/py-ocrmypdf/Makefile b/textproc/py-ocrmypdf/Makefile
index b62e369362ec..157d71bad57d 100644
--- a/textproc/py-ocrmypdf/Makefile
+++ b/textproc/py-ocrmypdf/Makefile
@@ -1,5 +1,5 @@
 PORTNAME=	ocrmypdf
-DISTVERSION=	16.11.0
+DISTVERSION=	16.11.1
 CATEGORIES=	textproc python
 MASTER_SITES=	PYPI
 PKGNAMEPREFIX=	${PYTHON_PKGNAMEPREFIX}
@@ -31,9 +31,10 @@ TEST_DEPENDS=	${PYTHON_PKGNAMEPREFIX}hypothesis>=6.36.0:devel/py-hypothesis@${PY
 USES=		ghostscript:run python shebangfix
 USE_PYTHON=	autoplist concurrent pep517 pytest
 
 # Skip some checks as they yield wrong results if run with the root account
+# "test_watcher" requires additional deps used by the "watcher" feature
 PYTEST_IGNORED_TESTS=	test_chmod \
 			test_input_file_not_readable \
-			test_malformed_docinfo # leads to an internal pytest error
+			test_watcher
 
 SHEBANG_FILES=	src/ocrmypdf/__main__.py \
 		src/ocrmypdf/pdfinfo/__init__.py
diff --git a/textproc/py-ocrmypdf/distinfo b/textproc/py-ocrmypdf/distinfo
index e20d42f98e01..582ec949cdca 100644
--- a/textproc/py-ocrmypdf/distinfo
+++ b/textproc/py-ocrmypdf/distinfo
@@ -1,3 +1,3 @@
-TIMESTAMP = 1757764047
-SHA256 (ocrmypdf-16.11.0.tar.gz) = d89077e503238dac35c6e565925edc8d98b71e5289853c02cacbc1d0901f1be7
-SIZE (ocrmypdf-16.11.0.tar.gz) = 7015068
+TIMESTAMP = 1763048154
+SHA256 (ocrmypdf-16.11.1.tar.gz) = 838ab69e0ee0f04feea0d5861a17badecab6d3beaed0e29a97058eadda58cbb1
+SIZE (ocrmypdf-16.11.1.tar.gz) = 7015278
diff --git a/textproc/py-ocrmypdf/files/patch-src_ocrmypdf_optimize.py b/textproc/py-ocrmypdf/files/patch-src_ocrmypdf_optimize.py
new file mode 100644
index 000000000000..34e6453d57df
--- /dev/null
+++ b/textproc/py-ocrmypdf/files/patch-src_ocrmypdf_optimize.py
@@ -0,0 +1,66 @@
+From: "James R. Barlow"
+Date: Sun, 9 Nov 2025 15:43:36 -0800
+Subject: [PATCH] Work around Ghostscript 10.6.0 JPEG encoding issue by forcing
+ optimization.
+
+Not an ideal fix, but it improves an issue affecting numerous users.
+
+Fixes 1585.
+
+Obtained from:
+
+https://github.com/ocrmypdf/OCRmyPDF/commit/f4c6c8121ba8178ff3a1cb8f70037bbc3a31391b.patch
+
+--- src/ocrmypdf/optimize.py.orig	2020-02-02 00:00:00 UTC
++++ src/ocrmypdf/optimize.py
+@@ -17,6 +17,7 @@ import img2pdf
+ from zlib import compress
+ 
+ import img2pdf
++from packaging.version import Version
+ from pikepdf import (
+     Dictionary,
+     Name,
+@@ -32,7 +33,7 @@ from ocrmypdf._concurrent import Executor, SerialExecu
+ from PIL import Image
+ 
+ from ocrmypdf._concurrent import Executor, SerialExecutor
+-from ocrmypdf._exec import jbig2enc, pngquant
++from ocrmypdf._exec import ghostscript, jbig2enc, pngquant
+ from ocrmypdf._jobcontext import PdfContext
+ from ocrmypdf._progressbar import ProgressBar
+ from ocrmypdf.exceptions import OutputFileAccessError
+@@ -189,6 +190,16 @@ def extract_image_jbig2(
+     return None
+ 
+ 
++def _should_optimize_jpeg(options, filtdp):
++    if options.optimize >= 2:
++        return True
++    if options.optimize < 2 and ghostscript.version() >= Version('10.6.0'):
++        # Ghostscript 10.6.0+ introduced some sort of JPEG encoding issue.
++        # To resolve this, re-optimize the JPEG anyway.
++        return True
++    return False
++
++
+ def extract_image_generic(
+     *, pdf: Pdf, root: Path, image: Stream, xref: Xref, options
+ ) -> XrefExt | None:
+@@ -202,15 +213,7 @@ def extract_image_generic(
+     if pim.bits_per_component == 1:
+         return None
+ 
+-    if filtdp[0] == Name.DCTDecode and options.optimize >= 2:
+-        # This is a simple heuristic derived from some training data, that has
+-        # about a 70% chance of guessing whether the JPEG is high quality,
+-        # and possibly recompressible, or not. The number itself doesn't mean
+-        # anything.
+-        # bytes_per_pixel = int(raw_jpeg.Length) / (w * h)
+-        # jpeg_quality_estimate = 117.0 * (bytes_per_pixel ** 0.213)
+-        # if jpeg_quality_estimate < 65:
+-        #     return None
++    if filtdp[0] == Name.DCTDecode and _should_optimize_jpeg(options, filtdp):
+         try:
+             imgname = root / f'{xref:08d}'
+             with imgname.open('wb') as f: