Date: Fri, 20 Dec 2024 02:09:50 GMT
From: Wen Heping <wen@FreeBSD.org>
To: ports-committers@FreeBSD.org, dev-commits-ports-all@FreeBSD.org, dev-commits-ports-main@FreeBSD.org
Subject: git: 89d55115a4c0 - main - converters/py-markitdown: New port
Message-ID: <202412200209.4BK29oeC092787@gitrepo.freebsd.org>
The branch main has been updated by wen:

URL: https://cgit.FreeBSD.org/ports/commit/?id=89d55115a4c0f52bbeb08ea5f5899d6e6b62fa1b

commit 89d55115a4c0f52bbeb08ea5f5899d6e6b62fa1b
Author:     Wen Heping <wen@FreeBSD.org>
AuthorDate: 2024-12-20 02:01:25 +0000
Commit:     Wen Heping <wen@FreeBSD.org>
CommitDate: 2024-12-20 02:09:15 +0000

    converters/py-markitdown: New port

    MarkItDown library is a utility tool for converting various files to Markdown
    (e.g., for indexing, text analysis, etc.)

    It presently supports:
     *PDF (.pdf)
     *PowerPoint (.pptx)
     *Word (.docx)
     *Excel (.xlsx)
     *Images (EXIF metadata, and OCR)
     *Audio (EXIF metadata, and speech transcription)
     *HTML (special handling of Wikipedia, etc.)
     *Various other text-based formats (csv, json, xml, etc.)
     *ZIP (Iterates over contents and converts each file)
---
 converters/Makefile                |  1 +
 converters/py-markitdown/Makefile  | 27 +++++++++++++++++++++++++++
 converters/py-markitdown/distinfo  |  3 +++
 converters/py-markitdown/pkg-descr | 13 +++++++++++++
 4 files changed, 44 insertions(+)

diff --git a/converters/Makefile b/converters/Makefile
index d963b78583d0..645b3b83065f 100644
--- a/converters/Makefile
+++ b/converters/Makefile
@@ -153,6 +153,7 @@
  SUBDIR += py-bsdconv
  SUBDIR += py-gotenberg-client
  SUBDIR += py-mammoth
+ SUBDIR += py-markitdown
  SUBDIR += py-rencode
  SUBDIR += py-svglib
  SUBDIR += py-text-unidecode
diff --git a/converters/py-markitdown/Makefile b/converters/py-markitdown/Makefile
new file mode 100644
index 000000000000..a9ce7a689d57
--- /dev/null
+++ b/converters/py-markitdown/Makefile
@@ -0,0 +1,27 @@
+PORTNAME= markitdown
+DISTVERSION= 0.0.1a3
+CATEGORIES= converters python
+MASTER_SITES= PYPI
+PKGNAMEPREFIX= ${PYTHON_PKGNAMEPREFIX}
+
+MAINTAINER= wen@FreeBSD.org
+COMMENT= Utility tool for converting various files to Markdown
+WWW= https://pypi.org/project/tlv8/
+
+LICENSE= APACHE20
+
+BUILD_DEPENDS= ${PYTHON_PKGNAMEPREFIX}hatchling>=0:devel/py-hatchling@${PY_FLAVOR}
+RUN_DEPENDS= ${PYTHON_PKGNAMEPREFIX}mammoth>=0:converters/py-mammoth@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}markdownify>=0:textproc/py-markdownify@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}pandas>=0:math/py-pandas@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}pdfminer.six>=0:textproc/py-pdfminer.six@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}python-pptx>=0:textproc/py-python-pptx@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}puremagic>=0:sysutils/py-puremagic@${PY_FLAVOR} \
+ ${PYTHON_PKGNAMEPREFIX}requests>=0:www/py-requests@${PY_FLAVOR}
+
+USES= python
+USE_PYTHON= autoplist pep517
+
+NO_ARCH= yes
+
+.include <bsd.port.mk>
diff --git a/converters/py-markitdown/distinfo b/converters/py-markitdown/distinfo
new file mode 100644
index 000000000000..a69065a058ef
--- /dev/null
+++ b/converters/py-markitdown/distinfo
@@ -0,0 +1,3 @@
+TIMESTAMP = 1734654122
+SHA256 (markitdown-0.0.1a3.tar.gz) = f6c8f5f7f5541e91c6c535218318968fefd71e2a6faa0eb782b3492e04cd023d
+SIZE (markitdown-0.0.1a3.tar.gz) = 16073
diff --git a/converters/py-markitdown/pkg-descr b/converters/py-markitdown/pkg-descr
new file mode 100644
index 000000000000..8871cf0e5603
--- /dev/null
+++ b/converters/py-markitdown/pkg-descr
@@ -0,0 +1,13 @@
+MarkItDown library is a utility tool for converting various files to Markdown
+(e.g., for indexing, text analysis, etc.)
+
+It presently supports:
+ *PDF (.pdf)
+ *PowerPoint (.pptx)
+ *Word (.docx)
+ *Excel (.xlsx)
+ *Images (EXIF metadata, and OCR)
+ *Audio (EXIF metadata, and speech transcription)
+ *HTML (special handling of Wikipedia, etc.)
+ *Various other text-based formats (csv, json, xml, etc.)
+ *ZIP (Iterates over contents and converts each file)
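
The distinfo file in this commit records a SHA256 digest and a byte size for markitdown-0.0.1a3.tar.gz, and a fetched distfile is expected to match both. As a rough illustration of that check only, here is a minimal Python sketch. The helper names `parse_distinfo` and `verify_distfile` are hypothetical and are not part of the ports framework, which performs this verification itself.

```python
import hashlib
import re
from pathlib import Path

# Hypothetical helpers for illustration; not the ports framework's own code.
# Matches lines such as "SHA256 (name.tar.gz) = <hex>" and "SIZE (name.tar.gz) = <n>".
_ENTRY = re.compile(r"^(SHA256|SIZE) \((?P<name>[^)]+)\) = (?P<value>\S+)$")


def parse_distinfo(text):
    """Return {filename: {"SHA256": hex_digest, "SIZE": byte_count}}."""
    entries = {}
    for line in text.splitlines():
        m = _ENTRY.match(line.strip())
        if m is None:
            continue  # TIMESTAMP lines and blank lines are skipped
        key = m.group(1)
        value = m.group("value")
        record = entries.setdefault(m.group("name"), {})
        record[key] = int(value) if key == "SIZE" else value
    return entries


def verify_distfile(path, expected):
    """Return True if the file's size and SHA-256 digest match one entry."""
    data = Path(path).read_bytes()
    return (
        len(data) == expected["SIZE"]
        and hashlib.sha256(data).hexdigest() == expected["SHA256"]
    )
```

Given the three distinfo lines from the diff as a string, `parse_distinfo` yields one entry keyed by `markitdown-0.0.1a3.tar.gz`, which can then be passed to `verify_distfile` together with the path of a locally downloaded copy.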