Date: Tue, 24 Mar 2026 19:17:08 +0000
From: Robert Clausecker <fuz@FreeBSD.org>
To: ports-committers@FreeBSD.org, dev-commits-ports-all@FreeBSD.org, dev-commits-ports-main@FreeBSD.org
Cc: Wade Markham <wadegimpbc@tuta.com>
Subject: git: 02a5aded7e25 - main - textproc/sonic: Make tokenizer features optional via OPTIONS, adopt port
Message-ID: <69c2e334.41043.6b787d0f@gitrepo.freebsd.org>

The branch main has been updated by fuz:

URL: https://cgit.FreeBSD.org/ports/commit/?id=02a5aded7e2587143522858fe321ae1a14a56d9e

commit 02a5aded7e2587143522858fe321ae1a14a56d9e
Author:     Wade Markham <wadegimpbc@tuta.com>
AuthorDate: 2026-03-21 08:16:20 +0000
Commit:     Robert Clausecker <fuz@FreeBSD.org>
CommitDate: 2026-03-24 19:12:37 +0000

    textproc/sonic: Make tokenizer features optional via OPTIONS, adopt port

    This patch makes the Japanese and Chinese word segmentation features
    optional via FreeBSD OPTIONS helpers, and adopts the port.

    Currently the port unconditionally downloads a ~100MB UniDic Japanese
    dictionary (unidic-mecab-2.1.2_src.zip) for every build, regardless of
    whether the user needs Japanese tokenization. Upstream removed
    tokenizer-japanese from default cargo features in v1.4.2 because it
    10x'd the final binary size. This patch brings the port in line with
    upstream's intent.

    Changes:
    - MAINTAINER changed to wadegimpbc@tuta.com
    - Added CHINESE and JAPANESE OPTIONS using OPTIONS helpers
    - OPTIONS_DEFAULT includes CHINESE (matching upstream's default features)
    - UniDic download now conditional on JAPANESE option
    - CARGO_FEATURES uses --no-default-features with allocator-jemalloc
      as base, per cargo.mk convention (lines 23-26, 192, 197-200)
    - added missing zstd dependency

    PR:             293943
---
 textproc/sonic/Makefile | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/textproc/sonic/Makefile b/textproc/sonic/Makefile
index c533ef8857c7..e84acf782907 100644
--- a/textproc/sonic/Makefile
+++ b/textproc/sonic/Makefile
@@ -3,10 +3,7 @@ DISTVERSIONPREFIX=	v
 DISTVERSION=	1.4.9
 PORTREVISION=	16
 CATEGORIES=	textproc
-MASTER_SITES+=	https://clrd.ninjal.ac.jp/unidic_archive/cwj/2.1.2/:unidic
-DISTFILES+=	unidic-mecab-2.1.2_src.zip:unidic # check cargo-crates/lindera-unidic-XXX/build.rs
-
-MAINTAINER=	ports@FreeBSD.org
+MAINTAINER=	wadegimpbc@tuta.com
 COMMENT=	Fast, lightweight, and schema-less search backend
 WWW=		https://github.com/valeriansaliou/sonic
 
@@ -14,6 +11,7 @@ LICENSE=	MPL20
 LICENSE_FILE=	${WRKSRC}/LICENSE.md
 
 BUILD_DEPENDS=	llvm${LLVM_DEFAULT}>0:devel/llvm${LLVM_DEFAULT}
+LIB_DEPENDS=	libzstd.so:archivers/zstd
 
 USES=		cargo compiler:c++11-lang gmake
 USE_GITHUB=	yes
@@ -26,9 +24,17 @@ GROUPS=		sonic
 PLIST_FILES=	bin/sonic \
 		"@sample ${ETCDIR}/config.cfg.sample"
 PORTDOCS=	CONFIGURATION.md PROTOCOL.md README.md
-OPTIONS_DEFINE=	DOCS
+OPTIONS_DEFINE=	CHINESE DOCS JAPANESE
+OPTIONS_DEFAULT=	CHINESE
+CHINESE_DESC=	Chinese word segmentation
+JAPANESE_DESC=	Japanese word segmentation (adds ~100MB UniDic download)
 
 CARGO_ENV+=	DISTDIR=${DISTDIR}
+CARGO_FEATURES=	--no-default-features allocator-jemalloc
+CHINESE_VARS=	CARGO_FEATURES+=tokenizer-chinese
+JAPANESE_VARS=	CARGO_FEATURES+=tokenizer-japanese
+JAPANESE_MASTER_SITES=	https://clrd.ninjal.ac.jp/unidic_archive/cwj/2.1.2/:unidic
+JAPANESE_DISTFILES=	unidic-mecab-2.1.2_src.zip:unidic
 
 post-install:
 	@${MKDIR} ${STAGEDIR}${ETCDIR}
