From: Conrad Meyer
Date: Mon, 22 Oct 2018 18:29:12 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r339606 - in head: lib/libzstd sys/conf sys/contrib/zstd sys/contrib/zstd/contrib/gen_html sys/contrib/zstd/contrib/meson sys/contrib/zstd/contrib/pzstd sys/contrib/zstd/contrib/seekabl...
Author: cem
Date: Mon Oct 22 18:29:12 2018
New Revision: 339606
URL: https://svnweb.freebsd.org/changeset/base/339606

Log:
  Update to Zstandard 1.3.7

  Relnotes: yes
  Sponsored by: Dell EMC Isilon

Added: head/sys/contrib/zstd/doc/images/cdict_v136.png (contents, props changed) head/sys/contrib/zstd/doc/images/zstd_cdict_v1_3_5.png (contents, props changed) head/sys/contrib/zstd/lib/common/debug.c (contents, props changed) head/sys/contrib/zstd/lib/common/debug.h (contents, props changed) head/sys/contrib/zstd/lib/compress/hist.c (contents, props changed) head/sys/contrib/zstd/lib/compress/hist.h (contents, props changed) head/sys/contrib/zstd/lib/dictBuilder/cover.h (contents, props changed) head/sys/contrib/zstd/lib/dictBuilder/fastcover.c (contents, props changed) head/sys/contrib/zstd/programs/zstdgrep.1 (contents, props changed) head/sys/contrib/zstd/programs/zstdgrep.1.md head/sys/contrib/zstd/programs/zstdless.1 (contents, props changed) head/sys/contrib/zstd/programs/zstdless.1.md head/sys/contrib/zstd/tests/libzstd_partial_builds.sh (contents, props changed) head/sys/contrib/zstd/tests/rateLimiter.py (contents, props changed)
Deleted: head/sys/contrib/zstd/circle.yml head/sys/contrib/zstd/tests/namespaceTest.c
Modified: head/lib/libzstd/Makefile head/sys/conf/files 
head/sys/conf/files.sparc64 head/sys/contrib/zstd/.gitattributes head/sys/contrib/zstd/Makefile head/sys/contrib/zstd/NEWS head/sys/contrib/zstd/README.md head/sys/contrib/zstd/TESTING.md head/sys/contrib/zstd/appveyor.yml head/sys/contrib/zstd/contrib/gen_html/Makefile head/sys/contrib/zstd/contrib/meson/meson.build head/sys/contrib/zstd/contrib/pzstd/Makefile head/sys/contrib/zstd/contrib/pzstd/Options.cpp head/sys/contrib/zstd/contrib/pzstd/Pzstd.cpp head/sys/contrib/zstd/contrib/seekable_format/examples/Makefile head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_compression.c head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_decompression.c head/sys/contrib/zstd/contrib/seekable_format/zstd_seekable.h head/sys/contrib/zstd/contrib/seekable_format/zstdseek_decompress.c head/sys/contrib/zstd/doc/zstd_compression_format.md head/sys/contrib/zstd/doc/zstd_manual.html head/sys/contrib/zstd/lib/BUCK head/sys/contrib/zstd/lib/Makefile head/sys/contrib/zstd/lib/README.md head/sys/contrib/zstd/lib/common/bitstream.h head/sys/contrib/zstd/lib/common/compiler.h head/sys/contrib/zstd/lib/common/cpu.h head/sys/contrib/zstd/lib/common/entropy_common.c head/sys/contrib/zstd/lib/common/fse.h head/sys/contrib/zstd/lib/common/fse_decompress.c head/sys/contrib/zstd/lib/common/huf.h head/sys/contrib/zstd/lib/common/mem.h head/sys/contrib/zstd/lib/common/pool.c head/sys/contrib/zstd/lib/common/pool.h head/sys/contrib/zstd/lib/common/xxhash.c head/sys/contrib/zstd/lib/common/zstd_common.c head/sys/contrib/zstd/lib/common/zstd_internal.h head/sys/contrib/zstd/lib/compress/fse_compress.c head/sys/contrib/zstd/lib/compress/huf_compress.c head/sys/contrib/zstd/lib/compress/zstd_compress.c head/sys/contrib/zstd/lib/compress/zstd_compress_internal.h head/sys/contrib/zstd/lib/compress/zstd_double_fast.c head/sys/contrib/zstd/lib/compress/zstd_double_fast.h head/sys/contrib/zstd/lib/compress/zstd_fast.c head/sys/contrib/zstd/lib/compress/zstd_fast.h 
head/sys/contrib/zstd/lib/compress/zstd_lazy.c head/sys/contrib/zstd/lib/compress/zstd_lazy.h head/sys/contrib/zstd/lib/compress/zstd_ldm.c head/sys/contrib/zstd/lib/compress/zstd_ldm.h head/sys/contrib/zstd/lib/compress/zstd_opt.c head/sys/contrib/zstd/lib/compress/zstd_opt.h head/sys/contrib/zstd/lib/compress/zstdmt_compress.c head/sys/contrib/zstd/lib/compress/zstdmt_compress.h head/sys/contrib/zstd/lib/decompress/huf_decompress.c head/sys/contrib/zstd/lib/decompress/zstd_decompress.c head/sys/contrib/zstd/lib/dictBuilder/cover.c head/sys/contrib/zstd/lib/dictBuilder/divsufsort.c head/sys/contrib/zstd/lib/dictBuilder/zdict.c head/sys/contrib/zstd/lib/dictBuilder/zdict.h head/sys/contrib/zstd/lib/freebsd/zstd_kmalloc.c head/sys/contrib/zstd/lib/legacy/zstd_v01.c head/sys/contrib/zstd/lib/legacy/zstd_v02.c head/sys/contrib/zstd/lib/legacy/zstd_v03.c head/sys/contrib/zstd/lib/legacy/zstd_v04.c head/sys/contrib/zstd/lib/legacy/zstd_v05.c head/sys/contrib/zstd/lib/legacy/zstd_v06.c head/sys/contrib/zstd/lib/legacy/zstd_v07.c head/sys/contrib/zstd/lib/zstd.h head/sys/contrib/zstd/programs/Makefile head/sys/contrib/zstd/programs/README.md head/sys/contrib/zstd/programs/bench.c head/sys/contrib/zstd/programs/bench.h head/sys/contrib/zstd/programs/datagen.c head/sys/contrib/zstd/programs/dibio.c head/sys/contrib/zstd/programs/dibio.h head/sys/contrib/zstd/programs/fileio.c head/sys/contrib/zstd/programs/fileio.h head/sys/contrib/zstd/programs/platform.h head/sys/contrib/zstd/programs/util.h head/sys/contrib/zstd/programs/zstd.1 head/sys/contrib/zstd/programs/zstd.1.md head/sys/contrib/zstd/programs/zstdcli.c head/sys/contrib/zstd/tests/.gitignore head/sys/contrib/zstd/tests/Makefile head/sys/contrib/zstd/tests/README.md head/sys/contrib/zstd/tests/decodecorpus.c head/sys/contrib/zstd/tests/fullbench.c head/sys/contrib/zstd/tests/fuzz/fuzz.h head/sys/contrib/zstd/tests/fuzz/fuzz.py head/sys/contrib/zstd/tests/fuzz/regression_driver.c 
head/sys/contrib/zstd/tests/fuzz/zstd_helpers.c head/sys/contrib/zstd/tests/fuzzer.c head/sys/contrib/zstd/tests/gzip/Makefile head/sys/contrib/zstd/tests/legacy.c head/sys/contrib/zstd/tests/longmatch.c head/sys/contrib/zstd/tests/paramgrill.c head/sys/contrib/zstd/tests/playTests.sh head/sys/contrib/zstd/tests/poolTests.c head/sys/contrib/zstd/tests/roundTripCrash.c head/sys/contrib/zstd/tests/symbols.c head/sys/contrib/zstd/tests/test-zstd-versions.py head/sys/contrib/zstd/tests/zstreamtest.c head/sys/contrib/zstd/zlibWrapper/examples/minigzip.c head/sys/contrib/zstd/zlibWrapper/examples/zwrapbench.c head/sys/contrib/zstd/zlibWrapper/gzguts.h head/sys/contrib/zstd/zlibWrapper/gzlib.c head/sys/contrib/zstd/zlibWrapper/gzwrite.c Modified: head/lib/libzstd/Makefile ============================================================================== --- head/lib/libzstd/Makefile Mon Oct 22 17:42:57 2018 (r339605) +++ head/lib/libzstd/Makefile Mon Oct 22 18:29:12 2018 (r339606) @@ -24,7 +24,10 @@ SRCS= entropy_common.c \ zstd_lazy.c \ zstd_ldm.c \ zstd_opt.c \ - zstd_double_fast.c + zstd_double_fast.c \ + debug.c \ + hist.c \ + fastcover.c WARNS= 2 INCS= zstd.h CFLAGS+= -I${ZSTDDIR}/lib -I${ZSTDDIR}/lib/common -DXXH_NAMESPACE=ZSTD_ \ Modified: head/sys/conf/files ============================================================================== --- head/sys/conf/files Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/conf/files Mon Oct 22 18:29:12 2018 (r339606) @@ -645,6 +645,7 @@ contrib/zstd/lib/common/error_private.c optional zstd contrib/zstd/lib/common/xxhash.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_compress.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/fse_compress.c optional zstdio compile-with ${ZSTD_C} +contrib/zstd/lib/compress/hist.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/huf_compress.c optional zstdio compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_double_fast.c optional zstdio 
compile-with ${ZSTD_C} contrib/zstd/lib/compress/zstd_fast.c optional zstdio compile-with ${ZSTD_C} Modified: head/sys/conf/files.sparc64 ============================================================================== --- head/sys/conf/files.sparc64 Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/conf/files.sparc64 Mon Oct 22 18:29:12 2018 (r339606) @@ -149,3 +149,6 @@ sparc64/sparc64/uio_machdep.c standard sparc64/sparc64/upa.c optional creator sparc64/sparc64/vm_machdep.c standard sparc64/sparc64/zeus.c standard + +# Zstd +contrib/zstd/lib/freebsd/zstd_kfreebsd.c optional zstdio compile-with ${ZSTD_C} Modified: head/sys/contrib/zstd/.gitattributes ============================================================================== --- head/sys/contrib/zstd/.gitattributes Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/.gitattributes Mon Oct 22 18:29:12 2018 (r339606) @@ -19,6 +19,3 @@ # Windows *.bat text eol=crlf *.cmd text eol=crlf - -# .travis.yml merging -.travis.yml merge=ours Modified: head/sys/contrib/zstd/Makefile ============================================================================== --- head/sys/contrib/zstd/Makefile Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/Makefile Mon Oct 22 18:29:12 2018 (r339606) @@ -23,20 +23,19 @@ else EXT = endif +## default: Build lib-release and zstd-release .PHONY: default default: lib-release zstd-release .PHONY: all -all: | allmost examples manual contrib +all: allmost examples manual contrib .PHONY: allmost -allmost: allzstd - $(MAKE) -C $(ZWRAPDIR) all +allmost: allzstd zlibwrapper -#skip zwrapper, can't build that on alternate architectures without the proper zlib installed +# skip zwrapper, can't build that on alternate architectures without the proper zlib installed .PHONY: allzstd -allzstd: - $(MAKE) -C $(ZSTDDIR) all +allzstd: lib $(MAKE) -C $(PRGDIR) all $(MAKE) -C $(TESTDIR) all @@ -45,58 +44,62 @@ all32: $(MAKE) -C $(PRGDIR) zstd32 $(MAKE) -C $(TESTDIR) all32 -.PHONY: lib -lib: 
+.PHONY: lib lib-release libzstd.a +lib lib-release : @$(MAKE) -C $(ZSTDDIR) $@ -.PHONY: lib-release -lib-release: - @$(MAKE) -C $(ZSTDDIR) - -.PHONY: zstd -zstd: +.PHONY: zstd zstd-release +zstd zstd-release: @$(MAKE) -C $(PRGDIR) $@ cp $(PRGDIR)/zstd$(EXT) . -.PHONY: zstd-release -zstd-release: - @$(MAKE) -C $(PRGDIR) - cp $(PRGDIR)/zstd$(EXT) . - .PHONY: zstdmt zstdmt: @$(MAKE) -C $(PRGDIR) $@ cp $(PRGDIR)/zstd$(EXT) ./zstdmt$(EXT) .PHONY: zlibwrapper -zlibwrapper: - $(MAKE) -C $(ZWRAPDIR) test +zlibwrapper: lib + $(MAKE) -C $(ZWRAPDIR) all +## test: run long-duration tests .PHONY: test +test: MOREFLAGS += -g -DDEBUGLEVEL=1 -Werror test: - $(MAKE) -C $(PRGDIR) allVariants MOREFLAGS+="-g -DZSTD_DEBUG=1" + MOREFLAGS="$(MOREFLAGS)" $(MAKE) -j -C $(PRGDIR) allVariants $(MAKE) -C $(TESTDIR) $@ +## shortest: same as `make check` .PHONY: shortest shortest: $(MAKE) -C $(TESTDIR) $@ +## check: run basic tests for `zstd` cli .PHONY: check check: shortest +## examples: build all examples in `/examples` directory .PHONY: examples -examples: +examples: lib CPPFLAGS=-I../lib LDFLAGS=-L../lib $(MAKE) -C examples/ all +## manual: generate API documentation in html format .PHONY: manual manual: $(MAKE) -C contrib/gen_html $@ +## man: generate man page +.PHONY: man +man: + $(MAKE) -C programs $@ + +## contrib: build all supported projects in `/contrib` directory .PHONY: contrib contrib: lib $(MAKE) -C contrib/pzstd all $(MAKE) -C contrib/seekable_format/examples all $(MAKE) -C contrib/adaptive-compression all + $(MAKE) -C contrib/largeNbDicts all .PHONY: cleanTabs cleanTabs: @@ -113,21 +116,39 @@ clean: @$(MAKE) -C contrib/pzstd $@ > $(VOID) @$(MAKE) -C contrib/seekable_format/examples $@ > $(VOID) @$(MAKE) -C contrib/adaptive-compression $@ > $(VOID) + @$(MAKE) -C contrib/largeNbDicts $@ > $(VOID) @$(RM) zstd$(EXT) zstdmt$(EXT) tmp* @$(RM) -r lz4 @echo Cleaning completed #------------------------------------------------------------------------------ -# make install is validated 
only for Linux, OSX, Hurd and some BSD targets +# make install is validated only for Linux, macOS, Hurd and some BSD targets #------------------------------------------------------------------------------ -ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU FreeBSD DragonFly NetBSD MSYS_NT)) +ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU OpenBSD FreeBSD DragonFly NetBSD MSYS_NT Haiku)) HOST_OS = POSIX -CMAKE_PARAMS = -DZSTD_BUILD_CONTRIB:BOOL=ON -DZSTD_BUILD_STATIC:BOOL=ON -DZSTD_BUILD_TESTS:BOOL=ON -DZSTD_ZLIB_SUPPORT:BOOL=ON -DZSTD_LZMA_SUPPORT:BOOL=ON +CMAKE_PARAMS = -DZSTD_BUILD_CONTRIB:BOOL=ON -DZSTD_BUILD_STATIC:BOOL=ON -DZSTD_BUILD_TESTS:BOOL=ON -DZSTD_ZLIB_SUPPORT:BOOL=ON -DZSTD_LZMA_SUPPORT:BOOL=ON -DCMAKE_BUILD_TYPE=Release +EGREP = egrep --color=never + +# Print a two column output of targets and their description. To add a target description, put a +# comment in the Makefile with the format "## : ". For example: +# +## list: Print all targets and their descriptions (if provided) .PHONY: list list: - @$(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null | awk -v RS= -F: '/^# File/,/^# Finished Make data base/ {if ($$1 !~ "^[#.]") {print $$1}}' | sort | egrep -v -e '^[^[:alnum:]]' -e '^$@$$' | xargs + @TARGETS=$$($(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null \ + | awk -v RS= -F: '/^# File/,/^# Finished Make data base/ {if ($$1 !~ "^[#.]") {print $$1}}' \ + | $(EGREP) -v -e '^[^[:alnum:]]' | sort); \ + { \ + printf "Target Name\tDescription\n"; \ + printf "%0.s-" {1..16}; printf "\t"; printf "%0.s-" {1..40}; printf "\n"; \ + for target in $$TARGETS; do \ + line=$$($(EGREP) "^##[[:space:]]+$$target:" $(lastword $(MAKEFILE_LIST))); \ + description=$$(echo $$line | awk '{i=index($$0,":"); print substr($$0,i+1)}' | xargs); \ + printf "$$target\t$$description\n"; \ + done \ + } | column -t -s $$'\t' .PHONY: install clangtest armtest usan asan uasan install: @@ -183,6 +204,7 @@ armfuzz: clean 
CC=arm-linux-gnueabi-gcc QEMU_SYS=qemu-arm-static MOREFLAGS="-static" FUZZER_FLAGS=--no-big-tests $(MAKE) -C $(TESTDIR) fuzztest aarch64fuzz: clean + ld -v CC=aarch64-linux-gnu-gcc QEMU_SYS=qemu-aarch64-static MOREFLAGS="-static" FUZZER_FLAGS=--no-big-tests $(MAKE) -C $(TESTDIR) fuzztest ppcfuzz: clean @@ -206,7 +228,7 @@ gcc6test: clean clangtest: clean clang -v - $(MAKE) all CXX=clang-++ CC=clang MOREFLAGS="-Werror -Wconversion -Wno-sign-conversion -Wdocumentation" + $(MAKE) all CXX=clang++ CC=clang MOREFLAGS="-Werror -Wconversion -Wno-sign-conversion -Wdocumentation" armtest: clean $(MAKE) -C $(TESTDIR) datagen # use native, faster @@ -295,6 +317,9 @@ gcc6install: apt-add-repo gcc7install: apt-add-repo APT_PACKAGES="libc6-dev-i386 gcc-multilib gcc-7 gcc-7-multilib" $(MAKE) apt-install +gcc8install: apt-add-repo + APT_PACKAGES="libc6-dev-i386 gcc-multilib gcc-8 gcc-8-multilib" $(MAKE) apt-install + gpp6install: apt-add-repo APT_PACKAGES="libc6-dev-i386 g++-multilib gcc-6 g++-6 g++-6-multilib" $(MAKE) apt-install @@ -326,23 +351,23 @@ cmakebuild: c90build: clean $(CC) -v - CFLAGS="-std=c90" $(MAKE) allmost # will fail, due to missing support for `long long` + CFLAGS="-std=c90 -Werror" $(MAKE) allmost # will fail, due to missing support for `long long` gnu90build: clean $(CC) -v - CFLAGS="-std=gnu90" $(MAKE) allmost + CFLAGS="-std=gnu90 -Werror" $(MAKE) allmost c99build: clean $(CC) -v - CFLAGS="-std=c99" $(MAKE) allmost + CFLAGS="-std=c99 -Werror" $(MAKE) allmost gnu99build: clean $(CC) -v - CFLAGS="-std=gnu99" $(MAKE) allmost + CFLAGS="-std=gnu99 -Werror" $(MAKE) allmost c11build: clean $(CC) -v - CFLAGS="-std=c11" $(MAKE) allmost + CFLAGS="-std=c11 -Werror" $(MAKE) allmost bmix64build: clean $(CC) -v @@ -356,7 +381,10 @@ bmi32build: clean $(CC) -v CFLAGS="-O3 -mbmi -m32 -Werror" $(MAKE) -C $(TESTDIR) test -staticAnalyze: clean +# static analyzer test uses clang's scan-build +# does not analyze zlibWrapper, due to detected issues in zlib source code 
+staticAnalyze: SCANBUILD ?= scan-build +staticAnalyze: $(CC) -v - CPPFLAGS=-g scan-build --status-bugs -v $(MAKE) all + CC=$(CC) CPPFLAGS=-g $(SCANBUILD) --status-bugs -v $(MAKE) allzstd examples contrib endif Modified: head/sys/contrib/zstd/NEWS ============================================================================== --- head/sys/contrib/zstd/NEWS Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/NEWS Mon Oct 22 18:29:12 2018 (r339606) @@ -1,3 +1,39 @@ +v1.3.7 +perf: slightly better decompression speed on clang (depending on hardware target) +fix : performance of dictionary compression for small input < 4 KB at levels 9 and 10 +build: no longer build backtrace by default in release mode; restrict further automatic mode +build: control backtrace support through build macro BACKTRACE +misc: added man pages for zstdless and zstdgrep, by @samrussell + +v1.3.6 +perf: much faster dictionary builder, by @jenniferliu +perf: faster dictionary compression on small data when using multiple contexts, by @felixhandte +perf: faster dictionary decompression when using a very large number of dictionaries simultaneously +cli : fix : does no longer overwrite destination when source does not exist (#1082) +cli : new command --adapt, for automatic compression level adaptation +api : fix : block api can be streamed with > 4 GB, reported by @catid +api : reduced ZSTD_DDict size by 2 KB +api : minimum negative compression level is defined, and can be queried using ZSTD_minCLevel(). +build: support Haiku target, by @korli +build: Read Legacy format is limited to v0.5+ by default. Can be changed at compile time with macro ZSTD_LEGACY_SUPPORT. 
+doc : zstd_compression_format.md updated to match wording in IETF RFC 8478 +misc: tests/paramgrill, a parameter optimizer, by @GeorgeLu97 + +v1.3.5 +perf: much faster dictionary compression, by @felixhandte +perf: small quality improvement for dictionary generation, by @terrelln +perf: slightly improved high compression levels (notably level 19) +mem : automatic memory release for long duration contexts +cli : fix : overlapLog can be manually set +cli : fix : decoding invalid lz4 frames +api : fix : performance degradation for dictionary compression when using advanced API, by @terrelln +api : change : clarify ZSTD_CCtx_reset() vs ZSTD_CCtx_resetParameters(), by @terrelln +build: select custom libzstd scope through control macros, by @GeorgeLu97 +build: OpenBSD patch, by @bket +build: make and make all are compatible with -j +doc : clarify zstd_compression_format.md, updated for IETF RFC process +misc: pzstd compatible with reproducible compilation, by @lamby + v1.3.4 perf: faster speed (especially decoding speed) on recent cpus (haswell+) perf: much better performance associating --long with multi-threading, by @terrelln Modified: head/sys/contrib/zstd/README.md ============================================================================== --- head/sys/contrib/zstd/README.md Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/README.md Mon Oct 22 18:29:12 2018 (r339606) @@ -4,7 +4,7 @@ __Zstandard__, or `zstd` as short version, is a fast l targeting real-time compression scenarios at zlib-level and better compression ratios. It's backed by a very fast entropy stage, provided by [Huff0 and FSE library](https://github.com/Cyan4973/FiniteStateEntropy). -The project is provided as an open-source BSD-licensed **C** library, +The project is provided as an open-source dual [BSD](LICENSE) and [GPLv2](COPYING) licensed **C** library, and a command line utility producing and decoding `.zst`, `.gz`, `.xz` and `.lz4` files. 
Should your project require another programming language, a list of known ports and bindings is provided on [Zstandard homepage](http://www.zstd.net/#other-languages). @@ -120,6 +120,8 @@ Other available options include: A `cmake` project generator is provided within `build/cmake`. It can generate Makefiles or other build scripts to create `zstd` binary, and `libzstd` dynamic and static libraries. + +By default, `CMAKE_BUILD_TYPE` is set to `Release`. #### Meson Modified: head/sys/contrib/zstd/TESTING.md ============================================================================== --- head/sys/contrib/zstd/TESTING.md Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/TESTING.md Mon Oct 22 18:29:12 2018 (r339606) @@ -41,4 +41,4 @@ They consist of the following tests: - `pzstd` with asan and tsan, as well as in 32-bits mode - Testing `zstd` with legacy mode off - Testing `zbuff` (old streaming API) -- Entire test suite and make install on OS X +- Entire test suite and make install on macOS Modified: head/sys/contrib/zstd/appveyor.yml ============================================================================== --- head/sys/contrib/zstd/appveyor.yml Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/appveyor.yml Mon Oct 22 18:29:12 2018 (r339606) @@ -181,15 +181,15 @@ - COMPILER: "gcc" HOST: "mingw" PLATFORM: "x64" - SCRIPT: "make allzstd" + SCRIPT: "CPPFLAGS=-DDEBUGLEVEL=2 CFLAGS=-Werror make -j allzstd DEBUGLEVEL=2" - COMPILER: "gcc" HOST: "mingw" PLATFORM: "x86" - SCRIPT: "make allzstd" + SCRIPT: "CFLAGS=-Werror make -j allzstd" - COMPILER: "clang" HOST: "mingw" PLATFORM: "x64" - SCRIPT: "MOREFLAGS='--target=x86_64-w64-mingw32 -Werror -Wconversion -Wno-sign-conversion' make allzstd" + SCRIPT: "CFLAGS='--target=x86_64-w64-mingw32 -Werror -Wconversion -Wno-sign-conversion' make -j allzstd" - COMPILER: "visual" HOST: "visual" Modified: head/sys/contrib/zstd/contrib/gen_html/Makefile 
============================================================================== --- head/sys/contrib/zstd/contrib/gen_html/Makefile Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/gen_html/Makefile Mon Oct 22 18:29:12 2018 (r339606) @@ -10,7 +10,7 @@ CXXFLAGS ?= -O3 CXXFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 -Wswitch-enum -Wno-comment CXXFLAGS += $(MOREFLAGS) -FLAGS = $(CPPFLAGS) $(CXXFLAGS) $(CXXFLAGS) $(LDFLAGS) +FLAGS = $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS) ZSTDAPI = ../../lib/zstd.h ZSTDMANUAL = ../../doc/zstd_manual.html Modified: head/sys/contrib/zstd/contrib/meson/meson.build ============================================================================== --- head/sys/contrib/zstd/contrib/meson/meson.build Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/meson/meson.build Mon Oct 22 18:29:12 2018 (r339606) @@ -18,6 +18,7 @@ libzstd_srcs = [ join_paths(common_dir, 'error_private.c'), join_paths(common_dir, 'xxhash.c'), join_paths(compress_dir, 'fse_compress.c'), + join_paths(compress_dir, 'hist.c'), join_paths(compress_dir, 'huf_compress.c'), join_paths(compress_dir, 'zstd_compress.c'), join_paths(compress_dir, 'zstd_fast.c'), @@ -130,6 +131,7 @@ test('fuzzer', fuzzer) if target_machine.system() != 'windows' paramgrill = executable('paramgrill', datagen_c, join_paths(tests_dir, 'paramgrill.c'), + join_paths(programs_dir, 'bench.c'), include_directories: test_includes, link_with: libzstd, dependencies: libm) Modified: head/sys/contrib/zstd/contrib/pzstd/Makefile ============================================================================== --- head/sys/contrib/zstd/contrib/pzstd/Makefile Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/pzstd/Makefile Mon Oct 22 18:29:12 2018 (r339606) @@ -42,7 +42,7 @@ PZSTD_LDFLAGS = EXTRA_FLAGS = ALL_CFLAGS = $(EXTRA_FLAGS) $(CPPFLAGS) $(PZSTD_CPPFLAGS) $(CFLAGS) $(PZSTD_CFLAGS) ALL_CXXFLAGS = $(EXTRA_FLAGS) $(CPPFLAGS) 
$(PZSTD_CPPFLAGS) $(CXXFLAGS) $(PZSTD_CXXFLAGS) -ALL_LDFLAGS = $(EXTRA_FLAGS) $(LDFLAGS) $(PZSTD_LDFLAGS) +ALL_LDFLAGS = $(EXTRA_FLAGS) $(CXXFLAGS) $(LDFLAGS) $(PZSTD_LDFLAGS) # gtest libraries need to go before "-lpthread" because they depend on it. @@ -50,7 +50,7 @@ GTEST_LIB = -L googletest/build/googlemock/gtest LIBS = # Compilation commands -LD_COMMAND = $(CXX) $^ $(ALL_LDFLAGS) $(LIBS) -lpthread -o $@ +LD_COMMAND = $(CXX) $^ $(ALL_LDFLAGS) $(LIBS) -pthread -o $@ CC_COMMAND = $(CC) $(DEPFLAGS) $(ALL_CFLAGS) -c $< -o $@ CXX_COMMAND = $(CXX) $(DEPFLAGS) $(ALL_CXXFLAGS) -c $< -o $@ Modified: head/sys/contrib/zstd/contrib/pzstd/Options.cpp ============================================================================== --- head/sys/contrib/zstd/contrib/pzstd/Options.cpp Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/pzstd/Options.cpp Mon Oct 22 18:29:12 2018 (r339606) @@ -18,17 +18,6 @@ #include #include -#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(_WIN32) || \ - defined(__CYGWIN__) -#include /* _isatty */ -#define IS_CONSOLE(stdStream) _isatty(_fileno(stdStream)) -#elif defined(_POSIX_C_SOURCE) || defined(_XOPEN_SOURCE) || defined(_POSIX_SOURCE) || (defined(__APPLE__) && defined(__MACH__)) || \ - defined(__DragonFly__) || defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) /* https://sourceforge.net/p/predef/wiki/OperatingSystems/ */ -#include /* isatty */ -#define IS_CONSOLE(stdStream) isatty(fileno(stdStream)) -#else -#define IS_CONSOLE(stdStream) 0 -#endif namespace pzstd { @@ -85,7 +74,7 @@ void usage() { std::fprintf(stderr, "Usage:\n"); std::fprintf(stderr, " pzstd [args] [FILE(s)]\n"); std::fprintf(stderr, "Parallel ZSTD options:\n"); - std::fprintf(stderr, " -p, --processes # : number of threads to use for (de)compression (default:%d)\n", defaultNumThreads()); + std::fprintf(stderr, " -p, --processes # : number of threads to use for (de)compression (default:)\n"); std::fprintf(stderr, "ZSTD 
options:\n"); std::fprintf(stderr, " -# : # compression level (1-%d, default:%d)\n", kMaxNonUltraCompressionLevel, kDefaultCompressionLevel); Modified: head/sys/contrib/zstd/contrib/pzstd/Pzstd.cpp ============================================================================== --- head/sys/contrib/zstd/contrib/pzstd/Pzstd.cpp Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/pzstd/Pzstd.cpp Mon Oct 22 18:29:12 2018 (r339606) @@ -6,6 +6,7 @@ * LICENSE file in the root directory of this source tree) and the GPLv2 (found * in the COPYING file in the root directory of this source tree). */ +#include "platform.h" /* Large Files support, SET_BINARY_MODE */ #include "Pzstd.h" #include "SkippableFrame.h" #include "utils/FileSystem.h" @@ -21,14 +22,6 @@ #include #include -#if defined(MSDOS) || defined(OS2) || defined(WIN32) || defined(_WIN32) || defined(__CYGWIN__) -# include /* _O_BINARY */ -# include /* _setmode, _isatty */ -# define SET_BINARY_MODE(file) { if (_setmode(_fileno(file), _O_BINARY) == -1) perror("Cannot set _O_BINARY"); } -#else -# include /* isatty */ -# define SET_BINARY_MODE(file) -#endif namespace pzstd { Modified: head/sys/contrib/zstd/contrib/seekable_format/examples/Makefile ============================================================================== --- head/sys/contrib/zstd/contrib/seekable_format/examples/Makefile Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/seekable_format/examples/Makefile Mon Oct 22 18:29:12 2018 (r339606) @@ -9,19 +9,25 @@ # This Makefile presumes libzstd is built, using `make` in / or /lib/ -LDFLAGS += ../../../lib/libzstd.a +ZSTDLIB_PATH = ../../../lib +ZSTDLIB_NAME = libzstd.a +ZSTDLIB = $(ZSTDLIB_PATH)/$(ZSTDLIB_NAME) + CPPFLAGS += -I../ -I../../../lib -I../../../lib/common CFLAGS ?= -O3 CFLAGS += -g -SEEKABLE_OBJS = ../zstdseek_compress.c ../zstdseek_decompress.c +SEEKABLE_OBJS = ../zstdseek_compress.c ../zstdseek_decompress.c $(ZSTDLIB) .PHONY: default all clean test default: 
all all: seekable_compression seekable_decompression parallel_processing + +$(ZSTDLIB): + make -C $(ZSTDLIB_PATH) $(ZSTDLIB_NAME) seekable_compression : seekable_compression.c $(SEEKABLE_OBJS) $(CC) $(CPPFLAGS) $(CFLAGS) $^ $(LDFLAGS) -o $@ Modified: head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_compression.c ============================================================================== --- head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_compression.c Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_compression.c Mon Oct 22 18:29:12 2018 (r339606) @@ -101,7 +101,7 @@ static void compressFile_orDie(const char* fname, cons free(buffOut); } -static const char* createOutFilename_orDie(const char* filename) +static char* createOutFilename_orDie(const char* filename) { size_t const inL = strlen(filename); size_t const outL = inL + 5; @@ -109,7 +109,7 @@ static const char* createOutFilename_orDie(const char* memset(outSpace, 0, outL); strcat(outSpace, filename); strcat(outSpace, ".zst"); - return (const char*)outSpace; + return (char*)outSpace; } int main(int argc, const char** argv) { @@ -124,8 +124,9 @@ int main(int argc, const char** argv) { { const char* const inFileName = argv[1]; unsigned const frameSize = (unsigned)atoi(argv[2]); - const char* const outFileName = createOutFilename_orDie(inFileName); + char* const outFileName = createOutFilename_orDie(inFileName); compressFile_orDie(inFileName, outFileName, 5, frameSize); + free(outFileName); } return 0; Modified: head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_decompression.c ============================================================================== --- head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_decompression.c Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/seekable_format/examples/seekable_decompression.c Mon Oct 22 18:29:12 2018 (r339606) @@ -84,7 +84,7 @@ static 
void fseek_orDie(FILE* file, long int offset, i } -static void decompressFile_orDie(const char* fname, unsigned startOffset, unsigned endOffset) +static void decompressFile_orDie(const char* fname, off_t startOffset, off_t endOffset) { FILE* const fin = fopen_orDie(fname, "rb"); FILE* const fout = stdout; @@ -129,8 +129,8 @@ int main(int argc, const char** argv) { const char* const inFilename = argv[1]; - unsigned const startOffset = (unsigned) atoi(argv[2]); - unsigned const endOffset = (unsigned) atoi(argv[3]); + off_t const startOffset = atoll(argv[2]); + off_t const endOffset = atoll(argv[3]); decompressFile_orDie(inFilename, startOffset, endOffset); } Modified: head/sys/contrib/zstd/contrib/seekable_format/zstd_seekable.h ============================================================================== --- head/sys/contrib/zstd/contrib/seekable_format/zstd_seekable.h Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/seekable_format/zstd_seekable.h Mon Oct 22 18:29:12 2018 (r339606) @@ -6,8 +6,10 @@ extern "C" { #endif #include +#include "zstd.h" /* ZSTDLIB_API */ -static const unsigned ZSTD_seekTableFooterSize = 9; + +#define ZSTD_seekTableFooterSize 9 #define ZSTD_SEEKABLE_MAGICNUMBER 0x8F92EAB1 Modified: head/sys/contrib/zstd/contrib/seekable_format/zstdseek_decompress.c ============================================================================== --- head/sys/contrib/zstd/contrib/seekable_format/zstdseek_decompress.c Mon Oct 22 17:42:57 2018 (r339605) +++ head/sys/contrib/zstd/contrib/seekable_format/zstdseek_decompress.c Mon Oct 22 18:29:12 2018 (r339606) @@ -24,7 +24,7 @@ #endif /* ************************************************************ -* Avoid fseek()'s 2GiB barrier with MSVC, MacOS, *BSD, MinGW +* Avoid fseek()'s 2GiB barrier with MSVC, macOS, *BSD, MinGW ***************************************************************/ #if defined(_MSC_VER) && _MSC_VER >= 1400 # define LONG_SEEK _fseeki64 @@ -56,6 +56,7 @@ #include /* malloc, free 
*/
 #include /* FILE* */
+#include
 #define XXH_STATIC_LINKING_ONLY
 #define XXH_NAMESPACE ZSTD_
@@ -88,7 +89,7 @@ static int ZSTD_seekable_read_FILE(void* opaque, void*
     return 0;
 }
-static int ZSTD_seekable_seek_FILE(void* opaque, S64 offset, int origin)
+static int ZSTD_seekable_seek_FILE(void* opaque, long long offset, int origin)
 {
     int const ret = LONG_SEEK((FILE*)opaque, offset, origin);
     if (ret) return ret;
@@ -110,9 +111,9 @@ static int ZSTD_seekable_read_buff(void* opaque, void*
     return 0;
 }
-static int ZSTD_seekable_seek_buff(void* opaque, S64 offset, int origin)
+static int ZSTD_seekable_seek_buff(void* opaque, long long offset, int origin)
 {
-    buffWrapper_t* buff = (buffWrapper_t*) opaque;
+    buffWrapper_t* const buff = (buffWrapper_t*) opaque;
     unsigned long long newOffset;
     switch (origin) {
     case SEEK_SET:
@@ -124,6 +125,8 @@ static int ZSTD_seekable_seek_buff(void* opaque, S64 o
     case SEEK_END:
         newOffset = (unsigned long long)buff->size - offset;
         break;
+    default:
+        assert(0); /* not possible */
     }
     if (newOffset > buff->size) {
         return -1;
@@ -197,7 +200,7 @@ size_t ZSTD_seekable_free(ZSTD_seekable* zs)
  * Performs a binary search to find the last frame with a decompressed offset
  * <= pos
  * @return : the frame's index */
-U32 ZSTD_seekable_offsetToFrameIndex(ZSTD_seekable* const zs, U64 pos)
+U32 ZSTD_seekable_offsetToFrameIndex(ZSTD_seekable* const zs, unsigned long long pos)
 {
     U32 lo = 0;
     U32 hi = zs->seekTable.tableLen;
@@ -222,13 +225,13 @@ U32 ZSTD_seekable_getNumFrames(ZSTD_seekable* const zs
     return zs->seekTable.tableLen;
 }
-U64 ZSTD_seekable_getFrameCompressedOffset(ZSTD_seekable* const zs, U32 frameIndex)
+unsigned long long ZSTD_seekable_getFrameCompressedOffset(ZSTD_seekable* const zs, U32 frameIndex)
 {
     if (frameIndex >= zs->seekTable.tableLen) return ZSTD_SEEKABLE_FRAMEINDEX_TOOLARGE;
     return zs->seekTable.entries[frameIndex].cOffset;
 }
-U64 ZSTD_seekable_getFrameDecompressedOffset(ZSTD_seekable* const zs, U32 frameIndex)
+unsigned long long
ZSTD_seekable_getFrameDecompressedOffset(ZSTD_seekable* const zs, U32 frameIndex)
 {
     if (frameIndex >= zs->seekTable.tableLen) return ZSTD_SEEKABLE_FRAMEINDEX_TOOLARGE;
     return zs->seekTable.entries[frameIndex].dOffset;
@@ -294,7 +297,6 @@ static size_t ZSTD_seekable_loadSeekTable(ZSTD_seekabl
     { /* Allocate an extra entry at the end so that we can do size
        * computations on the last element without special case */
         seekEntry_t* entries = (seekEntry_t*)malloc(sizeof(seekEntry_t) * (numFrames + 1));
-        const BYTE* tableBase = zs->inBuff + ZSTD_skippableHeaderSize;
         U32 idx = 0;
         U32 pos = 8;
@@ -311,8 +313,8 @@ static size_t ZSTD_seekable_loadSeekTable(ZSTD_seekabl
         /* compute cumulative positions */
         for (; idx < numFrames; idx++) {
             if (pos + sizePerEntry > SEEKABLE_BUFF_SIZE) {
-                U32 const toRead = MIN(remaining, SEEKABLE_BUFF_SIZE);
                 U32 const offset = SEEKABLE_BUFF_SIZE - pos;
+                U32 const toRead = MIN(remaining, SEEKABLE_BUFF_SIZE - offset);
                 memmove(zs->inBuff, zs->inBuff + pos, offset); /* move any data we haven't read yet */
                 CHECK_IO(src.read(src.opaque, zs->inBuff+offset, toRead));
                 remaining -= toRead;
@@ -372,7 +374,7 @@ size_t ZSTD_seekable_initAdvanced(ZSTD_seekable* zs, Z
     return 0;
 }
-size_t ZSTD_seekable_decompress(ZSTD_seekable* zs, void* dst, size_t len, U64 offset)
+size_t ZSTD_seekable_decompress(ZSTD_seekable* zs, void* dst, size_t len, unsigned long long offset)
 {
     U32 targetFrame = ZSTD_seekable_offsetToFrameIndex(zs, offset);
     do {

Added: head/sys/contrib/zstd/doc/images/cdict_v136.png
==============================================================================
Binary file. No diff available.

Added: head/sys/contrib/zstd/doc/images/zstd_cdict_v1_3_5.png
==============================================================================
Binary file. No diff available.
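
Reader's note on the ZSTD_seekable_loadSeekTable() hunk above: the refill moves the `offset` unread bytes to the front of the buffer and then reads at `inBuff+offset`, so only `SEEKABLE_BUFF_SIZE - offset` bytes are free. The following is a minimal sketch of that arithmetic, not code from zstd. The helper names `refill_size` and `refill_size_old` and the `SKETCH_MIN` macro are hypothetical, and buffer sizes are passed as parameters for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper (not part of zstd): number of bytes that may be read
 * into a buffer of buff_size bytes after the unread tail [pos, buff_size)
 * has been moved to the start of the buffer. */
static size_t refill_size(size_t remaining, size_t pos, size_t buff_size)
{
    size_t offset;
    assert(pos <= buff_size);
    offset = buff_size - pos;   /* bytes kept at the start of the buffer */
    /* Form used after r339606: cap the read at the free space that follows
     * the kept bytes, so offset + result never exceeds buff_size. */
    return SKETCH_MIN(remaining, buff_size - offset);
}

/* Form used before r339606, shown only for comparison: the cap ignores the
 * kept bytes, so offset + result can exceed buff_size. */
static size_t refill_size_old(size_t remaining, size_t buff_size)
{
    return SKETCH_MIN(remaining, buff_size);
}
```

With a 16-byte buffer, `pos` of 12 and 100 bytes remaining, the new form allows 12 bytes after the 4 kept bytes, which fills the buffer exactly. The old form would allow 16 bytes and write 4 bytes past the end.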

Modified: head/sys/contrib/zstd/doc/zstd_compression_format.md
==============================================================================
--- head/sys/contrib/zstd/doc/zstd_compression_format.md Mon Oct 22 17:42:57 2018 (r339605)
+++ head/sys/contrib/zstd/doc/zstd_compression_format.md Mon Oct 22 18:29:12 2018 (r339606)
@@ -16,7 +16,7 @@ Distribution of this document is unlimited.
 ### Version
-0.2.6 (19/08/17)
+0.3.0 (25/09/18)
 Introduction
@@ -27,6 +27,8 @@ that is independent of CPU type, operating system,
 file system and character set, suitable for
 file compression, pipe and streaming compression,
 using the [Zstandard algorithm](http://www.zstandard.org).
+The text of the specification assumes a basic background in programming
+at the level of bits and other primitive data representations.
 The data can be produced or consumed,
 even for an arbitrarily long sequentially presented input data stream,
@@ -39,11 +41,6 @@ for detection of data corruption.
 The data format defined by this specification
 does not attempt to allow random access to compressed data.
-This specification is intended for use by implementers of software
-to compress data into Zstandard format and/or decompress data from Zstandard format.
-The text of the specification assumes a basic background in programming
-at the level of bits and other primitive data representations.
-
 Unless otherwise indicated below,
 a compliant compressor must produce data sets
 that conform to the specifications presented here.
@@ -57,6 +54,12 @@ Whenever it does not support a parameter defined in th
 it must produce a non-ambiguous error code and associated error message
 explaining which parameter is unsupported.
+This specification is intended for use by implementers of software
+to compress data into Zstandard format and/or decompress data from Zstandard format.
+The Zstandard format is supported by an open source reference implementation,
+written in portable C, and available at : https://github.com/facebook/zstd .
+
+
 ### Overall conventions
 In this document:
 - square brackets i.e. `[` and `]` are used to indicate optional fields or parameters.
@@ -69,7 +72,7 @@ A frame is completely independent, has a defined begin
 and a set of parameters which tells the decoder how to decompress it.
 A frame encapsulates one or multiple __blocks__.
-Each block can be compressed or not,
+Each block contains arbitrary content, which is described by its header,
 and has a guaranteed maximum content size, which depends on frame parameters.
 Unlike frames, each block depends on previous blocks for proper decoding.
 However, each block can be decompressed without waiting for its successor,
@@ -92,14 +95,14 @@ Overview
 Frames
 ------
 Zstandard compressed data is made of one or more __frames__.
-Each frame is independent and can be decompressed indepedently of other frames.
+Each frame is independent and can be decompressed independently of other frames.
 The decompressed content of multiple concatenated frames is the concatenation of
 each frame decompressed content.
 There are two frame formats defined by Zstandard:
 Zstandard frames and Skippable frames.
 Zstandard frames contain compressed data, while
-skippable frames contain no data and can be used for metadata.
+skippable frames contain custom user metadata.
 ## Zstandard frames
 The structure of a single Zstandard frame is following:
@@ -112,6 +115,11 @@ __`Magic_Number`__
 4 Bytes, __little-endian__ format.
 Value : 0xFD2FB528
+Note: This value was selected to be less probable to find at the beginning of some random file.
+It avoids trivial patterns (0x00, 0xFF, repeated bytes, increasing bytes, etc.),
+contains byte values outside of ASCII range,
+and doesn't map into UTF8 space.
+It reduces the chances that a text file represent this value by accident.
 __`Frame_Header`__
@@ -171,8 +179,8 @@ according to the following table:
 |`FCS_Field_Size`| 0 or 1 | 2 | 4 | 8 |
 When `Flag_Value` is `0`, `FCS_Field_Size` depends on `Single_Segment_flag` :
-if `Single_Segment_flag` is set, `Field_Size` is 1.
-Otherwise, `Field_Size` is 0 : `Frame_Content_Size` is not provided.
+if `Single_Segment_flag` is set, `FCS_Field_Size` is 1.
+Otherwise, `FCS_Field_Size` is 0 : `Frame_Content_Size` is not provided.
 __`Single_Segment_flag`__
@@ -196,10 +204,10 @@ depending on local limitations.
 __`Unused_bit`__
-The value of this bit should be set to zero.
-A decoder compliant with this specification version shall not interpret it.
-It might be used in a future version,
-to signal a property which is not mandatory to properly decode the frame.
+A decoder compliant with this specification version shall not interpret this bit.
+It might be used in any future version,
+to signal a property which is transparent to properly decode the frame.
+An encoder compliant with this specification version must set this bit to zero.
 __`Reserved_bit`__
@@ -218,11 +226,11 @@ __`Dictionary_ID_flag`__
 This is a 2-bits flag (`= FHD & 3`),
 telling if a dictionary ID is provided within the header.
-It also specifies the size of this field as `Field_Size`.
+It also specifies the size of this field as `DID_Field_Size`.
-|`Flag_Value`| 0 | 1 | 2 | 3 |
-| ---------- | --- | --- | --- | --- |
-|`Field_Size`| 0 | 1 | 2 | 4 |
+|`Flag_Value` | 0 | 1 | 2 | 3 |
+| -------------- | --- | --- | --- | --- |
+|`DID_Field_Size`| 0 | 1 | 2 | 4 |
 #### `Window_Descriptor`
@@ -249,6 +257,9 @@ Window_Size = windowBase + windowAdd;
 The minimum `Window_Size` is 1 KB.
 The maximum `Window_Size` is `(1<<41) + 7*(1<<38)` bytes, which is 3.75 TB.
+In general, larger `Window_Size` tend to improve compression ratio,
+but at the cost of memory usage.
+
 To properly decode compressed data,
 a decoder will need to allocate a buffer of at least `Window_Size` bytes.
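
Reader's note on the `Window_Descriptor` hunk above: only the last line of the computation (`Window_Size = windowBase + windowAdd;`) appears in this excerpt. The sketch below fills in the preceding steps as given in the Zstandard format specification's `Window_Descriptor` section (Exponent in the upper 5 bits, Mantissa in the lower 3 bits, `windowLog = 10 + Exponent`). Those steps are taken from the specification, not from this diff, and the function name `window_size_from_descriptor` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of zstd): computes Window_Size from the
 * one-byte Window_Descriptor. The Exponent/Mantissa layout comes from the
 * format specification; only the final sum is visible in the hunk above. */
static uint64_t window_size_from_descriptor(uint8_t descriptor)
{
    unsigned const exponent = descriptor >> 3;      /* upper 5 bits */
    unsigned const mantissa = descriptor & 7;       /* lower 3 bits */
    unsigned const windowLog = 10 + exponent;
    uint64_t const windowBase = (uint64_t)1 << windowLog;
    uint64_t const windowAdd = (windowBase / 8) * mantissa;
    return windowBase + windowAdd;
}
```

A descriptor of 0 gives the 1 KB minimum and a descriptor of 0xFF gives `(1<<41) + 7*(1<<38)`, matching the limits quoted in the hunk. The 8 MB interoperability recommendation corresponds to Exponent 13 with Mantissa 0.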
@@ -257,8 +268,8 @@ a decoder is allowed to reject a compressed frame
 which requests a memory size beyond decoder's authorized range.
 For improved interoperability,
-decoders are recommended to be compatible with `Window_Size <= 8 MB`,
-and encoders are recommended to not request more than 8 MB.
+it's recommended for decoders to support `Window_Size` of up to 8 MB,
+and it's recommended for encoders to not generate frame requiring `Window_Size` larger than 8 MB.
 It's merely a recommendation though,
 decoders are free to support larger or lower limits,
 depending on local limitations.
@@ -268,9 +279,10 @@ depending on local limitations.
 This is a variable size field, which contains
 the ID of the dictionary required to properly decode the frame.
 `Dictionary_ID` field is optional. When it's not present,
-it's up to the decoder to make sure it uses the correct dictionary.
+it's up to the decoder to know which dictionary to use.
-Field size depends on `Dictionary_ID_flag`.
+`Dictionary_ID` field size is provided by `DID_Field_Size`.
+`DID_Field_Size` is directly derived from value of `Dictionary_ID_flag`.
 1 byte can represent an ID 0-255.
 2 bytes can represent an ID 0-65535.
 4 bytes can represent an ID 0-4294967295.
@@ -280,13 +292,21 @@ It's allowed to represent a small ID (for example `13`
 with a large 4-bytes dictionary ID, even if it is less efficient.
 _Reserved ranges :_
-If the frame is going to be distributed in a private environment,
-any dictionary ID can be used.
-However, for public distribution of compressed frames using a dictionary,
-the following ranges are reserved and shall not be used :
+Within private environments, any `Dictionary_ID` can be used.
+
+However, for frames and dictionaries distributed in public space,
+`Dictionary_ID` must be attributed carefully.
+Rules for public environment are not yet decided,
+but the following ranges are reserved for some future registrar :
 - low range : `<= 32767`
 - high range : `>= (1 << 31)`
+Outside of these ranges, any value of `Dictionary_ID`
+which is both `>= 32768` and `< (1<<31)` can be used freely,
+even in public environment.
+
+
+
 #### `Frame_Content_Size`
 This is the original (uncompressed) size. This information is optional.
@@ -359,22 +379,23 @@ There are 4 block types :
 - `Reserved` - this is not a block.
   This value cannot be used with current version of this specification.
+  If such a value is present, it is considered corrupted data.
 __`Block_Size`__
 The upper 21 bits of `Block_Header` represent the `Block_Size`.
+`Block_Size` is the size of the block excluding the header.
+A block can contain any number of bytes (even zero), up to
+`Block_Maximum_Decompressed_Size`, which is the smallest of:
+- Window_Size
+- 128 KB
-Block sizes must respect a few rules :
-- For `Compressed_Block`, `Block_Size` is always strictly less than decompressed size.
-- Block decompressed size is always <= `Window_Size`
-- Block decompressed size is always <= 128 KB.
+A `Compressed_Block` has the extra restriction that `Block_Size` is always
+strictly less than the decompressed size.
+If this condition cannot be respected,
+the block must be sent uncompressed instead (`Raw_Block`).
-A block can contain any number of bytes (even empty),
-up to `Block_Maximum_Decompressed_Size`, which is the smallest of :
-- `Window_Size`
-- 128 KB
-
 Compressed Blocks
 -----------------
 To decompress a compressed block, the compressed size must be provided
@@ -390,11 +411,17 @@ data in [Sequence Execution](#sequence-execution)
 #### Prerequisites
 To decode a compressed block, the following elements are necessary :
 - Previous decoded data, up to a distance of `Window_Size`,
-  or all previously decoded data when `Single_Segment_flag` is set.
+  or beginning of the Frame, whichever is smaller.
 - List of "recent offsets" from previous `Compressed_Block`.
-- Decoding tables of previous `Compressed_Block` for each symbol type
-  (literals, literals lengths, match lengths, offsets).
+- The previous Huffman tree, required by `Treeless_Literals_Block` type
+- Previous FSE decoding tables, required by `Repeat_Mode`
+  for each symbol type (literals lengths, match lengths, offsets)
+Note that decoding tables aren't always from the previous `Compressed_Block`.
+
+- Every decoding table can come from a dictionary.
+- The Huffman tree comes from the previous `Compressed_Literals_Block`.
+
 Literals Section
 ----------------
 All literals are regrouped in the first part of the block.
@@ -405,11 +432,11 @@ Literals can be stored uncompressed or compressed usin
 When compressed, an optional tree description can be present,
 followed by 1 or 4 streams.
-| `Literals_Section_Header` | [`Huffman_Tree_Description`] | Stream1 | [Stream2] | [Stream3] | [Stream4] |
-| ------------------------- | ---------------------------- | ------- | --------- | --------- | --------- |
+| `Literals_Section_Header` | [`Huffman_Tree_Description`] | [jumpTable] | Stream1 | [Stream2] | [Stream3] | [Stream4] |
+| ------------------------- | ---------------------------- | ----------- | ------- | --------- | --------- | --------- |
-#### `Literals_Section_Header`
+### `Literals_Section_Header`
 Header is in charge of describing how literals are packed.
 It's a byte-aligned variable-size bitfield, ranging from 1 to 5 bytes,
@@ -460,18 +487,21 @@ For values spanning several bytes, convention is __lit
 __`Size_Format` for `Raw_Literals_Block` and `RLE_Literals_Block`__ :
-- Value ?0 : `Size_Format` uses 1 bit.
+`Size_Format` uses 1 _or_ 2 bits.
+Its value is : `Size_Format = (Literals_Section_Header[0]>>2) & 3`
+
+- `Size_Format` == 00 or 10 : `Size_Format` uses 1 bit.
   `Regenerated_Size` uses 5 bits (0-31).
- `Regenerated_Size = Header[0]>>3` -- Value 01 : `Size_Format` uses 2 bits. + `Literals_Section_Header` uses 1 byte. + `Regenerated_Size = Literals_Section_Header[0]>>3` +- `Size_Format` == 01 : `Size_Format` uses 2 bits. `Regenerated_Size` uses 12 bits (0-4095). - `Literals_Section_Header` has 2 bytes. - `Regenerated_Size = (Header[0]>>4) + (Header[1]<<4)` -- Value 11 : `Size_Format` uses 2 bits. + `Literals_Section_Header` uses 2 bytes. + `Regenerated_Size = (Literals_Section_Header[0]>>4) + (Literals_Section_Header[1]<<4)` +- `Size_Format` == 11 : `Size_Format` uses 2 bits. `Regenerated_Size` uses 20 bits (0-1048575). - `Literals_Section_Header` has 3 bytes. - `Regenerated_Size = (Header[0]>>4) + (Header[1]<<4) + (Header[2]<<12)` + `Literals_Section_Header` uses 3 bytes. + `Regenerated_Size = (Literals_Section_Header[0]>>4) + (Literals_Section_Header[1]<<4) + (Literals_Section_Header[2]<<12)` Only Stream1 is present for these cases. Note : it's allowed to represent a short value (for example `13`) @@ -479,66 +509,74 @@ using a long format, even if it's less efficient. __`Size_Format` for `Compressed_Literals_Block` and `Treeless_Literals_Block`__ : -- Value 00 : _A single stream_. +`Size_Format` always uses 2 bits. + +- `Size_Format` == 00 : _A single stream_. Both `Regenerated_Size` and `Compressed_Size` use 10 bits (0-1023). - `Literals_Section_Header` has 3 bytes. -- Value 01 : 4 streams. + `Literals_Section_Header` uses 3 bytes. +- `Size_Format` == 01 : 4 streams. Both `Regenerated_Size` and `Compressed_Size` use 10 bits (0-1023). - `Literals_Section_Header` has 3 bytes. -- Value 10 : 4 streams. + `Literals_Section_Header` uses 3 bytes. +- `Size_Format` == 10 : 4 streams. Both `Regenerated_Size` and `Compressed_Size` use 14 bits (0-16383). - `Literals_Section_Header` has 4 bytes. -- Value 11 : 4 streams. + `Literals_Section_Header` uses 4 bytes. +- `Size_Format` == 11 : 4 streams. Both `Regenerated_Size` and `Compressed_Size` use 18 bits (0-262143). 
-  `Literals_Section_Header` has 5 bytes.
+  `Literals_Section_Header` uses 5 bytes.
 Both `Compressed_Size` and `Regenerated_Size` fields follow __little-endian__ convention.
 Note: `Compressed_Size` __includes__ the size of the Huffman Tree description
 _when_ it is present.
-### Raw Literals Block
+#### Raw Literals Block
 The data in Stream1 is `Regenerated_Size` bytes long,
 it contains the raw literals data to be used during [Sequence Execution].
-### RLE Literals Block
+#### RLE Literals Block
 Stream1 consists of a single byte which should be repeated `Regenerated_Size` times
 to generate the decoded literals.
-### Compressed Literals Block and Treeless Literals Block
+#### Compressed Literals Block and Treeless Literals Block
 Both of these modes contain Huffman encoded data.
-`Treeless_Literals_Block` does not have a `Huffman_Tree_Description`.
-#### `Huffman_Tree_Description`
+For `Treeless_Literals_Block`,
+the Huffman table comes from previously compressed literals block,
+or from a dictionary.
+
+
+### `Huffman_Tree_Description`
 This section is only present when `Literals_Block_Type` type is `Compressed_Literals_Block` (`2`).
 The format of the Huffman tree description can be found at [Huffman Tree description](#huffman-tree-description).
 The size of `Huffman_Tree_Description` is determined during decoding process,
 it must be used to determine where streams begin.
 `Total_Streams_Size = Compressed_Size - Huffman_Tree_Description_Size`.
-For `Treeless_Literals_Block`,

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
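
Reader's note on the `Size_Format` hunk (-460,18 +487,21) for `Raw_Literals_Block` and `RLE_Literals_Block`: the three header layouts can be decoded directly from the formulas in the added lines. The following is a minimal sketch under that reading, not the reference decoder. The function name `raw_rle_regenerated_size`, its signature, and the return convention (header length in bytes, or 0 when the input is too short) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of zstd): decodes Regenerated_Size from a
 * Raw/RLE Literals_Section_Header. Returns the header length in bytes
 * (1, 2 or 3), or 0 if fewer than that many bytes are available. */
static size_t raw_rle_regenerated_size(const uint8_t* h, size_t avail,
                                       uint32_t* regenerated_size)
{
    unsigned size_format;
    if (avail < 1) return 0;
    size_format = (h[0] >> 2) & 3;   /* Size_Format = (Header[0]>>2) & 3 */
    switch (size_format) {
    case 0:
    case 2:   /* 00 or 10: 1-bit format, 5-bit size, 1-byte header */
        *regenerated_size = (uint32_t)(h[0] >> 3);
        return 1;
    case 1:   /* 01: 12-bit size, 2-byte header */
        if (avail < 2) return 0;
        *regenerated_size = (uint32_t)(h[0] >> 4) + ((uint32_t)h[1] << 4);
        return 2;
    default:  /* 11: 20-bit size, 3-byte header */
        if (avail < 3) return 0;
        *regenerated_size = (uint32_t)(h[0] >> 4) + ((uint32_t)h[1] << 4)
                          + ((uint32_t)h[2] << 12);
        return 3;
    }
}
```

The low two bits of the first byte hold `Literals_Block_Type`, so a caller would check that field before using this helper. It applies only to the raw and RLE block types.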