Date: Fri, 25 Dec 2020 15:31:38 GMT
From: Conrad Meyer <cem@FreeBSD.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org
Subject: git: f6ae97673c28 - Import zstd 1.4.8
Message-ID: <202012251531.0BPFVcgh007718@gitrepo.freebsd.org>
The branch vendor/zstd has been updated by cem:

URL: https://cgit.FreeBSD.org/src/commit/?id=f6ae97673c28bdb9ae795bd235ab6f26f2536a2d

commit f6ae97673c28bdb9ae795bd235ab6f26f2536a2d
Author:     Conrad Meyer <cem@FreeBSD.org>
AuthorDate: 2020-12-25 00:21:42 +0000
Commit:     Conrad Meyer <cem@FreeBSD.org>
CommitDate: 2020-12-25 00:21:42 +0000

    Import zstd 1.4.8
---
 CHANGELOG | 173 ++-
 CONTRIBUTING.md | 18 +-
 Makefile | 68 +-
 README.md | 12 +-
 TESTING.md | 2 +-
 appveyor.yml | 19 +-
 doc/zstd_compression_format.md | 114 +-
 doc/zstd_manual.html | 314 +++--
 examples/Makefile | 53 +-
 examples/streaming_compression.c | 1 +
 examples/streaming_compression_thread_pool.c | 178 +++
 lib/Makefile | 365 ++++--
 lib/README.md | 28 +
 lib/common/bitstream.h | 39 +-
 lib/common/compiler.h | 119 +-
 lib/common/cpu.h | 2 -
 lib/common/debug.h | 29 +-
 lib/common/entropy_common.c | 230 +++-
 lib/common/error_private.c | 1 +
 lib/common/error_private.h | 2 +-
 lib/common/fse.h | 50 +-
 lib/common/fse_decompress.c | 139 +-
 lib/common/huf.h | 31 +-
 lib/common/mem.h | 159 +--
 lib/common/pool.c | 38 +-
 lib/common/pool.h | 2 +-
 lib/common/threading.c | 11 +-
 lib/common/xxhash.c | 74 +-
 lib/common/xxhash.h | 2 +-
 lib/common/zstd_common.c | 18 +-
 lib/common/zstd_deps.h | 111 ++
 lib/common/zstd_errors.h | 1 +
 lib/common/zstd_internal.h | 147 ++-
 lib/compress/fse_compress.c | 53 +-
 lib/compress/hist.c | 54 +-
 lib/compress/hist.h | 2 +-
 lib/compress/huf_compress.c | 316 +++--
 lib/compress/zstd_compress.c | 1750 ++++++++++++++++++++------
 lib/compress/zstd_compress_internal.h | 160 ++-
 lib/compress/zstd_compress_literals.c | 8 +-
 lib/compress/zstd_compress_sequences.c | 20 +-
 lib/compress/zstd_compress_superblock.c | 42 +-
 lib/compress/zstd_cwksp.h | 84 +-
 lib/compress/zstd_double_fast.c | 44 +-
 lib/compress/zstd_fast.c | 38 +-
 lib/compress/zstd_lazy.c | 428 +++++--
 lib/compress/zstd_lazy.h | 20 +
 lib/compress/zstd_ldm.c | 77 +-
 lib/compress/zstd_ldm.h | 6 +
 lib/compress/zstd_opt.c | 235 +++-
 lib/compress/zstdmt_compress.c | 480 ++-----
 lib/compress/zstdmt_compress.h | 134 +-
 lib/decompress/huf_decompress.c | 502 +++++---
 lib/decompress/zstd_ddict.c | 16 +-
 lib/decompress/zstd_ddict.h | 2 +-
 lib/decompress/zstd_decompress.c | 205 +--
 lib/decompress/zstd_decompress_block.c | 184 ++-
 lib/decompress/zstd_decompress_block.h | 7 +-
 lib/decompress/zstd_decompress_internal.h | 21 +-
 lib/dictBuilder/cover.c | 49 +-
 lib/dictBuilder/cover.h | 2 +-
 lib/dictBuilder/fastcover.c | 39 +-
 lib/dictBuilder/zdict.c | 31 +-
 lib/dictBuilder/zdict.h | 2 +-
 lib/legacy/zstd_v01.c | 6 +-
 lib/legacy/zstd_v02.c | 6 +-
 lib/legacy/zstd_v03.c | 6 +-
 lib/legacy/zstd_v04.c | 8 +-
 lib/legacy/zstd_v05.c | 6 +-
 lib/legacy/zstd_v06.c | 6 +-
 lib/legacy/zstd_v07.c | 6 +-
 lib/libzstd.pc.in | 6 +-
 lib/zstd.h | 395 +++++-
 programs/Makefile | 284 +++--
 programs/README.md | 70 +-
 programs/dibio.c | 2 +-
 programs/fileio.c | 462 +++++--
 programs/fileio.h | 29 +-
 programs/platform.h | 6 +
 programs/timefn.h | 6 +-
 programs/util.c | 407 +++++-
 programs/util.h | 62 +-
 programs/zstd.1 | 77 +-
 programs/zstd.1.md | 150 ++-
 programs/zstdcli.c | 195 +--
 programs/zstdgrep.1 | 2 +-
 programs/zstdless.1 | 2 +-
 zlibWrapper/Makefile | 4 +-
 zlibWrapper/zstd_zlibwrapper.c | 137 +-
 89 files changed, 6820 insertions(+), 3081 deletions(-)

diff --git a/CHANGELOG b/CHANGELOG
index 0ed939a5bbb1..86092563177c 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,4 +1,65 @@
-v1.4.5
+v1.4.8 (Dec 18, 2020)
+hotfix: wrong alignment of an internal buffer
+
+v1.4.7 (Dec 16, 2020)
+perf: stronger --long mode at high compression levels, by @senhuang42
+perf: stronger --patch-from at high compression levels, thanks to --long improvements
+perf: faster dictionary compression at medium compression levels, by @felixhandte
+perf: small speed & memory usage improvements for ZSTD_compress2(), by @terrelln
+perf: improved fast compression speeds with Visual Studio, by @animalize
+cli : Set nb of threads with environment variable ZSTD_NBTHREADS,
by @senhuang42 +cli : accept decompressing files with *.zstd suffix +cli : provide a condensed summary by default when processing multiple files +cli : fix : stdin input no longer confused as user prompt +cli : improve accuracy of several error messages +api : new sequence ingestion API, by @senhuang42 +api : shared thread pool: control total nb of threads used by multiple compression jobs, by @marxin +api : new ZSTD_getDictID_fromCDict(), by @LuAPi +api : zlibWrapper only uses public API, and is compatible with dynamic library, by @terrelln +api : fix : multithreaded compression has predictable output even in special cases (see #2327) (issue not accessible from cli) +api : fix : dictionary compression correctly respects dictionary compression level (see #2303) (issue not accessible from cli) +build: fix cmake script when using path with spaces, by @terrelln +build: improved compile-time detection of aarch64/neon platforms, by @bsdimp +build: Fix building on AIX 5.1, by @likema +build: compile paramgrill with cmake on Windows, requested by @mirh +doc : clarify repcode updates in format specification, by @felixhandte + +v1.4.6 +fix : Always return dstSize_tooSmall when that is the case +fix : Fix ZSTD_initCStream_advanced() with static allocation and no dictionary +perf: Improve small block decompression speed by 20%+, by @terrelln +perf: Reduce compression stack usage by 1 KB, by @terrelln +perf: Improve decompression speed by improving ZSTD_wildcopy, by @helloguo (#2252, #2256) +perf: Improve histogram construction, by @cyan4973 (#2253) +cli : Add --output-dir-mirror option, by @xxie24 (#2219) +cli : Warn when (de)compressing multiple files into a single output, by @senhuang42 (#2279) +cli : Improved progress bar and status summary when (de)compressing multiple files, by @senhuang42 (#2283) +cli : Call stat less often, by @felixhandte (#2262) +cli : Allow --patch-from XXX and --filelist XXX in addition to --patch-from=XXX and --filelist=XXX, by @cyan4973 (#2250) 
+cli : Allow --patch-from to compress stdin with --stream-size, by @bimbashrestha (#2206) +api : Do not install zbuff.h, since it has long been deprecated, by @cyan4973 (#2166). +api : Fix ZSTD_CCtx_setParameter() with ZSTD_c_compressionLevel to make 0 mean default level, by @i-do-cpp (#2291) +api : Rename ZSTDMT_NBTHREADS_MAX to ZSTDMT_NBWORKERS_MAX, by @marxin (#2228). +build: Install pkg-config file with CMake and MinGW, by @tonytheodore (#2183) +build: Install DLL with CMake on Windows, by @BioDataAnalysis (#2221) +build: Fix DLL install location with CMake, by @xantares and @bimbashrestha (#2186) +build: Add ZSTD_NO_UNUSED_FUNCTIONS macro to hide unused functions +build: Add ZSTD_NO_INTRINSICS macro to avoid explicit intrinsics +build: Add STATIC_BMI2 macro for compile time detection of BMI2 on MSVC, by @Niadb (#2258) +build: Fix -Wcomma warnings, by @cwoffenden +build: Remove distutils requirement for meson build, by @neheb (#2197) +build: Fix cli compilation with uclibc +build: Fix cli compilation without st_mtime, by @ffontaine (#2246) +build: Fix shadowing warnings in library +build: Fix single file library compilation with Enscripten, by @yoshihitoh (#2227) +misc: Improve single file library and include dictBuilder, by @cwoffenden +misc: Allow compression dictionaries with missing symbols +misc: Add freestanding translation script in contrib/freestanding_lib +misc: Collect all of zstd's libc dependencies into zstd_deps.h +doc : Add ZSTD_versionString() to manual, by @animalize +doc : Fix documentation for ZSTD_CCtxParams_setParameter(), by @felixhandte (#2270) + +v1.4.5 (May 22, 2020) fix : Compression ratio regression on huge files (> 3 GB) using high levels (--ultra) and multithreading, by @terrelln perf: Improved decompression speed: x64 : +10% (clang) / +5% (gcc); ARM : from +15% to +50%, depending on SoC, by @terrelln perf: Automatically downsizes ZSTD_DCtx when too large for too long (#2069, by @bimbashreshta) @@ -24,7 +85,7 @@ misc: Edit-distance 
match finder in contrib/ doc : Improved beginner CONTRIBUTING.md docs doc : New issue templates for zstd -v1.4.4 +v1.4.4 (Nov 6, 2019) perf: Improved decompression speed, by > 10%, by @terrelln perf: Better compression speed when re-using a context, by @felixhandte perf: Fix compression ratio when compressing large files with small dictionary, by @senhuang42 @@ -51,18 +112,18 @@ pack: modified pkgconfig, for better integration into openwrt, requested by @neh misc: Improved documentation : ZSTD_CLEVEL, DYNAMIC_BMI2, ZSTD_CDict, function deprecation, zstd format misc: fixed educational decoder : accept larger literals section, and removed UNALIGNED() macro -v1.4.3 +v1.4.3 (Aug 20, 2019) bug: Fix Dictionary Compression Ratio Regression by @cyan4973 (#1709) bug: Fix Buffer Overflow in legacy v0.3 decompression by @felixhandte (#1722) build: Add support for IAR C/C++ Compiler for Arm by @joseph0918 (#1705) -v1.4.2 +v1.4.2 (Jul 26, 2019) bug: Fix bug in zstd-0.5 decoder by @terrelln (#1696) bug: Fix seekable decompression in-memory API by @iburinoc (#1695) misc: Validate blocks are smaller than size limit by @vivekmg (#1685) misc: Restructure source files by @ephiepark (#1679) -v1.4.1 +v1.4.1 (Jul 20, 2019) bug: Fix data corruption in niche use cases by @terrelln (#1659) bug: Fuzz legacy modes, fix uncovered bugs by @terrelln (#1593, #1594, #1595) bug: Fix out of bounds read by @terrelln (#1590) @@ -92,7 +153,7 @@ build: Visual Studio: fix linking by @absotively (#1639) build: Fix MinGW-W64 build by @myzhang1029 (#1600) misc: Expand decodecorpus coverage by @ephiepark (#1664) -v1.4.0 +v1.4.0 (Apr 17, 2019) perf: Improve level 1 compression speed in most scenarios by 6% by @gbtucker and @terrelln api: Move the advanced API, including all functions in the staging section, to the stable section api: Make ZSTD_e_flush and ZSTD_e_end block for maximum forward progress @@ -129,7 +190,7 @@ misc: Optimize dictionary memory usage in corner cases misc: Improve the dictionary 
builder on small or homogeneous data misc: Fix spelling across the repo by @jsoref -v1.3.8 +v1.3.8 (Dec 28, 2018) perf: better decompression speed on large files (+7%) and cold dictionaries (+15%) perf: slightly better compression ratio at high compression modes api : finalized advanced API, last stage before "stable" status @@ -151,14 +212,14 @@ doc : clarified zstd_compression_format.md, by @ulikunitz misc: fixed zstdgrep, returns 1 on failure, by @lzutao misc: NEWS renamed as CHANGELOG, in accordance with fboss -v1.3.7 +v1.3.7 (Oct 20, 2018) perf: slightly better decompression speed on clang (depending on hardware target) fix : performance of dictionary compression for small input < 4 KB at levels 9 and 10 build: no longer build backtrace by default in release mode; restrict further automatic mode build: control backtrace support through build macro BACKTRACE misc: added man pages for zstdless and zstdgrep, by @samrussell -v1.3.6 +v1.3.6 (Oct 6, 2018) perf: much faster dictionary builder, by @jenniferliu perf: faster dictionary compression on small data when using multiple contexts, by @felixhandte perf: faster dictionary decompression when using a very large number of dictionaries simultaneously @@ -172,7 +233,7 @@ build: Read Legacy format is limited to v0.5+ by default. 
Can be changed at comp doc : zstd_compression_format.md updated to match wording in IETF RFC 8478 misc: tests/paramgrill, a parameter optimizer, by @GeorgeLu97 -v1.3.5 +v1.3.5 (Jun 29, 2018) perf: much faster dictionary compression, by @felixhandte perf: small quality improvement for dictionary generation, by @terrelln perf: slightly improved high compression levels (notably level 19) @@ -187,7 +248,7 @@ build: make and make all are compatible with -j doc : clarify zstd_compression_format.md, updated for IETF RFC process misc: pzstd compatible with reproducible compilation, by @lamby -v1.3.4 +v1.3.4 (Mar 27, 2018) perf: faster speed (especially decoding speed) on recent cpus (haswell+) perf: much better performance associating --long with multi-threading, by @terrelln perf: better compression at levels 13-15 @@ -205,7 +266,7 @@ build: VS2017 scripts, by @HaydnTrigg misc: all /contrib projects fixed misc: added /contrib/docker script by @gyscos -v1.3.3 +v1.3.3 (Dec 21, 2017) perf: faster zstd_opt strategy (levels 16-19) fix : bug #944 : multithreading with shared ditionary and large data, reported by @gsliepen cli : fix : content size written in header by default @@ -217,7 +278,7 @@ api : change : when setting `pledgedSrcSize`, use `ZSTD_CONTENTSIZE_UNKNOWN` mac build: fix : compilation under rhel6 and centos6, reported by @pixelb build: added `check` target -v1.3.2 +v1.3.2 (Oct 10, 2017) new : long range mode, using --long command, by Stella Lau (@stellamplau) new : ability to generate and decode magicless frames (#591) changed : maximum nb of threads reduced to 200, to avoid address space exhaustion in 32-bits mode @@ -240,7 +301,7 @@ example : added streaming_memory_usage license : changed /examples license to BSD + GPLv2 license : fix a few header files to reflect new license (#825) -v1.3.1 +v1.3.1 (Aug 21, 2017) New license : BSD + GPLv2 perf: substantially decreased memory usage in Multi-threading mode, thanks to reports by Tino Reichardt (@mcmilk) perf: 
Multi-threading supports up to 256 threads. Cap at 256 when more are requested (#760) @@ -255,7 +316,7 @@ new : contrib/adaptive-compression, I/O driven compression strength, by Paul Cru new : contrib/long_distance_matching, statistics by Stella Lau (@stellamplau) updated : contrib/linux-kernel, by Nick Terrell (@terrelln) -v1.3.0 +v1.3.0 (Jul 6, 2017) cli : new : `--list` command, by Paul Cruz cli : changed : xz/lzma support enabled by default cli : changed : `-t *` continue processing list after a decompression error @@ -270,7 +331,7 @@ tools : decodecorpus can generate random dictionary-compressed samples, by Paul new : contrib/seekable_format, demo and API, by Sean Purcell changed : contrib/linux-kernel, updated version and license, by Nick Terrell -v1.2.0 +v1.2.0 (May 5, 2017) cli : changed : Multithreading enabled by default (use target zstd-nomt or HAVE_THREAD=0 to disable) cli : new : command -T0 means "detect and use nb of cores", by Sean Purcell cli : new : zstdmt symlink hardwired to `zstd -T0` @@ -292,7 +353,7 @@ build: enabled Multi-threading support for *BSD, by Baptiste Daroussin tools: updated Paramgrill. Command -O# provides best parameters for sample and speed target. 
new : contrib/linux-kernel version, by Nick Terrell -v1.1.4 +v1.1.4 (Mar 18, 2017) cli : new : can compress in *.gz format, using --format=gzip command, by Przemyslaw Skibinski cli : new : advanced benchmark command --priority=rt cli : fix : write on sparse-enabled file systems in 32-bits mode, by @ds77 @@ -308,7 +369,7 @@ build : improved cmake script, by @Majlen build : added -Wformat-security flag, as recommended by Padraig Brady doc : new : educational decoder, by Sean Purcell -v1.1.3 +v1.1.3 (Feb 7, 2017) cli : zstd can decompress .gz files (can be disabled with `make zstd-nogz` or `make HAVE_ZLIB=0`) cli : new : experimental target `make zstdmt`, with multi-threading support cli : new : improved dictionary builder "cover" (experimental), by Nick Terrell, based on prior work by Giuseppe Ottaviano. @@ -324,7 +385,7 @@ API : fix : all symbols properly exposed in libzstd, by Nick Terrell build : support for Solaris target, by Przemyslaw Skibinski doc : clarified specification, by Sean Purcell -v1.1.2 +v1.1.2 (Dec 15, 2016) API : streaming : decompression : changed : automatic implicit reset when chain-decoding new frames without init API : experimental : added : dictID retrieval functions, and ZSTD_initCStream_srcSize() API : zbuff : changed : prototypes now generate deprecation warnings @@ -341,7 +402,7 @@ zlib_wrapper : added support for gz* functions, by Przemyslaw Skibinski install : better compatibility with FreeBSD, by Dimitry Andric source tree : changed : zbuff source files moved to lib/deprecated -v1.1.1 +v1.1.1 (Nov 2, 2016) New : command -M#, --memory=, --memlimit=, --memlimit-decompress= to limit allowed memory consumption New : doc/zstd_manual.html, by Przemyslaw Skibinski Improved : slightly better compression ratio at --ultra levels (>= 20) @@ -352,7 +413,7 @@ Changed : zstd_errors.h is now installed within /include (and replaces errors_pu Updated man page Fixed : zstd-small, zstd-compress and zstd-decompress compilation targets -v1.1.0 +v1.1.0 
(Sep 28, 2016) New : contrib/pzstd, parallel version of zstd, by Nick Terrell added : NetBSD install target (#338) Improved : speed for batches of small files @@ -366,7 +427,7 @@ Fixed : compatibility with OpenBSD, reported by Juan Francisco Cantero Hurtado ( Fixed : compatibility with Hurd, by Przemyslaw Skibinski (#365) Fixed : zstd-pgo, reported by octoploid (#329) -v1.0.0 +v1.0.0 (Sep 1, 2016) Change Licensing, all project is now BSD, Copyright Facebook Small decompression speed improvement API : Streaming API supports legacy format @@ -375,7 +436,7 @@ CLI supports legacy formats v0.4+ Fixed : compression fails on certain huge files, reported by Jesse McGrew Enhanced documentation, by Przemyslaw Skibinski -v0.8.1 +v0.8.1 (Aug 18, 2016) New streaming API Changed : --ultra now enables levels beyond 19 Changed : -i# now selects benchmark time in second @@ -384,7 +445,7 @@ Fixed : speed regression on specific patterns (#272) Fixed : support for Z_SYNC_FLUSH, by Dmitry Krot (#291) Fixed : ICC compilation, by Przemyslaw Skibinski -v0.8.0 +v0.8.0 (Aug 2, 2016) Improved : better speed on clang and gcc -O2, thanks to Eric Biggers New : Build on FreeBSD and DragonFly, thanks to JrMarino Changed : modified API : ZSTD_compressEnd() @@ -397,17 +458,17 @@ Modified : minor compression level adaptations Updated : compression format specification to v0.2.0 changed : zstd.h moved to /lib directory -v0.7.5 +v0.7.5 (Aug 1, 2016) Transition version, supporting decoding of v0.8.x -v0.7.4 +v0.7.4 (Jul 17, 2016) Added : homebrew for Mac, by Daniel Cade Added : more examples Fixed : segfault when using small dictionaries, reported by Felix Handte Modified : default compression level for CLI is now 3 Updated : specification, to v0.1.1 -v0.7.3 +v0.7.3 (Jul 9, 2016) New : compression format specification New : `--` separator, stating that all following arguments are file names. Suggested by Chip Turner. 
New : `ZSTD_getDecompressedSize()` @@ -419,18 +480,18 @@ fixed : multi-blocks decoding with intermediate uncompressed blocks, reported by modified : removed "mem.h" and "error_public.h" dependencies from "zstd.h" (experimental section) modified : legacy functions no longer need magic number -v0.7.2 +v0.7.2 (Jul 4, 2016) fixed : ZSTD_decompressBlock() using multiple consecutive blocks. Reported by Greg Slazinski. fixed : potential segfault on very large files (many gigabytes). Reported by Chip Turner. fixed : CLI displays system error message when destination file cannot be created (#231). Reported by Chip Turner. -v0.7.1 +v0.7.1 (Jun 23, 2016) fixed : ZBUFF_compressEnd() called multiple times with too small `dst` buffer, reported by Christophe Chevalier fixed : dictBuilder fails if first sample is too small, reported by Руслан Ковалёв fixed : corruption issue, reported by cj modified : checksum enabled by default in command line mode -v0.7.0 +v0.7.0 (Jun 17, 2016) New : Support for directory compression, using `-r`, thanks to Przemyslaw Skibinski New : Command `--rm`, to remove source file after successful de/compression New : Visual build scripts, by Christophe Chevalier @@ -443,7 +504,7 @@ API : support for custom malloc/free functions New : controllable Dictionary ID New : Support for skippable frames -v0.6.1 +v0.6.1 (May 13, 2016) New : zlib wrapper API, thanks to Przemyslaw Skibinski New : Ability to compile compressor / decompressor separately Changed : new lib directory structure @@ -453,103 +514,103 @@ Fixed : null-string roundtrip (#176) New : benchmark mode can select directory as input Experimental : midipix support, VMS support -v0.6.0 +v0.6.0 (Apr 13, 2016) Stronger high compression modes, thanks to Przemyslaw Skibinski API : ZSTD_getFrameParams() provides size of decompressed content New : highest compression modes require `--ultra` command to fully unleash their capacity Fixed : zstd cli return error code > 0 and removes dst file artifact when 
decompression fails, thanks to Chip Turner -v0.5.1 +v0.5.1 (Feb 18, 2016) New : Optimal parsing => Very high compression modes, thanks to Przemyslaw Skibinski Changed : Dictionary builder integrated into libzstd and zstd cli Changed (!) : zstd cli now uses "multiple input files" as default mode. See `zstd -h`. Fix : high compression modes for big-endian platforms New : zstd cli : `-t` | `--test` command -v0.5.0 +v0.5.0 (Feb 5, 2016) New : dictionary builder utility Changed : streaming & dictionary API Improved : better compression of small data -v0.4.7 +v0.4.7 (Jan 22, 2016) Improved : small compression speed improvement in HC mode Changed : `zstd_decompress.c` has ZSTD_LEGACY_SUPPORT to 0 by default fix : bt search bug -v0.4.6 +v0.4.6 (Jan 13, 2016) fix : fast compression mode on Windows New : cmake configuration file, thanks to Artyom Dymchenko Improved : high compression mode on repetitive data New : block-level API New : ZSTD_duplicateCCtx() -v0.4.5 +v0.4.5 (Dec 18, 2015) new : -m/--multiple : compress/decompress multiple files -v0.4.4 +v0.4.4 (Dec 14, 2015) Fixed : high compression modes for Windows 32 bits new : external dictionary API extended to buffered mode and accessible through command line new : windows DLL project, thanks to Christophe Chevalier -v0.4.3 : +v0.4.3 (Dec 7, 2015) new : external dictionary API new : zstd-frugal -v0.4.2 : +v0.4.2 (Dec 2, 2015) Generic minor improvements for small blocks Fixed : big-endian compatibility, by Peter Harris (#85) -v0.4.1 +v0.4.1 (Dec 1, 2015) Fixed : ZSTD_LEGACY_SUPPORT=0 build mode (reported by Luben) removed `zstd.c` -v0.4.0 +v0.4.0 (Nov 29, 2015) Command line utility compatible with high compression levels Removed zstdhc => merged into zstd Added : ZBUFF API (see zstd_buffered.h) Rolling buffer support -v0.3.6 +v0.3.6 (Nov 10, 2015) small blocks params -v0.3.5 +v0.3.5 (Nov 9, 2015) minor generic compression improvements -v0.3.4 +v0.3.4 (Nov 6, 2015) Faster fast cLevels -v0.3.3 +v0.3.3 (Nov 5, 2015) Small 
compression ratio improvement -v0.3.2 +v0.3.2 (Nov 2, 2015) Fixed Visual Studio -v0.3.1 : +v0.3.1 (Nov 2, 2015) Small compression ratio improvement -v0.3 +v0.3 (Oct 30, 2015) HC mode : compression levels 2-26 -v0.2.2 +v0.2.2 (Oct 28, 2015) Fix : Visual Studio 2013 & 2015 release compilation, by Christophe Chevalier -v0.2.1 +v0.2.1 (Oct 24, 2015) Fix : Read errors, advanced fuzzer tests, by Hanno Böck -v0.2.0 +v0.2.0 (Oct 22, 2015) **Breaking format change** Faster decompression speed Can still decode v0.1 format -v0.1.3 +v0.1.3 (Oct 15, 2015) fix uninitialization warning, reported by Evan Nemerson -v0.1.2 +v0.1.2 (Sep 11, 2015) frame concatenation support -v0.1.1 +v0.1.1 (Aug 27, 2015) fix compression bug detects write-flush errors -v0.1.0 +v0.1.0 (Aug 25, 2015) first release diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 637e37188550..44f2393a2c15 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -5,7 +5,7 @@ possible. ## Our Development Process New versions are being developed in the "dev" branch, or in their own feature branch. -When they are deemed ready for a release, they are merged into "master". +When they are deemed ready for a release, they are merged into "release". As a consequences, all contributions must stage first through "dev" or their own feature branch. @@ -126,6 +126,20 @@ just `contrib/largeNbDicts` and nothing else, you can run: scan-build make -C contrib/largeNbDicts largeNbDicts ``` +### Pitfalls of static analysis +`scan-build` is part of our regular CI suite. Other static analyzers are not. + +It can be useful to look at additional static analyzers once in a while (and we do), but it's not a good idea to multiply the nb of analyzers run continuously at each commit and PR. The reasons are : + +- Static analyzers are full of false positive. The signal to noise ratio is actually pretty low. +- A good CI policy is "zero-warning tolerance". That means that all issues must be solved, including false positives. 
This quickly becomes a tedious workload. +- Multiple static analyzers will feature multiple kind of false positives, sometimes applying to the same code but in different ways leading to : + + torteous code, trying to please multiple constraints, hurting readability and therefore maintenance. Sometimes, such complexity introduce other more subtle bugs, that are just out of scope of the analyzers. + + sometimes, these constraints are mutually exclusive : if one try to solve one, the other static analyzer will complain, they can't be both happy at the same time. +- As if that was not enough, the list of false positives change with each version. It's hard enough to follow one static analyzer, but multiple ones with their own update agenda, this quickly becomes a massive velocity reducer. + +This is different from running a static analyzer once in a while, looking at the output, and __cherry picking__ a few warnings that seem helpful, either because they detected a genuine risk of bug, or because it helps expressing the code in a way which is more readable or more difficult to misuse. These kind of reports can be useful, and are accepted. + ## Performance Performance is extremely important for zstd and we only merge pull requests whose performance landscape and corresponding trade-offs have been adequately analyzed, reproduced, and presented. @@ -369,7 +383,7 @@ CI tests run every time a pull request (PR) is created or updated. The exact tes that get run will depend on the destination branch you specify. Some tests take longer to run than others. Currently, our CI is set up to run a short series of tests when creating a PR to the dev branch and a longer series of tests -when creating a PR to the master branch. You can look in the configuration files +when creating a PR to the release branch. You can look in the configuration files of the respective CI platform for more information on what gets run when. 
Most people will just want to create a PR with the destination set to their local dev diff --git a/Makefile b/Makefile index 2c1d34604fe9..2832fb4752b8 100644 --- a/Makefile +++ b/Makefile @@ -8,6 +8,9 @@ # You may select, at your option, one of the above-listed licenses. # ################################################################ +# verbose mode (print commands) on V=1 or VERBOSE=1 +Q = $(if $(filter 1,$(V) $(VERBOSE)),,@) + PRGDIR = programs ZSTDDIR = lib BUILDIR = build @@ -28,9 +31,9 @@ VOID = /dev/null TARGET_SYSTEM ?= $(OS) ifneq (,$(filter Windows%,$(TARGET_SYSTEM))) -EXT =.exe + EXT =.exe else -EXT = + EXT = endif ## default: Build lib-release and zstd-release @@ -46,8 +49,8 @@ allmost: allzstd zlibwrapper # skip zwrapper, can't build that on alternate architectures without the proper zlib installed .PHONY: allzstd allzstd: lib-all - $(MAKE) -C $(PRGDIR) all - $(MAKE) -C $(TESTDIR) all + $(Q)$(MAKE) -C $(PRGDIR) all + $(Q)$(MAKE) -C $(TESTDIR) all .PHONY: all32 all32: @@ -55,18 +58,19 @@ all32: $(MAKE) -C $(TESTDIR) all32 .PHONY: lib lib-release libzstd.a +lib-all : lib lib lib-release lib-all : - @$(MAKE) -C $(ZSTDDIR) $@ + $(Q)$(MAKE) -C $(ZSTDDIR) $@ .PHONY: zstd zstd-release zstd zstd-release: - @$(MAKE) -C $(PRGDIR) $@ - cp $(PRGDIR)/zstd$(EXT) . 
+ $(Q)$(MAKE) -C $(PRGDIR) $@ + $(Q)ln -sf $(PRGDIR)/zstd$(EXT) zstd$(EXT) .PHONY: zstdmt zstdmt: - @$(MAKE) -C $(PRGDIR) $@ - cp $(PRGDIR)/zstd$(EXT) ./zstdmt$(EXT) + $(Q)$(MAKE) -C $(PRGDIR) $@ + $(Q)cp $(PRGDIR)/zstd$(EXT) ./zstdmt$(EXT) .PHONY: zlibwrapper zlibwrapper: lib @@ -75,16 +79,16 @@ zlibwrapper: lib ## test: run long-duration tests .PHONY: test DEBUGLEVEL ?= 1 -test: MOREFLAGS += -g -DDEBUGLEVEL=$(DEBUGLEVEL) -Werror +test: MOREFLAGS += -g -Werror test: - MOREFLAGS="$(MOREFLAGS)" $(MAKE) -j -C $(PRGDIR) allVariants + DEBUGLEVEL=$(DEBUGLEVEL) MOREFLAGS="$(MOREFLAGS)" $(MAKE) -j -C $(PRGDIR) allVariants $(MAKE) -C $(TESTDIR) $@ - ZSTD=../../programs/zstd $(MAKE) -C doc/educational_decoder test + ZSTD=../../programs/zstd $(MAKE) -C doc/educational_decoder $@ ## shortest: same as `make check` .PHONY: shortest shortest: - $(MAKE) -C $(TESTDIR) $@ + $(Q)$(MAKE) -C $(TESTDIR) $@ ## check: run basic tests for `zstd` cli .PHONY: check @@ -97,10 +101,10 @@ automated_benchmarking: .PHONY: benchmarking benchmarking: automated_benchmarking -## examples: build all examples in `/examples` directory +## examples: build all examples in `examples/` directory .PHONY: examples examples: lib - CPPFLAGS=-I../lib LDFLAGS=-L../lib $(MAKE) -C examples/ all + $(MAKE) -C examples all ## manual: generate API documentation in html format .PHONY: manual @@ -117,6 +121,7 @@ man: contrib: lib $(MAKE) -C contrib/pzstd all $(MAKE) -C contrib/seekable_format/examples all + $(MAKE) -C contrib/seekable_format/tests test $(MAKE) -C contrib/largeNbDicts all cd contrib/single_file_libs/ ; ./build_decoder_test.sh cd contrib/single_file_libs/ ; ./build_library_test.sh @@ -127,17 +132,18 @@ cleanTabs: .PHONY: clean clean: - @$(MAKE) -C $(ZSTDDIR) $@ > $(VOID) - @$(MAKE) -C $(PRGDIR) $@ > $(VOID) - @$(MAKE) -C $(TESTDIR) $@ > $(VOID) - @$(MAKE) -C $(ZWRAPDIR) $@ > $(VOID) - @$(MAKE) -C examples/ $@ > $(VOID) - @$(MAKE) -C contrib/gen_html $@ > $(VOID) - @$(MAKE) -C contrib/pzstd $@ > $(VOID) - 
@$(MAKE) -C contrib/seekable_format/examples $@ > $(VOID) - @$(MAKE) -C contrib/largeNbDicts $@ > $(VOID) - @$(RM) zstd$(EXT) zstdmt$(EXT) tmp* - @$(RM) -r lz4 + $(Q)$(MAKE) -C $(ZSTDDIR) $@ > $(VOID) + $(Q)$(MAKE) -C $(PRGDIR) $@ > $(VOID) + $(Q)$(MAKE) -C $(TESTDIR) $@ > $(VOID) + $(Q)$(MAKE) -C $(ZWRAPDIR) $@ > $(VOID) + $(Q)$(MAKE) -C examples/ $@ > $(VOID) + $(Q)$(MAKE) -C contrib/gen_html $@ > $(VOID) + $(Q)$(MAKE) -C contrib/pzstd $@ > $(VOID) + $(Q)$(MAKE) -C contrib/seekable_format/examples $@ > $(VOID) + $(Q)$(MAKE) -C contrib/seekable_format/tests $@ > $(VOID) + $(Q)$(MAKE) -C contrib/largeNbDicts $@ > $(VOID) + $(Q)$(RM) zstd$(EXT) zstdmt$(EXT) tmp* + $(Q)$(RM) -r lz4 @echo Cleaning completed #------------------------------------------------------------------------------ @@ -161,7 +167,7 @@ EGREP = egrep $(EGREP_OPTIONS) ## list: Print all targets and their descriptions (if provided) .PHONY: list list: - @TARGETS=$$($(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null \ + $(Q)TARGETS=$$($(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null \ | awk -v RS= -F: '/^# File/,/^# Finished Make data base/ {if ($$1 !~ "^[#.]") {print $$1}}' \ | $(EGREP) -v -e '^[^[:alnum:]]' | sort); \ { \ @@ -176,13 +182,13 @@ list: .PHONY: install armtest usan asan uasan install: - @$(MAKE) -C $(ZSTDDIR) $@ - @$(MAKE) -C $(PRGDIR) $@ + $(Q)$(MAKE) -C $(ZSTDDIR) $@ + $(Q)$(MAKE) -C $(PRGDIR) $@ .PHONY: uninstall uninstall: - @$(MAKE) -C $(ZSTDDIR) $@ - @$(MAKE) -C $(PRGDIR) $@ + $(Q)$(MAKE) -C $(ZSTDDIR) $@ + $(Q)$(MAKE) -C $(PRGDIR) $@ .PHONY: travis-install travis-install: diff --git a/README.md b/README.md index 5c300fdc49aa..dcca7662d2ff 100644 --- a/README.md +++ b/README.md @@ -176,6 +176,12 @@ Going into `build` directory, you will find additional possibilities: You can build the zstd binary via buck by executing: `buck build programs:zstd` from the root of the repo. The output binary will be in `buck-out/gen/programs/`. 
+## Testing + +You can run quick local smoke tests by executing the `playTest.sh` script from the `src/tests` directory. +Two env variables `$ZSTD_BIN` and `$DATAGEN_BIN` are needed for the test script to locate the zstd and datagen binary. +For information on CI testing, please refer to TESTING.md + ## Status Zstandard is currently deployed within Facebook. It is used continuously to compress large amounts of data in multiple formats and use cases. @@ -187,7 +193,7 @@ Zstandard is dual-licensed under [BSD](LICENSE) and [GPLv2](COPYING). ## Contributing -The "dev" branch is the one where all contributions are merged before reaching "master". -If you plan to propose a patch, please commit into the "dev" branch, or its own feature branch. -Direct commit to "master" are not permitted. +The `dev` branch is the one where all contributions are merged before reaching `release`. +If you plan to propose a patch, please commit into the `dev` branch, or its own feature branch. +Direct commit to `release` are not permitted. For more information, please read [CONTRIBUTING](CONTRIBUTING.md). diff --git a/TESTING.md b/TESTING.md index 7e5305178b97..b851d1c8d71a 100644 --- a/TESTING.md +++ b/TESTING.md @@ -27,7 +27,7 @@ They consist of the following tests: Long Tests ---------- -Long tests run on all commits to `master` branch, +Long tests run on all commits to `release` branch, and once a day on the current version of `dev` branch, on TravisCI. 
They consist of the following tests: diff --git a/appveyor.yml b/appveyor.yml index 5d77b3103481..0e872557525a 100644 --- a/appveyor.yml +++ b/appveyor.yml @@ -1,20 +1,20 @@ -# Following tests are run _only_ on master branch -# To reproduce these tests, it's possible to push into a branch `appveyorTest` -# or a branch `visual*`, they will intentionnally trigger `master` tests +# Following tests are run _only_ on `release` branch +# and on selected feature branch named `appveyorTest` or `visual*` - version: 1.0.{build} branches: only: + - release - master - - appveyorTest + - /appveyor*/ - /visual*/ environment: matrix: - COMPILER: "gcc" HOST: "mingw" PLATFORM: "x64" - SCRIPT: "make allzstd MOREFLAGS=-static && make -C tests fullbench-lib" + SCRIPT: "make allzstd MOREFLAGS=-static" ARTIFACT: "true" BUILD: "true" - COMPILER: "gcc" @@ -92,9 +92,9 @@ cd programs\ && 7z a -tzip -mx9 zstd-win-binary-%PLATFORM%.zip zstd.exe && appveyor PushArtifact zstd-win-binary-%PLATFORM%.zip && cp zstd.exe ..\bin\zstd.exe && - git clone --depth 1 --branch master https://github.com/facebook/zstd && + git clone --depth 1 --branch release https://github.com/facebook/zstd && cd zstd && - git archive --format=tar master -o zstd-src.tar && + git archive --format=tar release -o zstd-src.tar && ..\zstd -19 zstd-src.tar && appveyor PushArtifact zstd-src.tar.zst && certUtil -hashfile zstd-src.tar.zst SHA256 > zstd-src.tar.zst.sha256.sig && @@ -162,6 +162,8 @@ - if [%TEST%]==[cmake] ( mkdir build\cmake\build && cd build\cmake\build && + SET FUZZERTEST=-T2mn && + SET ZSTREAM_TESTTIME=-T2mn && cmake -G "Visual Studio 14 2015 Win64" .. && cd ..\..\.. 
&& make clean @@ -194,7 +196,7 @@ - COMPILER: "gcc" HOST: "mingw" PLATFORM: "x64" - SCRIPT: "CPPFLAGS=-DDEBUGLEVEL=2 CFLAGS=-Werror make -j allzstd DEBUGLEVEL=2" + SCRIPT: "CFLAGS=-Werror make -j allzstd DEBUGLEVEL=2" - COMPILER: "gcc" HOST: "mingw" PLATFORM: "x86" @@ -285,5 +287,6 @@ - ECHO Testing %COMPILER% %PLATFORM% %CONFIGURATION% - if [%HOST%]==[mingw] ( set "CC=%COMPILER%" && + make clean && make check ) diff --git a/doc/zstd_compression_format.md b/doc/zstd_compression_format.md index fc61726fc98c..0af6bf91a204 100644 --- a/doc/zstd_compression_format.md +++ b/doc/zstd_compression_format.md @@ -3,7 +3,7 @@ Zstandard Compression Format ### Notices -Copyright (c) 2016-present Yann Collet, Facebook, Inc. +Copyright (c) 2016-2020 Yann Collet, Facebook, Inc. Permission is granted to copy and distribute this document for any purpose and without charge, @@ -16,7 +16,7 @@ Distribution of this document is unlimited. ### Version -0.3.5 (13/11/19) +0.3.7 (2020-12-09) Introduction @@ -291,21 +291,10 @@ Format is __little-endian__. It's allowed to represent a small ID (for example `13`) with a large 4-bytes dictionary ID, even if it is less efficient. -_Reserved ranges :_ -Within private environments, any `Dictionary_ID` can be used. - -However, for frames and dictionaries distributed in public space, -`Dictionary_ID` must be attributed carefully. -Rules for public environment are not yet decided, -but the following ranges are reserved for some future registrar : -- low range : `<= 32767` -- high range : `>= (1 << 31)` - -Outside of these ranges, any value of `Dictionary_ID` -which is both `>= 32768` and `< (1<<31)` can be used freely, -even in public environment. - - +A value of `0` has same meaning as no `Dictionary_ID`, +in which case the frame may or may not need a dictionary to be decoded, +and the ID of such a dictionary is not specified. +The decoder must know this information by other means. 
 #### `Frame_Content_Size`
 
@@ -389,7 +378,7 @@ __`Block_Size`__
 The upper 21 bits of `Block_Header` represent the `Block_Size`.
 
 When `Block_Type` is `Compressed_Block` or `Raw_Block`,
-`Block_Size` is the size of `Block_Content` (hence excluding `Block_Header`).
+`Block_Size` is the size of `Block_Content` (hence excluding `Block_Header`).
 
 When `Block_Type` is `RLE_Block`, since `Block_Content`’s size is always 1,
 `Block_Size` represents the number of times this byte must be repeated.
@@ -929,38 +918,41 @@ Note that blocks which are not `Compressed_Block` are skipped, they do not contr
 
 ###### Offset updates rules
 
-The newest offset takes the lead in offset history,
-shifting others back by one rank,
-up to the previous rank of the new offset _if it was present in history_.
-
-__Examples__ :
-
-In the common case, when new offset is not part of history :
-`Repeated_Offset3` = `Repeated_Offset2`
-`Repeated_Offset2` = `Repeated_Offset1`
-`Repeated_Offset1` = `NewOffset`
-
-When the new offset _is_ part of history, there may be specific adjustments.
-
-When `NewOffset` == `Repeated_Offset1`, offset history remains actually unmodified.
-
-When `NewOffset` == `Repeated_Offset2`,
-`Repeated_Offset1` and `Repeated_Offset2` ranks are swapped.
-`Repeated_Offset3` is unmodified.
-
-When `NewOffset` == `Repeated_Offset3`,
-there is actually no difference with the common case :
-all offsets are shifted by one rank,
-`NewOffset` (== `Repeated_Offset3`) becomes the new `Repeated_Offset1`.
-
-Also worth mentioning, the specific corner case when `offset_value` == 3,
-and the literal length of the current sequence is zero.
-In which case , `NewOffset` = `Repeated_Offset1` - 1_byte.
-Here also, from an offset history update perspective, it's just a common case :
-`Repeated_Offset3` = `Repeated_Offset2`
-`Repeated_Offset2` = `Repeated_Offset1`
-`Repeated_Offset1` = `NewOffset` ( == `Repeated_Offset1` - 1_byte )
-
+During the execution of the sequences of a `Compressed_Block`, the
+`Repeated_Offsets`' values are kept up to date, so that they always represent
+the three most-recently used offsets. In order to achieve that, they are
+updated after executing each sequence in the following way:
+
+When the sequence's `offset_value` does not refer to one of the
+`Repeated_Offsets`--when it has value greater than 3, or when it has value 3
+and the sequence's `literals_length` is zero--the `Repeated_Offsets`' values
+are shifted back one, and `Repeated_Offset1` takes on the value of the
+just-used offset.
+
+Otherwise, when the sequence's `offset_value` refers to one of the
+`Repeated_Offsets`--when it has value 1 or 2, or when it has value 3 and the
+sequence's `literals_length` is non-zero--the `Repeated_Offsets` are re-ordered
+so that `Repeated_Offset1` takes on the value of the used Repeated_Offset, and
+the existing values are pushed back from the first `Repeated_Offset` through to
+the `Repeated_Offset` selected by the `offset_value`. This effectively performs
+a single-stepped wrapping rotation of the values of these offsets, so that
+their order again reflects the recency of their use.
+
+The following table shows the values of the `Repeated_Offsets` as a series of
+sequences are applied to them:
+
+| `offset_value` | `literals_length` | `Repeated_Offset1` | `Repeated_Offset2` | `Repeated_Offset3` | Comment |
+|:--------------:|:-----------------:|:------------------:|:------------------:|:------------------:|:-----------------------:|
+| | | 1 | 4 | 8 | starting values |
*** 17169 LINES SKIPPED ***
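Editorial illustration, not part of the imported commit: the rewritten "Offset updates rules" text above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not zstd's decoder. The names `RepeatedOffsets`, `push_new_offset` and `apply_sequence_offset` are hypothetical. The handling of `offset_value` 1 and 2 when `literals_length` is zero (they select `Repeated_Offset2` and `Repeated_Offset3`) is taken from the repeat-offsets section of the format document, which is not shown in this excerpt. Corrupt input (for example `Repeated_Offset1 - 1` reaching zero) is not checked.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: rep[0], rep[1], rep[2] hold Repeated_Offset1, 2 and 3. */
typedef struct {
    uint32_t rep[3];
} RepeatedOffsets;

/* Shift all three values back one rank and make `offset` the new Repeated_Offset1. */
static void push_new_offset(RepeatedOffsets *h, uint32_t offset)
{
    h->rep[2] = h->rep[1];
    h->rep[1] = h->rep[0];
    h->rep[0] = offset;
}

/*
 * Resolve the match offset of one sequence and update the offset history.
 * Returns the offset to use for the match copy.
 * Precondition (assumption of this sketch): offset_value >= 1.
 */
static uint32_t apply_sequence_offset(RepeatedOffsets *h,
                                      uint32_t offset_value,
                                      uint32_t literals_length)
{
    uint32_t offset;
    uint32_t idx;

    assert(offset_value >= 1);

    /* Not a repeat code: the real offset is offset_value - 3. */
    if (offset_value > 3) {
        offset = offset_value - 3;
        push_new_offset(h, offset);
        return offset;
    }

    /* Repeat code 1..3; a zero literals_length shifts the selection by one. */
    idx = (offset_value - 1) + (literals_length == 0 ? 1u : 0u);

    /* offset_value == 3 with no literals: Repeated_Offset1 - 1, treated as a new offset. */
    if (idx == 3) {
        offset = h->rep[0] - 1;
        push_new_offset(h, offset);
        return offset;
    }

    /* Rotate: values ranked before the selected one move back, selected one moves to front. */
    offset = h->rep[idx];
    for (; idx > 0; idx--) {
        h->rep[idx] = h->rep[idx - 1];
    }
    h->rep[0] = offset;
    return offset;
}
```

Starting from the table's initial values (1, 4, 8), a caller would initialise `RepeatedOffsets h = {{1, 4, 8}};` and call `apply_sequence_offset(&h, offset_value, literals_length)` once per decoded sequence. Keeping the rotation as a loop over the selected rank mirrors the "pushed back from the first `Repeated_Offset` through to the selected one" wording, so selecting `Repeated_Offset1` leaves the history unchanged.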