From: Lasse Collin
To: Matthias Andree
Cc: ports@freebsd.org, Christian Weisgerber, portmgr@freebsd.org
Date: Tue, 22 Jun 2010 12:14:54 +0300
Subject: Re: FreeBSD ports USE_XZ critical issue on low-RAM computers
Message-Id: <201006221214.54641.lasse.collin@tukaani.org>
In-Reply-To: <4C1E5FC7.7030702@FreeBSD.org>
References: <4C1BA4D4.9000205@FreeBSD.org> <201006201823.03817.lasse.collin@tukaani.org> <4C1E5FC7.7030702@FreeBSD.org>
List-Id: Porting software to FreeBSD

On 2010-06-20 Matthias Andree wrote:
> $ export XZ_OPT=-9
> $ export XZ_OPT_OVERRIDES=-M40%
> $ xz -Mmax blah.tar
>
> would result in the same behaviour as:
>
> $ xz -9 -M40% blah.tar
> # here, the XZ_OPT_OVERRIDES cancels -Mmax from command line
>
> and could mean: xz trying -9, but lowering that as necessary to meet
> the -M40% limit.

This or a config file should solve the problem that removing the default
memory usage limit would create for me and some other people -- at least
as long as we remember to add the environment variable or config file on
each system. ;-)

An environment variable could be easier than a config file. Adding
support for it is an almost trivial change to the code, and it is easier
to use on a per-command basis if needed for some reason.

It's still possible that applications using liblzma will use too high
settings on low-memory systems, but so far I haven't seen such problems
in real-world situations like I have with scripts that use the xz tool.
So at least for now there's no need to think about controlling liblzma
e.g. via an environment variable, and hopefully it will never be needed.

> Environment variables with a big banner "don't use XZ_OPT_OVERRIDES
> in scripts, it is reserved for the user" might work. Then everybody
> can complain to the script author if it touches XZ_OPT_OVERRIDES.

I'm sure it will work.

> > Sure, it cannot "fully" parallelize, whatever that means. But the
> > amount of parallelization that is possible is welcomed by many
> > others (you are the very first person to think it's useless). For
> > example, 7-Zip can use any number of threads with .xz files and
> > there are some liblzma-based experimental tools too.
>
> Fully parallelizable means negligible overhead on the algorithmic side,
> i.e. near 100% speedup with each new processor added (considering
> Amdahl's law and later refinements).
>
> If compressing position 20-40MB in a file depends on the outcome of
> compressing positions 0-20MB, the task is not parallelizable at all.
>
> If two threads manage 140% of throughput of one, it's not "fully"
> parallelizable.

OK, so it's fully parallelizable only with a simple method that splits
the uncompressed data into chunks that are compressed independently.
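
The simple chunked method can be sketched roughly as below. This is only
an illustration using Python's standard lzma module, not how xz or
liblzma implement threading. The function name compress_chunked, the
chunk size and the thread count are all made up for the example. Each
chunk becomes its own complete .xz stream, and the streams are simply
concatenated. Whether threads give a real speedup depends on the
interpreter and should be measured rather than assumed.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size (256 KiB), chosen only for illustration.
CHUNK_SIZE = 256 * 1024


def compress_chunked(data: bytes, chunk_size: int = CHUNK_SIZE,
                     threads: int = 4) -> bytes:
    """Compress each chunk as an independent .xz stream and concatenate."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        # Empty input still needs one valid (empty) .xz stream.
        return lzma.compress(b"")
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in input order, so the output streams
        # stay in the same order as the chunks they came from.
        streams = list(pool.map(lzma.compress, chunks))
    return b"".join(streams)


if __name__ == "__main__":
    sample = b"FreeBSD ports and xz memory usage limits. " * 50_000
    whole = lzma.compress(sample)
    chunked = compress_chunked(sample)
    # lzma.decompress() accepts concatenated streams and joins the results.
    assert lzma.decompress(chunked) == sample
    print("single stream:", len(whole), "bytes")
    print("chunked      :", len(chunked), "bytes")
```

No chunk can refer back to data in an earlier chunk, because every
chunk is compressed with no knowledge of the others.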
This can decrease compression ratio, but often not too much if chunk
size is big enough. The definition of "too much" naturally depends on
the specific use case.

There are non-fully parallelizable ways too, with their own advantages
and disadvantages. In the long term there will probably be a few
different threading methods in liblzma.

> > Next question could be how to determine how many threads could be
> > OK for multithreaded decompression. It doesn't "fully" parallelize
> > either, and would be possible only in certain situations. There
> > too the memory usage grows quickly when threads are added. To me,
> > a memory usage limit together with a limit on number of threads
> > looks good; with no limits, the decompressor could end up reading
> > the whole file into RAM (and swap). Threaded decompression isn't
> > so important though, so I'm not even sure if I will ever implement
> > it.
>
> The easy answer for you is a "-j N" option like make's, with a
> default of 1. Since threads share their address space, the --memory
> option can easily be interpreted either way: overall or per-thread.

My above description was not good. See my previous email and how a
default limit could be useful here even if single-threaded operation
should have no limits by default.

> I'd like to avoid this discussion though with the large audiences of
> ports@ and portmgr@ involved.

Feel free to adjust the recipient list.

> I think for adoption in infrastructure,
> we need consistency across all computers before all else.

I can understand that. For me it is important that if the _default_
memory usage limit is thrown away, there needs to be something else to
solve the problems that the default memory usage limit was designed to
fix.

I have got some useful ideas from this discussion, thanks to you and
others commenting on this thread. I will remove the default limit and
probably add support for another environment variable. Hopefully this
will make most people somewhat happy.
I'm sorry about the hassle that this issue has created.

> > The dictionary size is only one thing to get high compression. It
> > depends on the file. Some files benefit a lot when dictionary size
> > increases while others benefit mostly from spending more CPU
> > cycles. That's why there is the --extreme option. It allows
> > improving the compression ratio by spending more time without
> > requiring so much RAM.
>
> The manpage states "factor of two", which barely qualifies as
> "extreme" in my eyes. "extreme" would be an order of magnitude
> (10x).

The option name isn't the greatest; I'm generally bad at naming things.
The time increase with "xz -2e" is around 10x compared to "xz -2",
because it turns a fast mode into a slow mode without increasing the
dictionary size. With "xz -6" and "xz -6e" the speed difference is not
necessarily even 2x. Often "xz -6e" saves only 0.1-0.5 % compared to
"xz -6" (sometimes much more though), so the extra CPU cycles with big
files often aren't worth it. It depends on what the use case is.

-- 
Lasse Collin | IRC: Larhzu @ IRCnet & Freenode
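
The -2 versus -2e and -6 versus -6e comparison above can be checked on
one's own data with a small measurement like the following. This is a
rough sketch using Python's standard lzma module instead of the xz
command line tool. The helper names measure and sample_text and the
generated test data are assumptions made up for the example, and the
timings will vary with the input and the machine, so no particular
ratio should be expected from it.

```python
import lzma
import random
import time


def sample_text(n_words: int = 200_000, seed: int = 0) -> bytes:
    """Build deterministic, moderately compressible test data (made up)."""
    rng = random.Random(seed)
    vocab = ["xz", "liblzma", "ports", "memory", "limit", "preset",
             "dictionary", "threads", "FreeBSD", "compress"]
    return " ".join(rng.choices(vocab, k=n_words)).encode("ascii")


def measure(data: bytes, preset: int) -> tuple:
    """Return (seconds, compressed size) for one compression run."""
    start = time.perf_counter()
    out = lzma.compress(data, preset=preset)
    elapsed = time.perf_counter() - start
    # Make sure the output really decodes back to the input.
    assert lzma.decompress(out) == data
    return elapsed, len(out)


if __name__ == "__main__":
    data = sample_text()
    for level in (2, 6):
        # PRESET_EXTREME is the library counterpart of the "e" suffix.
        for label, preset in ((f"-{level}", level),
                              (f"-{level}e", level | lzma.PRESET_EXTREME)):
            seconds, size = measure(data, preset)
            print(f"{label:>4}: {seconds:8.3f} s  {size:9d} bytes")
```

Real files should be used instead of the generated text when deciding
whether the extra CPU time is worth it for a given use case.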