From: Giorgos Keramidas
To: Kurt Buff
Cc: James Long, freebsd-questions@freebsd.org
Date: Wed, 3 Jan 2007 21:09:35 +0200
Subject: Re: Batch file question - average size of file in directory
Message-ID: <20070103190935.GA7164@kobe.laptop>

On 2007-01-03 10:42, Kurt Buff wrote:
> On 1/2/07, James Long wrote:
>
> >Hi, Kurt.
> >
> >Can I make some assumptions that simplify things?  No kinky filenames,
> >just [a-zA-Z0-9.].  My approach specifically doesn't like colons or
> >spaces, I bet.  Also, you say gzipped, so I'm assuming it's ONLY gzip,
> >no bzip2, etc.
>
> Right, no other compression types - just .gz.
>
> Here's a small snippet of the directory listing:
>
> -rw-r----- 1 kurt kurt 108208 Dec 21 06:15 dummy-zKLQEWrDDOZh
> -rw-r----- 1 kurt kurt  24989 Dec 28 17:29 dummy-zfzaEjlURTU1
> -rw-r----- 1 kurt kurt  30596 Jan  2 19:37 stuff-0+-OvVrXcEoq.gz
> -rw-r----- 1 kurt kurt   2055 Dec 22 20:25 stuff-0+19OXqwpEdH.gz
> -rw-r----- 1 kurt kurt  13781 Dec 30 03:53 stuff-0+1bMFK2XvlQ.gz
> -rw-r----- 1 kurt kurt  11485 Dec 20 04:40 stuff-0+5jriDIt0jc.gz
>
>> Here's a first draft [...]
>
> Hmmm....
>
> That's the same basic approach that Giorgos took, to uncompress the
> file and count bytes with wc. I'm liking the 'zcat -l' construct, as
> it looks more flexible, but then I have to parse the output, probably
> with grep and cut.

Excellent.  I didn't know about the -l option of gzip(1) until today :)

You can easily extract the uncompressed size, because it's always in
column 2 and it contains only numeric digits:

    gzip -l *.gz *.Z *.z | awk '{print $2}' | grep '[[:digit:]]\+'

Then you can feed the resulting stream of uncompressed sizes to the awk
script I sent before :)
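
For illustration, here is a minimal sketch of how the extraction and the averaging could be folded into a single awk pass. The awk script from the earlier message is not reproduced in this thread, so the function below is a hypothetical stand-in, not that script, and the name avg_uncompressed is made up for this example. One assumption worth checking on your system: when gzip -l is given more than one file, it usually appends a summary row whose name column reads "(totals)". That row is numeric in column 2 as well, so the sketch skips it to keep it from skewing the average.

```shell
# Hypothetical helper (not the script from the earlier message):
# reads `gzip -l` output on stdin and prints the mean uncompressed size.
avg_uncompressed() {
    awk '
        # Keep only rows whose second column is purely numeric; this drops
        # the header row.  Also skip the "(totals)" summary row that gzip
        # is assumed to print when it lists several files.
        $2 ~ /^[0-9]+$/ && $NF != "(totals)" { sum += $2; n++ }
        END {
            if (n > 0)
                printf "%.1f\n", sum / n
            else
                print "no files"
        }
    '
}
```

Usage would look something like `gzip -l *.gz | avg_uncompressed`. Matching the number in awk itself avoids the extra grep, and it sidesteps the `\+` operator, which is an extension to basic regular expressions that not every grep accepts.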