Date: Wed, 3 Jan 2007 21:09:35 +0200
From: Giorgos Keramidas <keramida@ceid.upatras.gr>
To: Kurt Buff <kurt.buff@gmail.com>
Cc: James Long <list@museum.rain.com>, freebsd-questions@freebsd.org
Subject: Re: Batch file question - average size of file in directory
Message-ID: <20070103190935.GA7164@kobe.laptop>
In-Reply-To: <a9f4a3860701031042u45757b7ag897d55e1969f84b8@mail.gmail.com>
References: <20070102200721.31D1C16A517@hub.freebsd.org> <20070103035000.GA99263@ns.umpquanet.com> <a9f4a3860701031042u45757b7ag897d55e1969f84b8@mail.gmail.com>

On 2007-01-03 10:42, Kurt Buff <kurt.buff@gmail.com> wrote:
> On 1/2/07, James Long <list@museum.rain.com> wrote:
> <snip my problem description>
> >Hi, Kurt.
> >
> >Can I make some assumptions that simplify things?  No kinky filenames,
> >just [a-zA-Z0-9.].  My approach specifically doesn't like colons or
> >spaces, I bet.  Also, you say gzipped, so I'm assuming it's ONLY gzip,
> >no bzip2, etc.
>
> Right, no other compression types - just .gz.
>
> Here's a small snippet of the directory listing:
>
> -rw-r-----  1 kurt  kurt  108208 Dec 21 06:15 dummy-zKLQEWrDDOZh
> -rw-r-----  1 kurt  kurt   24989 Dec 28 17:29 dummy-zfzaEjlURTU1
> -rw-r-----  1 kurt  kurt   30596 Jan  2 19:37 stuff-0+-OvVrXcEoq.gz
> -rw-r-----  1 kurt  kurt    2055 Dec 22 20:25 stuff-0+19OXqwpEdH.gz
> -rw-r-----  1 kurt  kurt   13781 Dec 30 03:53 stuff-0+1bMFK2XvlQ.gz
> -rw-r-----  1 kurt  kurt   11485 Dec 20 04:40 stuff-0+5jriDIt0jc.gz
>
>> Here's a first draft [...]
>
> Hmmm....
>
> That's the same basic approach that Giorgos took, to uncompress the
> file and count bytes with wc. I'm liking the 'zcat -l' construct, as
> it looks more flexible, but then I have to parse the output, probably
> with grep and cut.

Excellent.  I didn't know about the -l option of gzip(1) until today :)

You can easily extract the uncompressed size, because it's always in
column 2 and it contains only numeric digits:

    gzip -l *.gz *.Z *.z | awk '{print $2}' | grep '[[:digit:]]\+'

Then you can feed the resulting stream of uncompressed sizes to the awk
script I sent before :)
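
The awk script mentioned above is not part of this message, so here is
a minimal stand-in sketch that does the filtering and the averaging in
one awk pass. The function name avg_uncompressed and the output format
are made up for illustration. Note that when gzip -l is given more than
one file it also prints a "(totals)" line whose second column is
numeric, so the sketch skips that line to avoid counting the sum as one
more file.

```shell
#!/bin/sh
# avg_uncompressed: hypothetical helper, not from the original thread.
# Reads `gzip -l` output on stdin and prints the number of files and
# the average of the "uncompressed" column (field 2).
avg_uncompressed() {
    awk '
        # Keep only rows whose 2nd field is purely numeric; this drops
        # the header row.  Also drop the "(totals)" summary row that
        # gzip -l appends when it lists several files.
        $2 ~ /^[0-9]+$/ && $NF != "(totals)" { sum += $2; n++ }
        END {
            if (n > 0)
                printf "%d files, average %.1f bytes\n", n, sum / n
        }
    '
}

# Usage, assuming the .gz files live in the current directory:
#   gzip -l *.gz | avg_uncompressed
```

Keep in mind that the gzip format stores the uncompressed size modulo
2^32, so the sizes reported by gzip -l are wrong for files of 4 GB or
more.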