Date: Wed, 3 Jan 2007 21:09:35 +0200
From: Giorgos Keramidas <keramida@ceid.upatras.gr>
To: Kurt Buff <kurt.buff@gmail.com>
Cc: James Long <list@museum.rain.com>, freebsd-questions@freebsd.org
Subject: Re: Batch file question - average size of file in directory
Message-ID: <20070103190935.GA7164@kobe.laptop>
In-Reply-To: <a9f4a3860701031042u45757b7ag897d55e1969f84b8@mail.gmail.com>
References: <20070102200721.31D1C16A517@hub.freebsd.org> <20070103035000.GA99263@ns.umpquanet.com> <a9f4a3860701031042u45757b7ag897d55e1969f84b8@mail.gmail.com>

On 2007-01-03 10:42, Kurt Buff <kurt.buff@gmail.com> wrote:
> On 1/2/07, James Long <list@museum.rain.com> wrote:
> <snip my problem description>
> >Hi, Kurt.
> >
> >Can I make some assumptions that simplify things?  No kinky filenames,
> >just [a-zA-Z0-9.].  My approach specifically doesn't like colons or
> >spaces, I bet.  Also, you say gzipped, so I'm assuming it's ONLY gzip,
> >no bzip2, etc.
>
> Right, no other compression types - just .gz.
>
> Here's a small snippet of the directory listing:
>
> -rw-r-----  1 kurt  kurt   108208 Dec 21 06:15 dummy-zKLQEWrDDOZh
> -rw-r-----  1 kurt  kurt    24989 Dec 28 17:29 dummy-zfzaEjlURTU1
> -rw-r-----  1 kurt  kurt    30596 Jan  2 19:37 stuff-0+-OvVrXcEoq.gz
> -rw-r-----  1 kurt  kurt     2055 Dec 22 20:25 stuff-0+19OXqwpEdH.gz
> -rw-r-----  1 kurt  kurt    13781 Dec 30 03:53 stuff-0+1bMFK2XvlQ.gz
> -rw-r-----  1 kurt  kurt    11485 Dec 20 04:40 stuff-0+5jriDIt0jc.gz
>
>> Here's a first draft [...]
>
> Hmmm....
>
> That's the same basic approach that Giorgos took, to uncompress the
> file and count bytes with wc. I'm liking the 'zcat -l' construct, as
> it looks more flexible, but then I have to parse the output, probably
> with grep and cut.
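
Excellent.  I didn't know about the -l option of gzip(1) until today :)

You can easily extract the uncompressed size, because it's always in
column 2 and it contains only numeric digits:

    gzip -l *.gz *.Z *.z | awk '$NF != "(totals)" {print $2}' | grep -E '^[0-9]+$'

The grep drops the header line, and the awk test skips the "(totals)"
summary line that gzip -l appends when it lists more than one file;
counting that line as a file would skew the average.

Then you can feed the resulting stream of uncompressed sizes to the awk
script I sent before :)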
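
For reference, here is a minimal sketch of the whole calculation in one
step.  It is not the awk script from the earlier message (that one isn't
reproduced here), and the function name avg_uncompressed is made up for
this example.  It assumes the usual `gzip -l' layout: a header line, one
row per file with the uncompressed size in column 2, and a trailing
"(totals)" row when more than one file is listed.

```shell
# Hypothetical helper (name invented for this sketch): reads `gzip -l`
# output on stdin and prints the file count and the mean uncompressed
# size in bytes.
avg_uncompressed() {
    awk '
        # Keep only per-file rows: column 2 is all digits and the last
        # field is not the "(totals)" summary row.
        $2 ~ /^[0-9]+$/ && $NF != "(totals)" { sum += $2; n++ }
        END {
            if (n > 0)
                printf "%d files, average %.1f bytes\n", n, sum / n
            else
                print "no files"
        }
    '
}

# Usage, assuming the current directory holds the .gz files:
#   gzip -l *.gz | avg_uncompressed
```

Doing the filtering and the sum in a single awk avoids the extra grep,
at the cost of skipping a file that is literally named "(totals)".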
