Date: Wed, 15 Mar 1995 00:55:38 +0100 (MET)
From: J Wunsch <j@uriah.heep.sax.de>
To: freebsd-hackers@FreeBSD.org (FreeBSD hackers)
Subject: Re: SCSI ASC-ASCQ descriptions
Message-ID: <199503142355.AAA01220@uriah.heep.sax.de>
In-Reply-To: <199503142039.PAA00285@hda.com> from "Peter Dufault" at Mar 14, 95 03:39:44 pm
As Peter Dufault wrote:
>
> > I'm really tempted to make a program to do this... :)
>
> Yes, I thought of that too. I even went through the effort of seeing
> how many unique words there are (about 300).
>
> If you had a clever way of finding "good overlap" I think you
> could cut the size in half or more.
Well, in this case, even a rather simple compression scheme will do
it. Find the most common words, and -- since they consist only of
ASCII characters -- assign them ``abbreviations'' in the range of 0x80
and up.
A quick glance at the file /COPYRIGHT gave me (for all words that
appear at least three times):
$ perl -e 'while(<>) {foreach $word (split) {$sums{$word}++;}}
> $xx = 0x80;
> foreach $key (sort {$sums{$b} <=> $sums{$a}} (keys(%sums))) {
> printf "%s => 0x%2x\n", $key, $xx++ unless $sums{$key} <= 2;
> }' < /COPYRIGHT
the => 0x80
of => 0x81
and => 0x82
OR => 0x83
OF => 0x84
in => 0x85
=> 0x86
following => 0x87
software => 0x88
University => 0x89
this => 0x8a
THE => 0x8b
The => 0x8c
ANY => 0x8d
are => 0x8e
or => 0x8f
AND => 0x90
IEEE => 0x91
by => 0x92
to => 0x93
with => 0x94
IN => 0x95
documentation => 0x96
In => 0x97
is => 0x98
documentation. => 0x99
California. => 0x9a
from => 0x9b
must => 0x9c
Regents => 0x9d
copyright => 0x9e
portions => 0x9f
conditions => 0xa0
All => 0xa1
This is only a quick hack -- I didn't make any attempt to optimize
it, and note also the ``null'' word (0x86).
I remember that Turbo Pascal versions 2 and 3 used a similar scheme for
their error messages...
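For illustration only, here is a rough sketch of how the word-to-byte
scheme described above could be applied and reversed. It is written in
Python rather than Perl, and the function names (build_table, compress,
expand) and the sample sentence are made up for this example. It assumes
pure-ASCII input and collapses every run of whitespace to a single space,
so it is a toy, not a finished tool.

```python
def build_table(text, min_count=3, first_code=0x80):
    """Map each word seen at least min_count times to a one-byte code."""
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    # Most frequent first; ties stay in first-seen order (stable sort).
    common = sorted((w for w in counts if counts[w] >= min_count),
                    key=lambda w: -counts[w])
    # Only the byte values first_code..0xff are available as codes.
    common = common[:0x100 - first_code]
    return {word: first_code + i for i, word in enumerate(common)}


def compress(text, table):
    """Replace every word found in the table by its one-byte code."""
    out = bytearray()
    for i, word in enumerate(text.split()):
        if i:
            out.append(0x20)              # single space between words
        if word in table:
            out.append(table[word])
        else:
            out += word.encode("ascii")   # raises on non-ASCII input
    return bytes(out)


def expand(data, table):
    """Undo compress(): turn code bytes back into their words."""
    reverse = {code: word for word, code in table.items()}
    return "".join(reverse[b] if b in reverse else chr(b) for b in data)


# Hypothetical sample text; min_count is lowered to 2 to keep it short.
sample = "the cat and the dog and the bird saw the cat"
table = build_table(sample, min_count=2)
packed = compress(sample, table)
print(len(sample), "->", len(packed), "bytes")
print(expand(packed, table) == sample)
```

Keep in mind that the table has to be stored as well, so the scheme
only pays off when the common words are repeated often enough.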
--
cheers, J"org
joerg_wunsch@uriah.heep.sax.de -- http://www.sax.de/~joerg/
Never trust an operating system you don't have sources for. ;-)
