From owner-freebsd-current  Wed Jan 12 13:20:51 2000
Delivered-To: freebsd-current@freebsd.org
Received: from mail.rpi.edu (mail.rpi.edu [128.113.100.7])
	by hub.freebsd.org (Postfix) with ESMTP id 0734A15244
	for <current@FreeBSD.ORG>; Wed, 12 Jan 2000 13:20:46 -0800 (PST)
	(envelope-from drosih@rpi.edu)
Received: from [128.113.24.47] (gilead.acs.rpi.edu [128.113.24.47])
	by mail.rpi.edu (8.9.3/8.9.3) with ESMTP id QAA39524;
	Wed, 12 Jan 2000 16:20:42 -0500
Mime-Version: 1.0
X-Sender: drosih@mail.rpi.edu
Message-Id: <v04210106b4a296b28cc2@[128.113.24.47]>
In-Reply-To: <200001120201.SAA26378@gndrsh.dnsmgr.net>
References: <200001120201.SAA26378@gndrsh.dnsmgr.net>
Date: Wed, 12 Jan 2000 16:21:05 -0500
To: "Rodney W. Grimes" <freebsd@gndrsh.dnsmgr.net>
From: Garance A Drosihn <drosih@rpi.edu>
Subject: Re: Additional option to ls -l for large files
Cc: current@FreeBSD.ORG
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

At 6:01 PM -0800 1/11/00, Rodney W. Grimes wrote:
>Garance wrote:
> > personally, I'd just as soon use K, M, and G and have it mean
> > the base-10 values.  If I'm looking at a decimal number for one
> > file (because it's small enough), I don't want a base-2 version
> > of the similar number for some other (larger) file in the same
> > listing.
> >
> > (ie, whatever letters you use, please just divide the values
> > by 1000 instead of 1024).
>
>Please don't, there is already far too much precedent in both the
>computing world and other commands (df -k and du -k come to mind
>right off the top of my head, sysinstall uses them in the disklabel
>and partition editor, etc)
>
>df man page:
>     -k      Use 1024-byte (1-Kbyte) blocks rather than the default.
>             Note that this overrides the BLOCKSIZE specification
>             from the environment.
>du man page:
>     -k      Display block counts in 1024-byte (1-Kbyte) blocks.

Note that these commands are talking about block sizes.  I.e., you are
getting a BLOCK count, where the BLOCKSIZE is either 512 (the default
value) or some other number (frequently 1024).  If you do choose a
blocksize of 1024, then the number you get happens to be "kilobytes",
but both commands are still talking about a BLOCK count.
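
To make the distinction concrete, here is a minimal Python sketch (an
illustration only, not taken from the df or du sources).  The helper
name block_count is hypothetical, and it assumes a file occupies
exactly its length rounded up to whole blocks, which ignores how a
real filesystem allocates space.

```python
def block_count(nbytes, blocksize=512):
    """Number of blocksize-byte blocks needed to hold nbytes, rounded up."""
    # Hypothetical helper: a real du reports allocated disk blocks, which
    # depend on the filesystem and not only on the file length.
    return (nbytes + blocksize - 1) // blocksize


size = 35071                      # an example byte count
print(block_count(size))          # count of 512-byte blocks
print(block_count(size, 1024))    # count of 1024-byte blocks
```

Either way the result is a count of blocks; with a blocksize of 1024
that count merely coincides with a size in 1024-byte "kilobytes".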

In 'ls' we are not talking about a block count, we are talking about
a byte count.  In the past I have written things which do roughly
what is being proposed here.  In the current 'ls', when you see a
list of files as:

-rw-r--r--  1 gad  staff  19672 Dec 11 03:39 HISTORY
-rw-r--r--  1 gad  staff    411 Dec 11 03:47 Makefile.am
-rw-r--r--  1 gad  staff  11492 Dec 12 23:57 Makefile.in
-rw-r--r--  1 gad  staff   2585 Dec 11 03:39 README-bpe
-rw-r--r--  1 gad  staff   1549 Dec 12 23:55 acconfig.h
-rw-r--r--  1 gad  staff   4396 Dec 12 23:57 aclocal.m4
-rw-r--r--  1 gad  staff   2706 Dec 11 03:39 asciiedit.c
-rw-r--r--  1 gad  staff   2046 Dec 11 03:39 asciisrch.c
-rw-r--r--  1 gad  staff   5755 Dec 11 03:39 backup.c
-rw-r--r--  1 gad  staff  35071 Dec 11 03:38 bpe.1

do you really sit there and think "Let's see, that bpe.1 file is
(35071/1024) kilobytes"?  Being a slave to the decimal system, I
just read the numbers.

The original message proposed:
      a new flag to ls which will together with option -l
      change the unit to kilobytes for files larger than
      one megabyte, to megabytes for files larger than one
      gigabyte and gigabytes for files larger than one terabyte.
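
Read literally, that rule can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the function
name proposed_size, the lowercase k/m/g suffixes, and the truncating
division are all my own choices, and the divisor is left as a
parameter because the proposal does not settle it.

```python
def proposed_size(nbytes, base=1024):
    """Format a byte count per the proposed rule (hypothetical sketch).

    The unit is always one step below the threshold that was crossed:
    kilobytes past one megabyte, megabytes past one gigabyte, and
    gigabytes past one terabyte.  Division truncates (an assumption).
    """
    if nbytes > base ** 4:
        return f"{nbytes // base ** 3}g"
    if nbytes > base ** 3:
        return f"{nbytes // base ** 2}m"
    if nbytes > base ** 2:
        return f"{nbytes // base}k"
    return str(nbytes)


for size in (999123, 1998246, 3 * 1024 ** 3):
    print(size, proposed_size(size), proposed_size(size, base=1000))
```

Passing base=1000 gives the decimal variant, so the two behaviours
can be compared side by side on the same sizes.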

In the past, I have done similar things to this, exactly as the
person suggested.  If you do not stick to the "10's-based" numbers
for this, things get confusing.  Note that you don't switch to
kilobytes until the file is over a megabyte.  You end up dealing
with "10's-based" values *all* the time, and "2's-based" values
some of the time.  I.e., consider these files:

-rw-r--r--  1 gad  staff   1998246 Dec 11 03:39 asciiedit.c
-rw-r--r--  1 gad  staff    999123 Dec 11 03:39 asciisrch.c

The first one changes because it's over 1 megabyte, but the second
one does not change.  So you get:

-rw-r--r--  1 gad  staff     1951k Dec 11 03:39 asciiedit.c
-rw-r--r--  1 gad  staff    999123 Dec 11 03:39 asciisrch.c

Okay now, how many people can look at those lines and immediately
see that the first file is "about" twice the size of the second
file?  Probably nobody, not without pulling out a calculator.  The
"purity" of using 1024 as a divisor buys one nothing in situations
like this.  The problem is that you ARE using the 10's-based number
(in your head) for the "999 thousand" value.  And the 1951k is a
mixture of 10's-based ("1 thousand 951") and 2's-based ("k") values.
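
The arithmetic behind that can be checked with a short Python sketch.
It assumes truncating integer division for the scaled value; the
variable names are only for illustration.

```python
big, small = 1998246, 999123    # the two byte counts in the listing

true_ratio = big / small        # the real size ratio of the two files
shown_1024 = big // 1024        # figure displayed with a 1024 divisor
shown_1000 = big // 1000        # figure displayed with a 1000 divisor

# What a reader estimates when comparing the displayed figure against
# "999 thousand" for the smaller, unscaled file.
apparent_1024 = shown_1024 / (small / 1000)
apparent_1000 = shown_1000 / (small / 1000)

print(f"true ratio:               {true_ratio:.3f}")
print(f"{shown_1024}k vs 999 thousand:   {apparent_1024:.3f}")
print(f"{shown_1000}k vs 999 thousand:   {apparent_1000:.3f}")
```

With the 1024 divisor the displayed figures suggest a ratio of about
1.95, while the 1000 divisor keeps it at essentially 2.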

Point number two:  This option is to keep the number of columns
of the byte-count-field "reasonable".  1 megabyte is 1024*1024.
A 1-megabyte file is therefore 1048576 bytes.  If you make the
switch (to "k") at 1048576 bytes, you have used up an extra
column for no good reason.  So, it only makes sense to make the
switch once a file exceeds 999,999 bytes.  You'll get:

-rw-r--r--  1 gad  staff      976k Dec 11 03:39 asciiedit.c
-rw-r--r--  1 gad  staff    999999 Dec 11 03:39 asciisrch.c

when the first file is one byte larger than the second file. Again,
the "purity" of going with the "2's-based" value is not worth the
confusion caused by it.
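
The column-width point can be sketched the same way.  The helper
show and its parameters are hypothetical, and truncating division is
again an assumption.

```python
def show(nbytes, switch_above, divisor):
    """Print bytes in full up to switch_above, else scaled with a k suffix."""
    if nbytes > switch_above:
        return f"{nbytes // divisor}k"
    return str(nbytes)


# Widest value still printed in full under each switch point.
print(len(show(1048576, 1048576, 1024)))   # binary switch point
print(len(show(999999, 999999, 1000)))     # decimal switch point

# A 1,000,000-byte file, one byte past the decimal switch point,
# scaled with each divisor.
print(show(1000000, 999999, 1024))
print(show(1000000, 999999, 1000))
```

Switching at 1048576 leaves a seven-digit number in the column, while
switching above 999,999 caps unscaled values at six digits.  Past
that point the 1024 divisor turns 1,000,000 bytes into 976k, which
looks smaller than the 999999 next to it; the 1000 divisor gives
1000k.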

This is what my experience has been when I've done similar things.
I actually wrote at least one thing using the 1024 divisor, and
later switched it to using 1000 because using 1024 was just too
confusing.  Thus, my suggestion that people should consider going
with the base-10 values.


---
Garance Alistair Drosehn           =   gad@eclipse.acs.rpi.edu
Senior Systems Programmer          or  drosih@rpi.edu
Rensselaer Polytechnic Institute

