Date:      Sat, 26 Jan 2002 16:39:38 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        "Gary W. Swearingen" <swear@blarg.net>
Cc:        Brad Knowles <brad.knowles@skynet.be>, "Matthew D. Fuller" <fullermd@over-yonder.net>, Mike Meyer <mwm-dated-1012390758.50933b@mired.org>, chip <chip@wiegand.org>, freebsd-chat@FreeBSD.ORG
Subject:   Re: Bad disk partitioning policies (was: "Re: FreeBSD Intaller (was   "Re: ... RedHat ...")")
Message-ID:  <3C534C4A.35673769@mindspring.com>
References:  <20020123124025.A60889@HAL9000.wox.org> <3C4F5BEE.294FDCF5@mindspring.com> <20020123223104.SM01952@there> <p0510122eb875d9456cf4@[10.0.1.3]> <15440.35155.637495.417404@guru.mired.org> <p0510123fb876493753e0@[10.0.1.3]> <15440.53202.747536.126815@guru.mired.org> <p05101242b876db6cd5d7@[10.0.1.3]> <15441.17382.77737.291074@guru.mired.org> <p05101245b8771d04e19b@[10.0.1.3]> <20020125212742.C75216@over-yonder.net> <p05101203b8788a930767@[10.0.1.14]> <gc1ygc7sfi.ygc@localhost.localdomain>

"Gary W. Swearingen" wrote:
> It'd be good to have this documented after some more experts express a
> common opinion on whether absolute or relative size of the reserve
> matters and how they'd choose the numbers.  I'd hope they'd speak of
> partition size instead of disk size.  And whether the value should
> have any dependence on tunefs's -o value.
> 
> I suspect that the answer is "absolute", except for the effect big
> partitions have on the willingness of the SA to reduce risks by
> increasing their safety margins, at the cost of cheap disk space.

It's partition size, but X*(N + M) = (X*N) + (X*M).

Multiplication distributes over addition.  8-).
So yes: it's absolute.

Basically, at 85% fill in a perfect hash, there is 0%
fragmentation; at 90% fill that goes up to 7%.

It's really very easy to understand: you are using a
statistical function to select a non-colliding subset
of a set, and you want to know at what point you end
up with diminishing returns, and collisions occur.

If you have a friend who is a statistician, you should
ask them to explain "The Birthday Paradox" to you.
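
For readers without a statistician handy, the Birthday Paradox can
be sketched in a few lines. This is only an illustration of the
general statistics, not anything taken from the FFS code; the
function name and its parameters are made up for the example, and
it assumes every slot is equally likely to be picked.

```python
def birthday_collision_probability(picks: int, slots: int) -> float:
    """Probability that at least two of `picks` uniform random choices
    among `slots` equally likely values land on the same value."""
    if picks > slots:
        return 1.0  # pigeonhole: a collision is certain
    p_all_distinct = 1.0
    for i in range(picks):
        # The (i+1)-th pick must avoid the i values already taken.
        p_all_distinct *= (slots - i) / slots
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for people in (10, 23, 50):
        p = birthday_collision_probability(people, 365)
        print(f"{people} people, 365 days: P(collision) = {p:.3f}")
```

With 23 picks out of 365 slots the collision probability is already
a little over one half, which is why collisions show up long before
the set is anywhere near full.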


> One problem is that (according to tunefs(8) man page) if one uses 5% or
> less, the layout algorithm optimizes for defraging and slows down writes
> "greatly".  I wonder if that algorithm is obsolete.  (But one can force
> it to optimize for time with tunefs.)

No.  The problem is that a 95% hash fill is considered to
be unacceptable, since the collision rate goes up to an
"unacceptably high" value.

In layman's terms, when you randomly pick a number, and then
hash it with a perfect hash function to get a block offset
of a block you wish to allocate, you will find that the
block you have picked is already allocated, and you have to
do collision handling.
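
A rough way to put numbers on this: if blocks are picked uniformly
at random (a simplifying assumption; the real allocator is not this
naive), the chance that a pick is already allocated equals the fill
fraction, and the number of picks needed to find a free block
follows a geometric distribution. The helper name below is
hypothetical and exists only for this sketch.

```python
def expected_probes(fill: float) -> float:
    """Expected number of uniform random picks needed to hit a free
    block when a fraction `fill` of all blocks is already allocated.

    Each pick succeeds with probability (1 - fill), so the mean of
    the geometric distribution is 1 / (1 - fill).
    """
    if not 0.0 <= fill < 1.0:
        raise ValueError("fill must be in the range [0.0, 1.0)")
    return 1.0 / (1.0 - fill)


if __name__ == "__main__":
    for fill in (0.85, 0.90, 0.95):
        probes = expected_probes(fill)
        print(f"{fill:.0%} full: about {probes:.1f} picks per allocation")
```

Under this toy model the cost is roughly 6.7 picks at 85% full, 10
at 90%, and 20 at 95%, so the last few percent of space are
disproportionately expensive to use.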

The "fragmentation avoidance" changes it from a "first fit"
to a "best fit" algorithm for the allocation you want to
make.  It is this change that slows down writes "greatly".
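
The difference in cost between the two policies can be seen in a
generic sketch over a list of free runs. This is textbook first fit
and best fit, not the FFS allocator itself; `free_runs` (lengths of
contiguous free space, in blocks) and both function names are
assumptions made for the example.

```python
from typing import Optional, Sequence


def first_fit(free_runs: Sequence[int], want: int) -> Optional[int]:
    """Return the index of the first free run that is large enough.

    Stops scanning as soon as any candidate is found.
    """
    for index, length in enumerate(free_runs):
        if length >= want:
            return index
    return None


def best_fit(free_runs: Sequence[int], want: int) -> Optional[int]:
    """Return the index of the smallest free run that is large enough.

    Must look at every run, because a tighter fit may come later.
    """
    best: Optional[int] = None
    for index, length in enumerate(free_runs):
        if length >= want and (best is None or length < free_runs[best]):
            best = index
    return best


if __name__ == "__main__":
    runs = [8, 3, 5, 4]
    print("first fit picks run", first_fit(runs, 4))
    print("best fit picks run", best_fit(runs, 4))
```

First fit takes the 8-block run at index 0 and stops; best fit scans
the whole list to find the exact 4-block run at index 3. Less space
is wasted, but every allocation pays for the full scan.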

You'd probably benefit from reading the original FFS paper.

Over the years, FreeBSD has eroded the free reserve until
it is very small.

> The tunefs(8) man page leaves me wondering, when it says
> 
>     This value can be set to zero, however up to a factor of three in
>     throughput will be lost over the performance obtained at a 10%
>     threshold.
> 
> whether that's true even when the filesystem is far from full or only
> when comparing, say, two filesystems with 0-10% free space (and, I
> suspect, only a factor of three near 0%).

You know, you could worry about something else... like
the fact that a formatted disk has less capacity than an
unformatted one.

What kind of "idiot" would format the disk and thus lose
all that capacity?  Can't we reduce the formatting
overhead to 1/3 of the manufacturer default?

8-) 8-)

-- Terry
