Date:      Thu, 26 Jan 2012 09:54:22 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-hackers@freebsd.org
Cc:        Artem Belevich <art@freebsd.org>, Edward Napierała <trasz@freebsd.org>
Subject:   Re: Speeding up the loader(8).
Message-ID:  <201201260954.23179.jhb@freebsd.org>
In-Reply-To: <CAFqOu6ivT34T_RGHxuWZOTk2mCJ29P6wLY3RHt7=S-wH-Y0eYg@mail.gmail.com>
References:  <20120123215503.GA64787@geosci> <CAFqOu6ivT34T_RGHxuWZOTk2mCJ29P6wLY3RHt7=S-wH-Y0eYg@mail.gmail.com>

On Tuesday, January 24, 2012 1:22:57 pm Artem Belevich wrote:
> 2012/1/23 Edward Tomasz Napierała <trasz@freebsd.org>:
> > Some time ago I've spent some time on trying to speed up loading
> > modules by the loader(8).  Result can be found at:
> >
> > http://people.freebsd.org/~trasz/fast-loader-3.diff
> >
> > This patch solves three issues:
> >
> > 1. As it is now, the code in biosdisk.c tries very hard to split
> >   reasonably sized (up to 64kB, IIRC) requests into smaller ones.

It is more that the I/O's cannot cross a 64kb boundary.  This is due to a
limitation in old disk controllers.  Newer versions of EDD provide flags
that indicate whether a device needs this to be honored or not.  You could
use these flags from EDD to determine if the splitting should be used or not
which might help with your case while still being safe for older devices (and
for some more limited devices such as flash).  EDD3 can also let you specify
the raw 64-bit physical address to write the bits into rather than always
using a bounce buffer in the low 1MB.  This would also be a good thing to
take advantage of.
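
As a rough illustration of the idea (not the actual biosdisk.c code), the
splitting could be made conditional along these lines.  The names
DEV_DMA_BOUNDARY_OK, chunk_size() and count_requests() are made up for this
sketch; the flag merely stands in for whatever EDD reports for the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	DMA_BOUNDARY		0x10000u	/* 64kB */

/* Hypothetical flag: the device copes with boundary crossings itself. */
#define	DEV_DMA_BOUNDARY_OK	0x1u

/*
 * Return how many bytes of a transfer that starts at physical address
 * 'paddr' may be issued as a single request.
 */
static size_t
chunk_size(uint32_t paddr, size_t remaining, unsigned dev_flags)
{
	size_t to_boundary;

	assert(remaining > 0);
	if (dev_flags & DEV_DMA_BOUNDARY_OK)
		return (remaining);	/* no splitting needed */

	/* Bytes left before the next 64kB boundary. */
	to_boundary = DMA_BOUNDARY - (paddr & (DMA_BOUNDARY - 1));
	return (remaining < to_boundary ? remaining : to_boundary);
}

/* Count the requests needed to move 'len' bytes starting at 'paddr'. */
static unsigned
count_requests(uint32_t paddr, size_t len, unsigned dev_flags)
{
	unsigned n = 0;

	while (len > 0) {
		size_t c = chunk_size(paddr, len, dev_flags);

		paddr += (uint32_t)c;
		len -= c;
		n++;
	}
	return (n);
}
```

With the flag clear, a 4kB read into a buffer at 0xFF00 is split in two at
the boundary; with the flag set it goes out as one request.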

> > 2. The code in biosdisk.c rereads the partition table and probably
> >   some filesystem metadata every time a file gets opened, i.e.
> >   for every module.  These reads bypass the bcache.
> >
> > 3. The code in bcache.c doesn't really implement an LRU - it implements
> >   'least recently added' algorithm, i.e. a kind of queue.  Not that
> >   it matters much, since it flushes the elements two seconds after
> >   caching them anyway.  I replaced it with Least Frequently Used.
> >   LRU didn't behave well, as it tended to replace metadata with data
> >   used only once.
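
To make the difference in point 3 concrete, here is a small sketch of the two
victim-selection policies.  It is not taken from bcache.c or from the patch;
struct cache_slot and both function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache slot; the real bcache layout may differ. */
struct cache_slot {
	int		blkno;	/* cached block number */
	unsigned	hits;	/* how often the slot has been accessed */
	unsigned	stamp;	/* insertion order, smaller is older */
};

/* 'Least recently added': evict whatever was inserted first. */
static size_t
victim_oldest(const struct cache_slot *s, size_t n)
{
	size_t i, v = 0;

	assert(n > 0);
	for (i = 1; i < n; i++)
		if (s[i].stamp < s[v].stamp)
			v = i;
	return (v);
}

/* Least Frequently Used: evict the fewest hits, oldest on a tie. */
static size_t
victim_lfu(const struct cache_slot *s, size_t n)
{
	size_t i, v = 0;

	assert(n > 0);
	for (i = 1; i < n; i++)
		if (s[i].hits < s[v].hits ||
		    (s[i].hits == s[v].hits && s[i].stamp < s[v].stamp))
			v = i;
	return (v);
}
```

If slot 0 holds a metadata block that was added first but hit five times, and
the other slots hold data blocks hit once each, the queue-like policy throws
away the metadata while LFU keeps it.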

These sound reasonable, though I suspect they are in part due to dealing with
floppies where the user can swap out the disk and we have no way of
noticing otherwise.  However, we could possibly adjust some behavior to cache
the bits if the disk is not a floppy drive.

> 4. it flushes cache on access to a different drive which means that
> cache does not help on multi-disk ZFS setups.

I believe this is also necessary to deal with floppies and the fact that you
don't have a reliable way of knowing if a floppy has changed.
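
One possible way to keep the floppy safety while not penalizing multi-disk
setups is to tag cache entries with the drive and only drop the floppy ones
on a drive switch.  The sketch below assumes the usual BIOS convention that
drive numbers below 0x80 are floppies; struct slot, switch_drive() and
lookup() are hypothetical names, and other removable media are ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache entry tagged with the BIOS drive it came from. */
struct slot {
	int	unit;	/* BIOS drive number, -1 if the slot is empty */
	long	blkno;	/* block number on that drive */
};

/* BIOS convention: drive numbers below 0x80 are floppies. */
static bool
is_floppy(int unit)
{
	return (unit >= 0 && unit < 0x80);
}

/*
 * Called when I/O moves from one drive to another.  Entries that belong
 * to floppies are dropped since the media may have been swapped; entries
 * for hard disks are kept.  Returns the number of entries dropped.
 */
static size_t
switch_drive(struct slot *c, size_t n, int prev_unit, int next_unit)
{
	size_t i, dropped = 0;

	assert(c != NULL);
	if (prev_unit == next_unit)
		return (0);
	for (i = 0; i < n; i++)
		if (is_floppy(c[i].unit)) {
			c[i].unit = -1;
			dropped++;
		}
	return (dropped);
}

/* Look up a block; the drive number is part of the key. */
static bool
lookup(const struct slot *c, size_t n, int unit, long blkno)
{
	size_t i;

	assert(c != NULL);
	for (i = 0; i < n; i++)
		if (c[i].unit == unit && c[i].blkno == blkno)
			return (true);
	return (false);
}
```

The same is_floppy() test could also gate whether the partition table is
re-read on every open.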

-- 
John Baldwin


