Date:      Wed, 26 Jul 1995 01:50:01 -0700
From:      David Greenman <davidg@Root.COM>
To:        Matt Dillon <dillon@blob.best.net>
Cc:        Doug Rabson <dfr@render.com>, bugs@freebsd.org
Subject:   Re: brelse() panic in nfs_read()/nfs_bioread() 
Message-ID:  <199507260850.BAA27035@corbin.Root.COM>
In-Reply-To: Your message of "Wed, 26 Jul 95 00:57:03 PDT." <199507260757.AAA13857@blob.best.net> 

>   Dima and I will bring BEST's system uptodate tonight.
>
>   We have been having some rather severe (about once a day)
>   crashes on our second shell machine that are completely
>   different from the crashes we see on other machines.
>
>   This second shell machine is distinguished from the others
>   in that it mounts users' home directories via NFS, so there
>   is a great deal more NFS client activity.
>
>   Unfortunately, the crash locks things up.. it can partially
>   synch the disks but it can't dump core.  The only message I
>   get is the panic message on the console:
>
>	panic biodone: page busy < 0
>	off: 180224, foff: 180224, valid: 0xFF, dirty:0 mapped:0
>	resid: 4096, index: 0, iosize: 8192, lblkno: 22
>
>    I believe the failure is related to NFS.  The question is,
>    is this a new bug or do any of the recent patches have a
>    chance at fixing it?  Hard question considering the lack
>    of information.

   We've been working on this problem for the past week or so and believe it
is fixed in 2.2-current and 2.1-stable. Please update your sources and let us
know if the problem persists.

>    I have been noticing some pretty major cascade failures in the scheduling
>    algorithm.  Basically it is impossible to use nice() values to give one
>    process a reasonable priority over another.
...
>    The solution is that I've pretty much redone the scheduling core... about
...
>    We are going to install these scheduling changes tonight as well and I
>    will tell you on friday how well they worked.  If they work well, I'd
>    like to submit them for review.

   We've messed with the scheduling algorithm quite a bit since the original
one in 4.4BSD, and I think we have made substantial improvements. Our main
concern was that compute-bound processes must execute in a lower priority queue
and there needs to be some form of backward inheritance of CPU consumption/
priority. Without this, people doing compiles (or other compute-intensive
things) will quickly bring the system to its knees. In the old model, CPU
priorities were evaluated once per second. This is fine for slow computers that
take a couple of minutes to compile your average C file, but on fast machines
that can do it in 1-2 seconds, we found that the compile job was always in the
foreground - making the system appear very sluggish to interactive users. I'd
like to hear more about how your algorithm works in real-world situations and
especially how it functions across the spectrum of system loading.
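
For readers unfamiliar with the "old model" mentioned above, here is a minimal
sketch of the once-per-second priority evaluation as it is usually described
for 4.4BSD. It is an illustration only, not code from the FreeBSD tree and not
the modified algorithm discussed in this thread. The function names
decay_estcpu() and user_priority() are hypothetical, and the constants and
clamping are assumptions based on the textbook description of the scheduler.

```c
/* Assumed constants, following the usual description of 4.4BSD. */
#define PUSER      50   /* base priority for user-mode processes */
#define MAXPRI     127  /* numerically largest (worst) priority */
#define ESTCPU_MAX 255  /* assumed cap on the CPU usage estimate */

/*
 * Once-per-second decay of the recent-CPU estimate (hypothetical helper):
 *   estcpu = (2 * load) / (2 * load + 1) * estcpu + nice
 * Between decays, estcpu grows with every clock tick the process runs.
 */
int decay_estcpu(int estcpu, int nice, int load)
{
    int decayed = (2 * load * estcpu) / (2 * load + 1) + nice;

    if (decayed < 0)
        decayed = 0;
    if (decayed > ESTCPU_MAX)
        decayed = ESTCPU_MAX;
    return decayed;
}

/*
 * User-mode priority derived from the estimate (hypothetical helper):
 *   priority = PUSER + estcpu / 4 + 2 * nice
 * A larger number means a lower scheduling priority.
 */
int user_priority(int estcpu, int nice)
{
    int pri = PUSER + estcpu / 4 + 2 * nice;

    if (pri < PUSER)
        pri = PUSER;
    if (pri > MAXPRI)
        pri = MAXPRI;
    return pri;
}
```

Under these assumptions, a process with an estimate of 100 and nice 0 has
priority 75; after one decay at a load average of 1 the estimate falls to 66
and the priority improves to 66. Because the decay only runs once per second,
a compile that finishes in 1-2 seconds sees at most a couple of such
adjustments, which is consistent with the sluggishness described above.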

-DG


