Date: Wed, 26 Jul 1995 01:50:01 -0700
From: David Greenman <davidg@Root.COM>
To: Matt Dillon <dillon@blob.best.net>
Cc: Doug Rabson <dfr@render.com>, bugs@freebsd.org
Subject: Re: brelse() panic in nfs_read()/nfs_bioread()
Message-ID: <199507260850.BAA27035@corbin.Root.COM>
In-Reply-To: Your message of "Wed, 26 Jul 95 00:57:03 PDT." <199507260757.AAA13857@blob.best.net>

> Dima and I will bring BEST's system uptodate tonight.
>
> We have been having some rather severe (about once a day)
> crashes on our second shell machine that are completely
> different from the crashes we see on other machines.
>
> This second shell machine is distinguished from the others
> in that it mounts user's home directories via NFS, so there
> is a great deal more NFS client activity.
>
> Unfortunately, the crash locks things up.. it can partially
> synch the disks but it can't dump core. The only message I
> get is the panic message on the console:
>
> panic biodone: page busy < 0
> off: 180224, foff: 180224, valid: 0xFF, dirty:0 mapped:0
> resid: 4096, index: 0, iosize: 8192, lblkno: 22
>
> I believe the failure is related to NFS. The question is,
> is this a new bug or do any of the recent patches have a
> chance at fixing it?

Hard question considering the lack of information. We've been working on
this problem for the past week or so and believe it is fixed in 2.2-current
and 2.1-stable. Please update your sources and let us know if the problem
persists.

> I have been noticing some pretty major cascade failures in the scheduling
> algorithm. Basically it is impossible to use nice() values to give one
> process a reasonable priority over another.
...
> The solution is that I've pretty much redone the scheduling core... about
...
> We are going to install these scheduling changes tonight as well and I
> will tell you on friday how well they worked. If they work well, I'd
> like to submit them for review.

We've messed with the scheduling algorithm quite a bit since the original
one in 4.4BSD, and I think we have made substantial improvements. Our main
concern was that compute-bound processes must execute in a lower priority
queue and there needs to be some form of backward inheritance of CPU
consumption/priority. Without this, people doing compiles (or other
compute-intensive things) will quickly bring the system to its knees.

In the old model, CPU priorities were evaluated once per second. This is
fine for slow computers that take a couple of minutes to compile your
average C file, but on fast machines that can do it in 1-2 seconds, we found
that the compile job was always in the foreground - making the system appear
very sluggish to interactive users.

I'd like to hear more about how your algorithm works in real-world
situations and especially how it functions across the spectrum of system
loading.

-DG
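
The once-per-second problem described in the message can be illustrated with a rough sketch. This is not the FreeBSD code under discussion. It is a simplified model of the classic 4.4BSD-style priority formula as it is usually described, and the constants (PUSER = 50, a cap of 127, a 100 Hz clock), the function names, and the idea of modelling estcpu as "ticks consumed so far" are all assumptions made for illustration.

```python
# Simplified, assumed model of 4.4BSD-style user priority.
# Higher numbers mean WORSE priority. All names and constants are illustrative.

PUSER = 50      # assumed base user-mode priority
MAXPRI = 127    # assumed worst (numerically largest) priority


def decay_estcpu(estcpu, load, nice=0):
    """Once-per-second decay of the CPU usage estimate.

    Uses the commonly described filter (2*load)/(2*load + 1).
    """
    decay = (2.0 * load) / (2.0 * load + 1.0)
    return decay * estcpu + nice


def user_priority(estcpu, nice=0):
    """Map a CPU usage estimate and nice value to a clamped priority."""
    pri = PUSER + estcpu / 4.0 + 2 * nice
    return max(PUSER, min(MAXPRI, pri))


def priority_seen_by_scheduler(ticks_run, eval_interval_ticks):
    """Priority last computed for a CPU-bound job after ticks_run ticks.

    The priority is only refreshed every eval_interval_ticks, so any
    usage since the last refresh is invisible to the scheduler.
    Decay is ignored here to keep the comparison simple.
    """
    evaluated = (ticks_run // eval_interval_ticks) * eval_interval_ticks
    return user_priority(evaluated)


if __name__ == "__main__":
    hz = 100                      # assumed clock rate
    ticks = 90                    # a 0.9 s compile on a fast machine
    print(priority_seen_by_scheduler(ticks, hz))  # refreshed once per second
    print(priority_seen_by_scheduler(ticks, 4))   # refreshed every 4 ticks
```

Under these assumptions, a compile that finishes in under a second is never penalized when priorities are refreshed only once per second, so it competes with interactive processes at the base priority for its whole lifetime. Refreshing more often pushes it to a worse priority while it is still running.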