From owner-freebsd-current Mon Sep 9 19:55:25 1996
Return-Path: owner-current
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id TAA08536
          for current-outgoing; Mon, 9 Sep 1996 19:55:25 -0700 (PDT)
Received: from Kitten.mcs.com (Kitten.mcs.com [192.160.127.90])
          by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id TAA08529
          for ; Mon, 9 Sep 1996 19:55:21 -0700 (PDT)
Received: from mailbox.mcs.com (Mailbox.mcs.com [192.160.127.87])
          by Kitten.mcs.com (8.7.5/8.7.5) with SMTP id VAA20836;
          Mon, 9 Sep 1996 21:55:12 -0500 (CDT)
Received: by mailbox.mcs.com (/\==/\ Smail3.1.28.1 #28.5)
          id ; Mon, 9 Sep 96 21:55 CDT
Received: (from karl@localhost)
          by Jupiter.mcs.net (8.7.5/8.7.5) id VAA22327;
          Mon, 9 Sep 1996 21:55:09 -0500 (CDT)
From: Karl Denninger
Message-Id: <199609100255.VAA22327@Jupiter.mcs.net>
Subject: Re: Grrr. NFS to a Sun (Slowaris 5.5.1)
To: jkh@time.cdrom.com (Jordan K. Hubbard)
Date: Mon, 9 Sep 1996 21:55:09 -0500 (CDT)
Cc: karl@Mcs.Net, henrich@crh.cl.msu.edu, freebsd-current@FreeBSD.ORG
In-Reply-To: <2019.842323099@time.cdrom.com> from "Jordan K. Hubbard" at Sep 9, 96 07:38:19 pm
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Sender: owner-current@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> > > You'd be right in that impression - NFS clientry is now one of the
> > > easiest way to crash yourself in -current.
> >
> > Uh, why is it that my news server, which NFS serves the spool to some 30
> > clients, doesn't blow up? :-)
>
> Well, for one thing it's not the server I'm worried about. NFS
> *service* seems to work quite well, it's just the 2.2 clients which
> worry me.
>
> Simple test:
>
> 22box# cd /usr/src
> 22box# make world
> 22box# mount foo:/some/big/disk /mnt
> 22box# cd /usr/src/release
> 22box# make release CHROOTDIR=/mnt/release BUILDNAME=2.2-BLOW_ME_UP
>
> If it actually gets all the way through this, please, send me mail.
> I'd be interested to know.
>
> Also, to be fair, this doesn't *exactly* match my test environment
> as the NFS server is being mounted via AMD, but I strongly doubt
> that has anything to do with it.
>
> Jordan

Uh, see this?

 9:42PM up 9 days, 21:52, 2 users, load averages: 0.04, 0.03, 0.00

Guess what OS load this is running?

FreeBSD Jupiter.mcs.net 2.2-CURRENT FreeBSD 2.2-CURRENT #0: Sat Jul 27 18:09:25 CDT 1996 karl@Codebase.mcs.net:/usr/src/sys/compile/MCS_STANDARD i386

And, I might note, the last time it went down was *MY* fault; I knocked
the power cord out of the socket. :-) Prior to that it had been up for
about 2 weeks. No problems noted. This system is a general user machine
here and is quite stable (at least so far it is).

$ ruptime | grep News
News1 up 29+11:37, 0 users, load 0.68, 0.75, 0.79

That machine is our primary news feed system (4 X Quantum Atlas Fast/Wide
drives, 2 AHA2940s to split the I/O channel load, 2 SMC 100TX net cards,
128MB RAM, Pentium PRO 200 -- your standard fire-breathing monster). It
has been completely stable for 29 days with no sign of it changing.

That's the system which does the exporting of all that news data. Running
2.2-CURRENT from the same build as above. The disk I/O load on this thing
is very heavy; sustained I/O rates over 10mbps for periods of many minutes
are not uncommon. No problems at all.

I'll note that the BSDI 2.x load that this replaced couldn't keep its
noggin' together for more than 48 hours. That we now have 29 days and
counting on this code load says a lot -- of good things.

I also have another 2.2 load running an NNTP user server (nnrpd) which
*does* occasionally fail with all processes wedged in a disk wait, but I'm
not at all certain that is really what's going on -- this is one using the
"shared active" patches to nnrpd, and nnrpd is known to core fault with
the semaphore locked -- which, if it blocks during the IPC operations,
will look a lot like a disk I/O block problem...... and the symptoms DO
match an IPC lock-up.

That NNTP machine doesn't do disk *writes* over NFS. But Jupiter certainly
does, as does another 2.2 machine running part of our authentication
system (when we rebuild it the source and object directories are mounted).

As soon as I can arrange for a spare 4G of space on our NFS farm I'll try
your test (probably after this weekend).

--
--
Karl Denninger (karl@MCS.Net)| MCSNet - The Finest Internet Connectivity
http://www.mcs.net/~karl     | T1 from $600 monthly; speeds to DS-3 available
                             | 23 Chicagoland Prefixes, 13 ISDN, much more
Voice: [+1 312 803-MCS1 x219]| Email to "info@mcs.net" WWW: http://www.mcs.net/
Fax:   [+1 312 248-9865]     | Home of Chicago's only FULL Clarinet feed!