From: Robert Watson
Date: Mon, 12 Jan 2004 15:39:55 -0500 (EST)
To: Mikhail Teterin
cc: current@FreeBSD.org
In-Reply-To: <200401121501.i0CF1eMC047055@aldan.algebra.com>
Subject: Re: core-dumping over NFS

On Mon, 12 Jan 2004, Mikhail Teterin wrote:

> I've observed the following bad behaviour of -current mostly related to
> dumping core of a buggy program over the NFS.

Sounds unfortunate. A quick starting question: Does the behavior change
at all if the core file does or doesn't already exist? Another question:
this is a FreeBSD binary, or is it emulated? Could you send the output
of running 'file' on the binary?

> . 5.2-CURRENT (Dec 14) client, Solaris-8 server:
>       created core file is empty (zero sized).

Not sure how much RPC fun you want to have, but if you could do a
tcpdump of the RPC exchange here, would be very helpful. Run ethereal on
the result, and look for creation/lookup of the file. It would be
interesting to see if one of the RPCs is failing. In particular, I
notice that the coredump code calls VOP_SETATTR() to truncate the file
without checking the return value.

> . 5.2-CURRENT (Dec 14) server, RedHat-9 client:
>       core is created properly, but sometimes the server goes
>       into a frenzy with the sys-component (bufdaemon) taking
>       up the entire 100% of the CPU-time (P4 at 2GHz); it only
>       writes @4Mb/s (~14% of the disk's bandwidth) and the
>       only cure is to restart the /etc/rc.d/nfsd; trying to,
>       for example, switch from X11 to a textual console, when
>       this is happening reliably hangs the machine.

Er. Ouch. Can you confirm if there's an on-going series of RPCs from the
client driving the I/O, or if it's just things going nuts on the server?
Also, what block size is the RedHat client using by default? Could you
set up a serial console -- if so, do you get any interesting messages?

> . 5.2-CURRENT (Dec 14) server, 5.2-RC2 (Jan 10) client:
>       dumps happen normally with rw-mounts, but mounting the
>       FS read-only (so as to prevent core-dumps) leads to a
>       panic on the client...
>
> The mounts are regular and default (v3?), except for the ``intr'' flag.
> No rpc.lock or anything...

Could you provide the panic message and stack trace?

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research