Date: Sun, 26 Oct 2003 01:44:58 -0400 (EDT)
From: Robert Watson <rwatson@freebsd.org>
To: Bruce Evans <bde@zeta.org.au>
Cc: cvs-all@freebsd.org
Subject: Re: cvs commit: src/sys/kern kern_sig.c
Message-ID: <Pine.NEB.3.96L.1031026014144.74063M-100000@fledge.watson.org>
In-Reply-To: <20031026145418.G16944@gamplex.bde.org>

On Sun, 26 Oct 2003, Bruce Evans wrote:

> On Sat, 25 Oct 2003, Alfred Perlstein top posted:
>
> > This is bad, it's time to add a flag to the vnode to do this
> > properly instead of relying upon the underlying FS to implement
> > the locking.
>
> How would a mere flag help fix the real complexities for nfs?

Well, the point of the locking originally introduced in the core dump code
was presumably to help avoid a common case scenario: corrupted core dumps
due to parallel dumping. Given the difficulty in addressing the problem in
any thorough way (distributed locking, etc), I think I'd almost rather go
for the simplest possible mechanism. Setting a flag on the vnode for the
duration of a core dump to it, and aborting any other core dump attempts
while it's set, presents a pretty clean solution to the single-host case.

> > * Robert Watson <rwatson@FreeBSD.org> [031025 09:14] wrote:
> > > rwatson 2003/10/25 09:14:09 PDT
> > >
> > > FreeBSD src repository
> > >
> > > Modified files:
> > > sys/kern kern_sig.c
> > > Log:
> > > When generate a core dump, use advisory locking in an advisory way:
> > > if we do acquire an advisory lock, great! We'll release it later.
> > > However, if we fail to acquire a lock, we perform the coredump
> > > anyway.
>
> Er, advisory locking means that honoring the lock is not enforced, not
> that it is good to not honor it.

The comment was a bit flippant and inaccurate: if it's possible for the
locking request to succeed, we wait for it to succeed. However, if we get
a fatal error (rather than blocking), then we plow on ahead. By "advisory
locking", I mean that we're using the advisory locking facility. By
"advisory way", I mean "if it works, use it, and if it's not available,
don't".

> > > This problem became particularly visible with NFS after
> > > the introduction of rpc.lockd: if the lock manager isn't running,
> > > then locking calls will fail, aborting the core dump (resulting in
> > > a zero-byte dump file).
> > >
> > > Reported by: Yogeshwar Shenoy <ynshenoy@alumni.cs.ucsb.edu>
>
> There is only a problem if the lock manager is supposed to be running
> but is not. That is a configuration error, or perhaps a transient
> error, so it should not be "fixed" by ignoring the failure. If ignoring
> nfs locks is what is wanted in all cases, then it should be configured
> by mounting the file system with -L (= nolockd). Maybe the lock request
> should hang for transient failures.
>
> Support for correct configuration of this is still mostly nonexistent in
> /etc/defaults/rc.conf and rc.conf(5). The default for nfs mounts is
> lockd, but the default for rpc_lockd_enable is "NO". Setting
> rpc_lockd_enable to "YES" is not sufficient to configure this. The
> setting of at least rc_statd_enable must also be changed.
>
> This stuff is misconfigured on all of the freebsd machines that I
> checked. Some run 4.9, so nfs locking is not available. beast and
> builder demonstrate the bug by giving empty core dumps. bento avoids
> the bug by dumping cores in a non-nfs directory.

Agreed. The current condition of NFS locking is pretty pessimal: we still
have substantial bugs in the implementation of the lock manager, and
configuring locking correctly is difficult. The default configuration is
particularly poor. We should address most of these. :-)

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories
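
The "if it works, use it, and if it's not available, don't" behaviour
described in the message can be illustrated outside the kernel. What
follows is only a user-space sketch under stated assumptions, not the
kern_sig.c change itself: it uses flock(2) rather than the kernel's vnode
locking interface, and the helper name write_dump_best_effort() and its
parameters are hypothetical. The lock request blocks while another holder
exists; only a hard failure makes the write go ahead unlocked.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from kern_sig.c): write len bytes from buf to
 * path, holding an exclusive flock(2) lock if one can be obtained.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 * *locked_out is set to 1 if the lock was held, 0 if we went ahead
 * without it.  It is left untouched when open(2) fails.
 */
static int
write_dump_best_effort(const char *path, const void *buf, size_t len,
    int *locked_out)
{
	const char *p = buf;
	int fd, locked, rv = 0;

	assert(path != NULL && locked_out != NULL);

	/* No O_TRUNC here: do not clobber a dump someone else is writing. */
	fd = open(path, O_WRONLY | O_CREAT, 0600);
	if (fd == -1)
		return (-1);

	/*
	 * Blocks until the lock is free.  A return of -1 means locking is
	 * unavailable (or the wait was interrupted); carry on regardless.
	 */
	locked = (flock(fd, LOCK_EX) == 0);

	if (ftruncate(fd, 0) == -1)
		rv = -1;

	while (rv == 0 && len > 0) {
		ssize_t n = write(fd, p, len);

		if (n <= 0) {		/* sketch: no EINTR retry */
			rv = -1;
			break;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Release the lock only if we actually acquired it. */
	if (locked)
		(void)flock(fd, LOCK_UN);
	if (close(fd) == -1)
		rv = -1;

	*locked_out = locked;
	return (rv);
}
```

A caller would pass the dump path and buffer and could log the value of
*locked_out to record whether the dump was written under a lock. Note that
this sketch shares the limitation raised in the thread: two writers that
both fail to lock can still corrupt each other's output.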