From: Jan Conrad
Date: Thu, 15 Jul 1999 14:10:52 +0200 (CEST)
To: freebsd-hackers@freebsd.org
Cc: Sheldon Hearn, Jan Conrad
Subject: NFS problems due to getcwd/realpath

Hi everybody,

after wondering for two years why FreeBSD (2.2.x ... 3.2) might lock up when an NFS server is down, I think I have found one reason for it (see kern/12609; I now know it doesn't belong in kern, sorry).

It is the implementation of getcwd (src/lib/libc/gen/getcwd.c). When examining the parent directory of a mounted filesystem, getcwd lstats every directory entry that comes before the mountpoint in order to find the mountpoint's name (although it would only need the inode's device to do a rough check). If one of those earlier entries is another NFS mountpoint whose server is down, getcwd will wait until that mountpoint is up again. This of course applies to all routines which use getcwd, e.g. realpath.

This is especially funny since mountd calls realpath (from the RPC handler!) to check mount points, so when two machines mount directories from each other, they can lock up, e.g. at boot time (see kern/12609).

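To make the scan described above concrete, here is a minimal sketch in C. It is not the libc source: find_name_by_lstat is a hypothetical helper name, and the fixed 4096-byte path buffer is an assumption made for brevity. It only shows the general technique of naming a directory by calling lstat on each entry of its parent and comparing the (device, inode) pair.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not taken from getcwd.c: search `parent` for the
 * entry whose (device, inode) pair equals dev/ino and copy its name
 * into `out`. Returns 0 on success, -1 if nothing matches or on error.
 */
int
find_name_by_lstat(const char *parent, dev_t dev, ino_t ino,
    char *out, size_t outlen)
{
    DIR *dir = opendir(parent);
    struct dirent *dp;
    struct stat st;
    char path[4096];    /* assumed large enough for this sketch */
    int n, found = -1;

    if (dir == NULL)
        return (-1);
    while (found != 0 && (dp = readdir(dir)) != NULL) {
        if (strcmp(dp->d_name, ".") == 0 ||
            strcmp(dp->d_name, "..") == 0)
            continue;
        n = snprintf(path, sizeof(path), "%s/%s", parent, dp->d_name);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue;   /* name too long for the buffer, skip it */
        /*
         * The critical call: if `path` is itself a mountpoint of an
         * unreachable NFS server, this lstat may block until the
         * server answers, even though the entry is not the one
         * being looked for.
         */
        if (lstat(path, &st) != 0)
            continue;
        if (st.st_dev == dev && st.st_ino == ino &&
            strlen(dp->d_name) < outlen) {
            strcpy(out, dp->d_name);
            found = 0;
        }
    }
    closedir(dir);
    return (found);
}
```

A caller would lstat the directory it wants to name, then pass the resulting st_dev and st_ino together with the parent's path. Every entry visited before the match costs one lstat, which is where an unrelated dead mountpoint can stall the whole lookup.
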
I don't fully understand whether the problem is still present in 3.x, since getcwd may call __getcwd to do the job, but as I understand from the sources, __getcwd may fail, and then you're back with the problem.

Anyhow, how can this be resolved (other than by symlinking all mountpoints)? Must getcwd really do an lstat to find out an inode's device? Is there no other syscall to do that? (I mean: this information must be present somewhere, without going over the net, right?) Unfortunately I don't know of such a syscall.

In my opinion getcwd should be implemented differently, but maybe some people have a different opinion on that (and I am not sure how to do it properly).

Any suggestions?

best regards
Jan

--
Physikalisches Institut der Universitaet Bonn
Nussallee 12
D-53115 Bonn
GERMANY