Date: Wed, 26 Dec 2007 16:46:48 -0800 (PST)
From: Raja Sivaramakrishnan <srajag00@yahoo.com>
To: freebsd-fs@freebsd.org
Subject: namei lookup vnode locking
Message-ID: <701759.46468.qm@web90511.mail.mud.yahoo.com>

Hello,

I encountered an issue with FreeBSD 6.1 and would appreciate some feedback on it. The problem occurs when a Perl script running on a client system telnets into the FreeBSD box, exits from the login shell, and then immediately exits itself. After the script exits, there is a deadlock on the FreeBSD box that prevents new processes (such as ps and top) from starting.

Upon investigation, this seems to be caused by the following sequence of events on the FreeBSD system:

1) The login process exits. The exit call in the kernel closes all file descriptors. One of these is the fd for /dev/ttyp0, used for the telnet session. login locks the vnode for /dev/ttyp0 and waits up to 5 minutes for the tty to drain (the ttywait() call).

2) The tty is supposed to be drained by telnetd. However, telnetd sees the network connection go down when the Perl script exits. As a result, it jumps to its cleanup code, where it tries to chmod /dev/ttyp0. The chmod syscall attempts to lock the /dev/ttyp0 vnode but cannot, because the lock is held by login, so telnetd is put to sleep. Meanwhile, telnetd holds the lock on the vnode for /dev; it appears that this lock was acquired during the namei lookup for /dev/ttyp0. The resulting state is that the tty has pending output that must be read by telnetd, but telnetd cannot read it because it is sleeping on the /dev/ttyp0 lock, and it holds the /dev vnode lock while it sleeps.

3) As a result, any process that needs the /dev vnode lock is put to sleep for 5 minutes (the default ttywait timeout). Even a process that wants to open an unrelated device file, say /dev/foo, cannot do so because the /dev lock is held by telnetd.

A few questions:

1) Does the namei lookup need to acquire an exclusive lock on intermediate vnodes when looking up a pathname? That is, if telnetd is looking up /dev/ttyp0, does it need an exclusive lock on /dev? Could it be a shared lock, which would at least allow other readers to make progress?

2) Besides relaxing the locking as described above, are there any other thoughts on how to fix this? Reducing the tty timeout in the close routine is another option, but that only limits the duration of the deadlock.

Thanks,
Raja
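
To make question 1 concrete, here is a minimal Python sketch of the blocking chain described above. It is a toy model only and does not reflect FreeBSD's actual lock implementation. The names `ttyp0_vnode`, `dev_vnode`, `scenario`, and `blocked_processes` are hypothetical labels invented for this illustration, and whether namei can safely take a shared lock on /dev is exactly the open question, not something the sketch answers.

```python
# Toy model of the lock contention described in the message.
# All names here are hypothetical labels, not FreeBSD kernel identifiers.

EXCLUSIVE = "exclusive"
SHARED = "shared"


def compatible(held_mode, wanted_mode):
    """Two lock requests can coexist only if both are shared."""
    return held_mode == SHARED and wanted_mode == SHARED


def blocked_processes(holders, requests):
    """Return the set of processes whose lock request cannot be granted.

    holders:  resource -> (owner, mode) for locks currently held
    requests: list of (process, resource, mode) for pending requests
    """
    blocked = set()
    for proc, resource, mode in requests:
        held = holders.get(resource)
        if held is None:
            continue  # nobody holds the resource, request is granted
        owner, held_mode = held
        if owner != proc and not compatible(held_mode, mode):
            blocked.add(proc)
    return blocked


def scenario(dev_lock_mode):
    """Model the state after login exits and telnetd enters cleanup.

    dev_lock_mode is the mode assumed for lookups on the /dev vnode.
    """
    holders = {
        # login holds the tty vnode lock while waiting for the tty to drain.
        "ttyp0_vnode": ("login", EXCLUSIVE),
        # telnetd took the /dev lock during the lookup of /dev/ttyp0.
        "dev_vnode": ("telnetd", dev_lock_mode),
    }
    requests = [
        # telnetd's chmod needs the tty vnode, which login still holds.
        ("telnetd", "ttyp0_vnode", EXCLUSIVE),
        # An unrelated process (ps) looks up some other name under /dev.
        ("ps", "dev_vnode", dev_lock_mode),
    ]
    return blocked_processes(holders, requests)


if __name__ == "__main__":
    print("exclusive /dev lock:", sorted(scenario(EXCLUSIVE)))
    print("shared /dev lock:   ", sorted(scenario(SHARED)))
```

Under these assumptions, an exclusive lock on /dev leaves both telnetd and ps blocked, while a shared lock leaves only telnetd blocked. In the shared case telnetd still sleeps until the tty wait in login times out, so this would only contain the stall, not remove it.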