Date: Thu, 24 Apr 2003 21:20:56 -0700 (PDT)
From: Don Lewis <truckman@FreeBSD.org>
To: gordont@gnf.org
Cc: current@FreeBSD.org
Subject: Re: LOR in NFS server
Message-ID: <200304250421.h3P4KuXB033816@gw.catspoiler.org>
In-Reply-To: <20030424212641.GU9682@roark.gnf.org>
On 24 Apr, Gordon Tetlow wrote:
> I generated it while running nessus against my local machine.
>
> lock order reversal
>  1st 0xc9384c44 inp (inp) @ /local/usr.src/sys/netinet/tcp_input.c:649
>  2nd 0xc05aa84c tcp (tcp) @ /local/usr.src/sys/netinet/tcp_usrreq.c:621
> Stack backtrace:
> backtrace(c04e9f03,c05aa84c,c04f0770,c04f0770,c04f1ae4) at backtrace+0x17
> witness_lock(c05aa84c,8,c04f1ae4,26d,0) at witness_lock+0x692
> _mtx_lock_flags(c05aa84c,0,c04f1ae4,26d,0) at _mtx_lock_flags+0xb2
> tcp_usr_rcvd(c8a63800,80,c04ea514,df0e9a9c,3b9aca00) at tcp_usr_rcvd+0x30
> soreceive(c8a63800,df0e9ad8,df0e9ae4,df0e9adc,0) at soreceive+0x86a
> nfsrv_rcv(c8a63800,c6d4fb00,4,34,10430) at nfsrv_rcv+0x8a
> sowakeup(c8a63800,c8a6384c,c04f11d5,434,108) at sowakeup+0x97
> tcp_input(c21f5400,14,c0304f91,df0e9c5c,c02f60ba) at tcp_input+0x1341
> ip_input(c21f5400,0,c04efede,e9,c21bd280) at ip_input+0x7b0
> swi_net(0,0,c04e4eed,217,c21c73c0) at swi_net+0x111
> ithread_loop(c21c6100,df0e9d48,c04e4d5d,314,c21c8d10) at ithread_loop+0x16c
> fork_exit(c02ec2d0,c21c6100,df0e9d48) at fork_exit+0xc0
> fork_trampoline() at fork_trampoline+0x1a
> --- trap 0x1, eip = 0, esp = 0xdf0e9d7c, ebp = 0 ---

Hmm ... does NFS over TCP even work with a -current box as the server?

It looks like tcp_input() has grabbed the locks in tcbinfo and inp, and
then tcp_usr_rcvd() attempts to grab the same locks.

I can think of three possible ways of fixing this problem.

  1) Drop the locks in tcp_input() before calling sorwakeup() and grab
     them again if necessary.  One has to be careful not to break
     anything by doing this.  This also adds overhead for non-NFS
     traffic.

  2) Never call soreceive() from nfsrv_rcv(); always wake nfsd instead.
     This has the advantage of minimizing the amount of time that the
     locks are held, but it increases overhead under lightly loaded
     conditions.

  3) Somehow tell tcp_usr_rcvd() not to attempt to grab the locks in
     this specific case.
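To make the shape of the problem concrete, here is a small userland sketch of options 1 and 3. It is not kernel code and is not taken from the FreeBSD tree: every name in it (toy_lock, toy_conn, toy_rcvd, toy_input, the TOY_* strategies) is hypothetical. A "lock" is only a held flag, and a failed acquire stands in for the point where a real non-recursive mutex would deadlock or trip witness. Option 2 (always waking nfsd) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: it only records whether it is held. */
struct toy_lock { bool held; };

/* Returns 0 on success, -1 if the lock is already held. */
int toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        return -1;
    l->held = true;
    return 0;
}

void toy_lock_release(struct toy_lock *l)
{
    l->held = false;
}

/* Hypothetical stand-ins for the tcbinfo lock and the per-connection inp lock. */
struct toy_conn {
    struct toy_lock info_lock;
    struct toy_lock inp_lock;
};

/*
 * Stand-in for the receive-side protocol call. It takes both locks itself
 * unless the caller states that they are already held (option 3).
 */
int toy_rcvd(struct toy_conn *c, bool locks_held)
{
    if (!locks_held) {
        if (toy_lock_acquire(&c->info_lock) != 0)
            return -1;
        if (toy_lock_acquire(&c->inp_lock) != 0) {
            toy_lock_release(&c->info_lock);
            return -1;
        }
    }
    /* ... protocol work would happen here ... */
    if (!locks_held) {
        toy_lock_release(&c->inp_lock);
        toy_lock_release(&c->info_lock);
    }
    return 0;
}

enum toy_strategy { TOY_NAIVE, TOY_DROP_LOCKS, TOY_PASS_FLAG };

/* Stand-in for the input path: takes both locks, then makes the upcall. */
int toy_input(struct toy_conn *c, enum toy_strategy s)
{
    int rc;

    if (toy_lock_acquire(&c->info_lock) != 0)
        return -1;
    if (toy_lock_acquire(&c->inp_lock) != 0) {
        toy_lock_release(&c->info_lock);
        return -1;
    }

    switch (s) {
    case TOY_DROP_LOCKS:
        /* Option 1: drop both locks around the upcall, then retake them. */
        toy_lock_release(&c->inp_lock);
        toy_lock_release(&c->info_lock);
        rc = toy_rcvd(c, false);
        if (toy_lock_acquire(&c->info_lock) != 0)
            return -1;
        if (toy_lock_acquire(&c->inp_lock) != 0) {
            toy_lock_release(&c->info_lock);
            return -1;
        }
        break;
    case TOY_PASS_FLAG:
        /* Option 3: tell the callee the locks are already held. */
        rc = toy_rcvd(c, true);
        break;
    default:
        /* Current behaviour: the callee tries to take locks we hold. */
        rc = toy_rcvd(c, false);
        break;
    }

    toy_lock_release(&c->inp_lock);
    toy_lock_release(&c->info_lock);
    return rc;
}
```

In this model the naive path fails inside toy_rcvd() because both locks are still held by toy_input(), while the other two strategies complete. What the sketch cannot show is the real cost of option 1: connection state may change while the locks are dropped, which is the "careful not to break anything" caveat above.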