From owner-freebsd-bugs@FreeBSD.ORG Thu Feb 22 06:10:18 2007 Return-Path: X-Original-To: freebsd-bugs@hub.freebsd.org Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 40A1316A401 for ; Thu, 22 Feb 2007 06:10:18 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [69.147.83.40]) by mx1.freebsd.org (Postfix) with ESMTP id 2ECBA13C481 for ; Thu, 22 Feb 2007 06:10:18 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id l1M6AHIn070993 for ; Thu, 22 Feb 2007 06:10:17 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.13.4/8.13.4/Submit) id l1M6AH0q070989; Thu, 22 Feb 2007 06:10:17 GMT (envelope-from gnats) Date: Thu, 22 Feb 2007 06:10:17 GMT Message-Id: <200702220610.l1M6AH0q070989@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org From: Bruce Evans Cc: Subject: Re: kern/109277: kernel ppp(4) botches clist reservation in RELENG_6 X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Bruce Evans List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Feb 2007 06:10:18 -0000 The following reply was made to PR kern/109277; it has been noted by GNATS. From: Bruce Evans To: Dmitry Pryanishnikov Cc: FreeBSD-gnats-submit@FreeBSD.org, freebsd-bugs@FreeBSD.org, rwatson@freebsd.org Subject: Re: kern/109277: kernel ppp(4) botches clist reservation in RELENG_6 Date: Thu, 22 Feb 2007 17:02:23 +1100 (EST) On Thu, 22 Feb 2007, Dmitry Pryanishnikov wrote: > On Mon, 19 Feb 2007, Bruce Evans wrote: >> On Sun, 18 Feb 2007, Dmitry Pryanishnikov wrote: >>> Last such a crash was "panic: clist reservation botch" (see >>> cblock_alloc() >>> function in /sys/kern/tty_subr.c), this was RELENG_6 as of 1-Feb-2007, >>>> Fix: >>> I'm ready to provide further debugging information on this issue. >>> Unfortunately, I'm not familiar enough with the locking concepts >>> in modern FreeBSD kernels (and in tty subsystem particularly) >>> in order to make the fix myself. >> >> Tty locking is especially simple and not very good -- everything must >> be Giant-locked to work. However, the default for network drivers is >> now not to use Giant locking. ppp doesn't seem to be aware of this. > > I was under the wrong impression that the addition of IFF_NEEDSGIANT to > ppp's ifp->if_flags (made 30-Mar-2006 by rwatson@) cured exactly this > problem. But apparently it's not enough: simple addition of the > GIANT_REQUIRED at start of the cblock_alloc() reveals that Giant isn't held > at this point. Ah, I remembered that something was supposed to be fix but couldn't find the details since I only looked in ppp files. >> The only simple fix seems to be to pessimize all network drivers by >> configuring Giant locking for them all -- see netisr.c. I'm not sure >> if this is enough -- it is probably necessary to Giant-lock all calls >> into ppp (especially ioctls), but things in netisr.c only logically >> affect isrs. Hopefully IFF_NEEDSGIANT for some interfaces makes pessimizing all of them unnecessary. But IIRC the support for the MP-unsafe interfaces is supposed to go away soon. > I've tried to solve the problem from the other end. Upon some analysis > of the ppp(4) code I concluded that: a) all calls of the tty functions > are concentrated in the single ppp(4) source file ppp_tty.c, and > b) _almost_ all these calls are protected by the now dummy pair of function > calls "s = spltty(); / splx(s);". So I went ahead and converted those > pairs to the conditional Giant acquisition: I think some protection was also provided by splimp(), but that is in the network-specific parts (if_ppp.c) and is now more likely than the tty-specific parts to be coverted by Giant/IFF_NEEDSGIANT. > --- ppp_tty.c.orig Tue Dec 12 23:56:22 2006 > +++ ppp_tty.c Wed Feb 21 21:39:50 2007 > @@ -86,6 +86,8 @@ > #include > #include > #include > +#include > +#include > > #ifdef PPP_FILTER > #include > @@ -93,6 +95,12 @@ > #include > #include > > +#undef spltty > +#undef splx > +#define spltty() mtx_owned(&Giant) ? 0 : (mtx_lock(&Giant), 1) You could alos change this to GIANT_REQUIRED to find cases where IFF_NEEDSGIANT is helping. > +#define splx(relflg) if (relflg) mtx_unlock(&Giant) > + > + > static int pppopen(struct cdev *dev, struct tty *tp); > static int pppclose(struct tty *tp, int flag); > static int pppread(struct tty *tp, struct uio *uio, int flag); > @@ -556,7 +564,9 @@ > /* XXX as above. */ > if (CCOUNT(&tp->t_outq) == 0) { > ++sc->sc_stats.ppp_obytes; > + s = spltty(); > (void) putc(PPP_FLAG, &tp->t_outq); > + splx(s); > } > > /* Calculate the FCS for the first mbuf's worth. */ > @@ -579,7 +589,9 @@ > n = cp - start; > if (n) { > /* NetBSD (0.9 or later), 4.3-Reno or similar. */ > + s = spltty(); > ndone = n - b_to_q(start, n, &tp->t_outq); > + splx(s); > len -= ndone; > start += ndone; > sc->sc_stats.ppp_obytes += ndone; These above is in pppasyncstart(). Apparently, IFF_NEEDSGIANT doesn't help for it (or ip_output()?). Not having spls in the above at first looks a bug in the old code, but actually, putc() and b_to_q() call spltty() internally so spltty() was only needed here when more than one thing needs to be protected by it. The fine-grained locking here might have been a bad method even with spls (it already gave the bugs documented in the XXX comments). > This actually insures that Giant is held within tty layer (GIANT_REQUIRED > check in cblock_alloc() confirms this), but this causes lock order reversals > like these: > > lock order reversal: (Giant after non-sleepable) > 1st 0xc39904c8 inp (rawinp) @ /usr/RELENG_6/src/sys/netinet/raw_ip.c:289 > 2nd 0xc06488a0 Giant (Giant) @ /usr/RELENG_6/src/sys/net/ppp_tty.c:567 > KDB: stack backtrace: > kdb_backtrace(0,ffffffff,c06556a0,c0657ba8,c0625fe4,...) at 0xc04b2f59 = > kdb_backtrace+0x29 > witness_checkorder(c06488a0,9,c05fff6e,237) at 0xc04bdb00 = > witness_checkorder+0x578 > _mtx_lock_flags(c06488a0,0,c05fff6e,237) at 0xc0493180 = _mtx_lock_flags+0x78 > pppasyncstart(c35ab800,c3692d0c,0,c05ffb7e,3e3) at 0xc050c6e4 = > pppasyncstart+0x90 > pppoutput(c3692c00,c3804700,c39764d0,c3987bdc,0,...) at 0xc0509d7a = > pppoutput+0x326 > ip_output(c3804700,0,dae08b24,20,0,...) at 0xc052a764 = ip_output+0xa64 > rip_output(c3804700,c3a8b000,12e6cc1,40,c3804700,...) at 0xc052cb3b = > rip_output+0x29f > rip_send(c3a8b000,0,c3804700,c3a45470,0,...) at 0xc052d62f = rip_send+0x93 > sosend(c3a8b000,c3a45470,dae08c3c,c3804700,0,0,c6895780) at 0xc04d4ff7 = > sosend+0x5eb > kern_sendit(c6895780,3,dae08cbc,0,0,0) at 0xc04da28c = kern_sendit+0x104 > sendit(c6895780,3,dae08cbc,0,804e0b4,...) at 0xc04da15f = sendit+0x163 > sendto(c6895780,dae08d04) at 0xc04da3bd = sendto+0x4d > syscall(3b,3b,3b,804e074,40,...) at 0xc05ca317 = syscall+0x22f > Xint0x80_syscall() at 0xc05b89af = Xint0x80_syscall+0x1f > --- syscall (133, FreeBSD ELF32, sendto), eip = 0x481401eb, esp = 0xbfbee90c, > ebp = 0xbfbee958 --- > > lock order reversal: (Giant after non-sleepable) > 1st 0xc398457c inp (udpinp) @ /usr/RELENG_6/src/sys/netinet/udp_usrreq.c:804 > 2nd 0xc06488a0 Giant (Giant) @ /usr/RELENG_6/src/sys/net/ppp_tty.c:567 > KDB: stack backtrace: > kdb_backtrace(0,ffffffff,c0657838,c0657ba8,c0625fe4,...) at 0xc04b2f59 = > kdb_backtrace+0x29 > witness_checkorder(c06488a0,9,c05fff6e,237) at 0xc04bdb00 = > witness_checkorder+0x578 > _mtx_lock_flags(c06488a0,0,c05fff6e,237) at 0xc0493180 = _mtx_lock_flags+0x78 > pppasyncstart(c35ab800,c3692d0c,0,c05ffb7e,3e3) at 0xc050c6e4 = > pppasyncstart+0x90 > pppoutput(c3692c00,c3803900,c39764d0,c3987bdc,0,...) at 0xc0509d7a = > pppoutput+0x326 > ip_output(c3803900,0,dae02b00,0,0,...) at 0xc052a764 = ip_output+0xa64 > udp_output(c39844ec,c3803900,c384a330,0,c6895480,...) at 0xc053a5b2 = > udp_output+0x4b6 > udp_send(c3a8b000,0,c3803900,c384a330,0,...) at 0xc053ab0a = udp_send+0x1a > sosend(c3a8b000,c384a330,dae02c3c,c3803900,0,0,c6895480) at 0xc04d4ff7 = > sosend+0x5eb > kern_sendit(c6895480,4,dae02cbc,0,0,0) at 0xc04da28c = kern_sendit+0x104 > sendit(c6895480,4,dae02cbc,0,bfbfe890,...) at 0xc04da15f = sendit+0x163 > sendto(c6895480,dae02d04) at 0xc04da3bd = sendto+0x4d > syscall(3b,3b,3b,bfbfed4c,bfbfe860,...) at 0xc05ca317 = syscall+0x22f > Xint0x80_syscall() at 0xc05b89af = Xint0x80_syscall+0x1f > --- syscall (133, FreeBSD ELF32, sendto), eip = 0x481341eb, esp = 0xbfbfe80c, > ebp = 0xbfbfe838 --- > > So, is it possible to avoid these LORs w/o massive ppp/tty/network code > rewriting? I think Giant needs to be acquired early in the networking code for if_output like it is for if_start. It seems to only be be acquired for if_start and if_ioctl, and not for if_output, if_input, if_watchdog, if_init or if_resolvemulti. ppp doesn't use any of the others or if_start. Slip uses if_start and seems to have the same problem with if_output. Bruce