From owner-freebsd-bugs@FreeBSD.ORG Sun Feb 18 14:40:06 2007
Message-Id: <200702181417.l1IEHsCJ001879@homelynx.homenet>
Date: Sun, 18 Feb 2007 16:17:54 +0200 (EET)
From: Dmitry Pryanishnikov
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: kern/109277: kernel ppp(4) botches clist reservation in RELENG_6
Reply-To: Dmitry Pryanishnikov
List-Id: Bug reports

>Number:         109277
>Category:       kern
>Synopsis:       kernel ppp(4) botches clist reservation in RELENG_6
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Feb 18 14:40:05 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Dmitry Pryanishnikov
>Release:        FreeBSD 6.2-STABLE i386
>Organization:
Atlantis ISP
>Environment:
System: FreeBSD homelynx.homenet 6.2-STABLE FreeBSD 6.2-STABLE #0: Sun Feb 18 05:55:06 EET 2007 root@homelynx.homenet:/usr/obj/usr/RELENG_6/src/sys/lynx i386

Hardware: Intel D845EBG2 mainboard + Pentium(R) 4 CPU 2.80GHz + RAM 512Mb, ECC check+correction enabled. System is rock-stable when NOT using ppp(4).
>Description:
Very rare (maybe once a month) spontaneous crashes occur during active simultaneous use of kernel ppp and the system console. When the console is in X.org mode, the system just silently reboots. OTOH, there is a certain chance of getting a valid crash dump when the system console is in text mode.
The last such crash was "panic: clist reservation botch" (see the cblock_alloc() function in /sys/kern/tty_subr.c). This was RELENG_6 as of 1-Feb-2007, and the backtrace was:

panic(c05f55c8,0,c04cd3ee,20,38,...) at 0xc049a8a4 = panic+0xa8
b_to_q(c37fd6a8,24,c36d6838,c36d6838,0,...) at 0xc04cd60e = b_to_q+0xce
pppasyncstart(c62bfc00,c36cd50c,0,c05f9daf,3e3) at 0xc0508ff4 = pppasyncstart+0x108
pppoutput(c36cd400,c37fd600,c39b7a70,c39debdc,0,...) at 0xc0506a36 = pppoutput+0x326
ip_output(c37fd600,0,d9bc79b8,0,0,c3a7e654) at 0xc0526ab4 = ip_output+0xa64
tcp_output(c3a81cb0) at 0xc052eee5 = tcp_output+0xe05
tcp_input(c37fde00,14,d9bc7b80,0,0,...) at 0xc052d467 = tcp_input+0x28df
ip_input(c37fde00,c37fde74,0,8c,c37fde00,...) at 0xc05248ad = ip_input+0x75d
div_send(c3a826f4,0,c37fde00,c6a27120,0,...) at 0xc079bc1b = div_send+0x17b
sosend(c3a826f4,c6a27120,d9bc7c40,c37fde00,0,0,c382c000) at 0xc04d1fd3 = sosend+0x5eb
kern_sendit(c382c000,3,d9bc7cbc,0,0,0) at 0xc04d71a4 = kern_sendit+0x104
sendit(c382c000,3,d9bc7cbc,0,bfbdebfc,...) at 0xc04d7077 = sendit+0x147
sendto(c382c000,d9bc7d04) at 0xc04d72d5 = sendto+0x4d
syscall(3b,3b,bfbe003b,1,8c,...) at 0xc05c62c7 = syscall+0x22f
Xint0x80_syscall() at 0xc05b495f = Xint0x80_syscall+0x1f

I decided to look through closed PRs and found kern/25632, which describes a similar problem (yes, that was a RELENG_4 kernel vs. USB stack interaction, but the result, a botched clist reservation, was the same). So there is apparently a lack of proper locking around clist operations in kernel ppp in modern (at least RELENG_6) kernels.
>How-To-Repeat:
I've shamelessly stolen the idea of cblock_alloc() recursion detection from kern/25632:

--- tty_subr.c.orig Fri Jan 7 01:35:40 2005
+++ tty_subr.c Sun Feb 18 14:37:29 2007
@@ -94,17 +94,30 @@
  * Remove a cblock from the cfreelist queue and return a pointer
  * to it.
  */
+static int someone_here = 0;
+#define N1MAX 100000
 static __inline struct cblock *
 cblock_alloc()
 {
 	struct cblock *cblockp;
+	int n1;
+	for (n1=0; n1c_next;
 	cblockp->c_next = NULL;
 	cfreecount -= CBSIZE;
+	for (n1=0; n1, arg=0xc36e2660, frame=0xd5633d38) at /usr/RELENG_6/src/sys/kern/kern_fork.c:821
#20 0xc05b551c in fork_trampoline () at /usr/RELENG_6/src/sys/i386/i386/exception.s:208

Looks like ppp(4) enters cblock_alloc(), then gets preempted, then ttyinput() reenters cblock_alloc().
>Fix:
I'm ready to provide further debugging information on this issue. Unfortunately, I'm not familiar enough with the locking concepts in modern FreeBSD kernels (and in the tty subsystem in particular) to make the fix myself.
>Release-Note:
>Audit-Trail:
>Unformatted:
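
The suspected sequence (one caller enters cblock_alloc(), is preempted, and a second caller re-enters it) can be illustrated with a small user-space sketch. This is not the kernel code and not the patch above: the names block_alloc, pool_reset, preempt_hook, intruder and reentry_count are hypothetical, and preemption is simulated by a callback fired between reading the free-list head and updating it. Only the someone_here flag and the c_next/cfreelist names are borrowed from the report.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

struct cblock {
	struct cblock *c_next;
};

static struct cblock pool[POOL_SIZE];
static struct cblock *cfreelist;	/* head of the free list */
static int someone_here;		/* nonzero while an allocation is in progress */
static int reentry_count;		/* overlapping entries detected so far */
static void (*preempt_hook)(void);	/* hypothetical stand-in for preemption */
static struct cblock *intruder_block;	/* block obtained by the re-entering caller */

/* Rebuild the free list and clear all bookkeeping. */
static void
pool_reset(void)
{
	size_t i;

	cfreelist = NULL;
	for (i = 0; i < POOL_SIZE; i++) {
		pool[i].c_next = cfreelist;
		cfreelist = &pool[i];
	}
	someone_here = 0;
	reentry_count = 0;
	preempt_hook = NULL;
	intruder_block = NULL;
}

/* Unlocked allocator with a re-entry detector around the critical section. */
static struct cblock *
block_alloc(void)
{
	struct cblock *bp;

	if (someone_here)
		reentry_count++;	/* a second caller got in before the first left */
	someone_here++;

	bp = cfreelist;			/* read the head ... */
	if (preempt_hook != NULL) {
		void (*hook)(void) = preempt_hook;

		preempt_hook = NULL;	/* fire only once */
		hook();			/* ... simulated preemption happens here ... */
	}
	if (bp != NULL) {
		cfreelist = bp->c_next;	/* ... then update it from a stale pointer */
		bp->c_next = NULL;
	}

	assert(someone_here > 0);
	someone_here--;
	return (bp);
}

/* Plays the role of the second caller that re-enters the allocator. */
static void
intruder(void)
{
	intruder_block = block_alloc();
}
```

Calling pool_reset(), setting preempt_hook = intruder, and then calling block_alloc() once makes both callers receive the same block and leaves cfreelist NULL even though three blocks were never handed out, while reentry_count records the overlap. In this sketch the damage comes purely from the unprotected read-then-update of the list head; serializing that section with a lock is the usual remedy, though which lock is appropriate in the RELENG_6 tty code is not something this sketch settles.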