From owner-freebsd-stable@FreeBSD.ORG Sun Jan 23 16:51:34 2005
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9E92016A4CE;
	Sun, 23 Jan 2005 16:51:34 +0000 (GMT)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2B32543D48;
	Sun, 23 Jan 2005 16:51:34 +0000 (GMT)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (localhost [127.0.0.1])
	by fledge.watson.org (8.13.1/8.13.1) with ESMTP id j0NGpFC7049713;
	Sun, 23 Jan 2005 11:51:15 -0500 (EST)
	(envelope-from robert@fledge.watson.org)
Received: from localhost (robert@localhost)j0NGpFTc049710;
	Sun, 23 Jan 2005 16:51:15 GMT
	(envelope-from robert@fledge.watson.org)
Date: Sun, 23 Jan 2005 16:51:14 +0000 (GMT)
From: Robert Watson
X-Sender: robert@fledge.watson.org
To: stable@FreeBSD.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: imp@FreeBSD.org
Subject: NULL pointer deref in sioopen() suggests a close/open race on sio
	device?
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Sun, 23 Jan 2005 16:51:34 -0000

Ran into the following panic on a 5-STABLE box this morning, which
occurred after hitting Ctrl-D to close a login session on a serial console
(ttyd0 at 9600 bps):

login: Jan 23 10:43:27 fledge login: 2 LOGIN FAILURES ON ttyd0

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x1c
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc051537b
stack pointer           = 0x10:0xe7345988
frame pointer           = 0x10:0xe7345994
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 45092 (getty)
[thread pid 45092 tid 100201 ]
Stopped at      knote+0x27:     cmpxchgl        %ecx,0x1c(%edx)
db> show pcpu
cpuid        = 0
curthread    = 0xc290d190: pid 45092 "getty"
curpcb       = 0xe7345da0
fpcurthread  = 0xc290d190: pid 45092 "getty"
idlethread   = 0xc22644b0: pid 11 "idle"
APIC ID      = 0
currentldt   = 0x30
db> trace
Tracing pid 45092 tid 100201 td 0xc290d190
knote(c264e098,0,0,c290d190,e73459c4) at knote+0x27
ttwwakeup(c264e000) at ttwwakeup+0xc8
comstart(c264e000) at comstart+0x385
comparam(c264e000,c264e0a4,c264e000,3,0) at comparam+0x253
sioopen(c079f060,3,2000,c290d190,c078e6a0) at sioopen+0x1df
spec_open(e7345a84,e7345b40,c058d585,e7345a84,180) at spec_open+0x2b6
spec_vnoperate(e7345a84) at spec_vnoperate+0x13
vn_open_cred(e7345be4,e7345ce4,c08,c2261d80,0) at vn_open_cred+0x419
vn_open(e7345be4,e7345ce4,c08,0,c4289b58) at vn_open+0x1e
kern_open(c290d190,804f8e0,0,3,bfbfee18) at kern_open+0xe3
open(c290d190,e7345d14,3,0,292) at open+0x18
syscall(2f,2f,2f,804f8e0,0) at syscall+0x27b
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (5, FreeBSD ELF32, open), eip = 0x280d155b, esp = 0xbfbfedec, ebp = 0xbfbfee18 ---

The ps
list is a bit boring, but the primary interesting thing is that it looks
like the close was going on in one thread just about when the sio swi was
scheduled to run also:

db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
45092 c6762388 e7387000    0     1     1 0004000 [CPU 0] getty
...
  132 c235954c e4fbf000    0     0     0 000020c [RUNQ] swi5: clock sio

I didn't have a kernel with debugging symbols on-hand, but the above
address in knote() is a cmpxchg early in the function, which means it's
likely the conditional call to mtx_lock() hitting a NULL mutex pointer for
kl_lock.  This in turn suggests that something has called ttyrel/tty_close
on the TTY in a race with the open, or otherwise NULL'd that pointer via
knlist_destroy().

Anyone have any pointers on this one?  The TTY code is not my forte...

Robert N M Watson
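
For anyone following the reasoning above about why a fault address of
0x1c points at a NULL lock pointer, here is a small user-space sketch.
It is only a model under stated assumptions: struct model_mtx,
struct model_knlist, model_destroy() and model_lock_word_addr() are
hypothetical names, and the padding that puts the lock word at offset
0x1c is chosen to match the displacement in the faulting instruction,
not taken from the real struct mtx layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a mutex.  The padding places the lock word
 * at offset 0x1c to mirror "cmpxchgl %ecx,0x1c(%edx)"; this is an
 * assumption for illustration, not the kernel's actual structure.
 */
struct model_mtx {
    uint8_t  pad[0x1c];
    uint32_t lock_word;
};

/* Hypothetical stand-in for a knote list that points at its lock. */
struct model_knlist {
    struct model_mtx *kl_lock;
};

/*
 * Models teardown on the close path: the lock pointer is cleared, as
 * the message speculates a destroy routine might do.
 */
static void
model_destroy(struct model_knlist *kl)
{
    assert(kl != NULL);
    kl->kl_lock = NULL;
}

/*
 * Address that a lock acquisition would write to: the mutex base plus
 * the offset of the lock word.  Nothing is dereferenced here, so the
 * model is safe to call even after model_destroy().
 */
static uintptr_t
model_lock_word_addr(const struct model_knlist *kl)
{
    assert(kl != NULL);
    return (uintptr_t)kl->kl_lock + offsetof(struct model_mtx, lock_word);
}
```

With a live lock, model_lock_word_addr() returns an address inside the
mutex.  After model_destroy(), the base is NULL and the computed address
collapses to the bare field offset, which is the pattern the trap output
shows: a supervisor write to a small address equal to the displacement
in the instruction.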