From owner-freebsd-stable@FreeBSD.ORG Fri Jan 5 12:13:33 2007
Date: Fri, 5 Jan 2007 12:13:30 +0000 (GMT)
From: Robert Watson
To: Ceri Davies
Cc: stable@FreeBSD.org
Subject: Re: (audit?) Panic in 6.2-PRERELEASE
In-Reply-To: <20070105111954.GA51511@submonkey.net>
Message-ID: <20070105120539.H46119@fledge.watson.org>
References: <20070105111954.GA51511@submonkey.net>

On Fri, 5 Jan 2007, Ceri Davies wrote:

> For the last two mornings, my system decided to panic() in the exact same
> place.  I have dumps from both but they are almost exactly the same.  Any
> pointers on where to go next are welcomed.
>
> Here's the first, and I don't see much in there:

In principle, kern_fstat() should not call audit_arg_auditon(), so either
we're looking at a compile problem or at stack corruption.  Am I correct in
thinking that this is running on a cyrus server?  Much as I would love to
trust the contents of ub there, I suspect they can't be trusted.

Could you print the contents of *fp in kern_fstat() in both of those stacks?
I'd particularly like to know the value of fp->f_type, and then depending on
the type, possibly the contents of *(struct vnode *)fp->f_vnode for
DTYPE_VNODE/DTYPE_FIFO or *(struct socket *)fp->f_data in the case of
DTYPE_SOCKET.

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> {root@shrike}-{~} # uname -a
> FreeBSD shrike.private.submonkey.net 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #69: Fri Dec 29 00:25:52 GMT 2006 root@shrike.private.submonkey.net:/usr/obj/usr/src/sys/SHRIKE i386
> {root@shrike}-{~} # kgdb /usr/obj/usr/src/sys/SHRIKE/kernel.debug /var/crash/vmcore.29
> kgdb: kvm_nlist(_stopped_cpus):
> kgdb: kvm_nlist(_stoppcbs):
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x53892047
> fault code              = supervisor write, page not present
> instruction pointer     = 0x20:0xc05cda7c
> stack pointer           = 0x28:0xd610dc48
> frame pointer           = 0x28:0xd610dc60
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 65381 (imapd)
> trap number             = 12
> panic: page fault
> Uptime: 5d19h44m40s
> Dumping 503 MB (2 chunks)
>   chunk 0: 1MB (160 pages) ... ok
>   chunk 1: 503MB (128752 pages) 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7
>
> #0  doadump () at pcpu.h:165
> 165     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) where
> #0  doadump () at pcpu.h:165
> #1  0xc04e85aa in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2  0xc04e8840 in panic (fmt=0xc066f61a "%s") at /usr/src/sys/kern/kern_shutdown.c:565
> #3  0xc0653ed4 in trap_fatal (frame=0xd610dc08, eva=1401495623)
>     at /usr/src/sys/i386/i386/trap.c:837
> #4  0xc0653c3b in trap_pfault (frame=0xd610dc08, usermode=0, eva=1401495623)
>     at /usr/src/sys/i386/i386/trap.c:745
> #5  0xc0653899 in trap (frame=
>       {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = -1024544384, tf_esi = -1024544384, tf_ebp = -703538080, tf_isp = -703538124, tf_ebx = 0, tf_edx = -703538092, tf_ecx = 4, tf_eax = 0, tf_trapno = 12, tf_err = 2, tf_eip = -1067656580, tf_cs = 32, tf_eflags = 66050, tf_esp = -1068742797, tf_ss = -1022955520}) at /usr/src/sys/i386/i386/trap.c:435
> #6  0xc064287a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #7  0xc05cda7c in audit_arg_auditon () at /usr/src/sys/security/audit/audit_arg.c:586
> #8  0xc04c470d in fstat (td=0xc2eeb180, uap=0xd610dc74) at /usr/src/sys/kern/kern_descrip.c:1075
> #9  0xc0654203 in syscall (frame=
>       {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = -1077949408, tf_esi = 135666752, tf_ebp = -1077949448, tf_isp = -703537820, tf_ebx = 135432156, tf_edx = -1077949112, tf_ecx = 135826416, tf_eax = 189, tf_trapno = 0, tf_err = 2, tf_eip = 675755895, tf_cs = 51, tf_eflags = 662, tf_esp = -1077949732, tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:983
> #10 0xc06428cf in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
> #11 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) up 8
> #8  0xc04c470d in fstat (td=0xc2eeb180, uap=0xd610dc74) at /usr/src/sys/kern/kern_descrip.c:1075
> 1075            error = kern_fstat(td, uap->fd, &ub);
> (kgdb) p ub
> $1 = {st_dev = 89, st_ino = 1907905, st_mode = 33152, st_nlink = 1, st_uid = 60, st_gid = 60,
>   st_rdev = 7624272, st_atimespec = {tv_sec = 1167893059, tv_nsec = -703537996}, st_mtimespec = {
>   tv_sec = -703537916, tv_nsec = -1024544384}, st_ctimespec = {tv_sec = 43018, tv_nsec = 43018},
>   st_size = -3021672509244264064, st_blocks = -1067658896, st_blksize = 43018, st_flags = 4,
>   st_gen = 3, st_lspare = 0, st_birthtimespec = {tv_sec = -1, tv_nsec = 4}}
> (kgdb) p td
> $2 = (struct thread *) 0xc2eeb180
> (kgdb) p uap->fd
> $3 = 89
> (kgdb)
>
> The second one seems more promising, in that the fd seems to be rubbish.
>
> {root@shrike}-{~} # kgdb /usr/obj/usr/src/sys/SHRIKE/kernel.debug /var/crash/vmcore.30
> kgdb: kvm_nlist(_stopped_cpus):
> kgdb: kvm_nlist(_stoppcbs):
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x53892047
> fault code              = supervisor write, page not present
> instruction pointer     = 0x20:0xc05cda7c
> stack pointer           = 0x28:0xd617ec48
> frame pointer           = 0x28:0xd617ec60
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 9943 (imapd)
> trap number             = 12
> panic: page fault
> Uptime: 22h39m3s
> Dumping 503 MB (2 chunks)
>   chunk 0: 1MB (160 pages) ... ok
>   chunk 1: 503MB (128752 pages) 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7
>
> #0  doadump () at pcpu.h:165
> 165     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) where
> #0  doadump () at pcpu.h:165
> #1  0xc04e85aa in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
> #2  0xc04e8840 in panic (fmt=0xc066f61a "%s") at /usr/src/sys/kern/kern_shutdown.c:565
> #3  0xc0653ed4 in trap_fatal (frame=0xd617ec08, eva=1401495623)
>     at /usr/src/sys/i386/i386/trap.c:837
> #4  0xc0653c3b in trap_pfault (frame=0xd617ec08, usermode=0, eva=1401495623)
>     at /usr/src/sys/i386/i386/trap.c:745
> #5  0xc0653899 in trap (frame=
>       {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = -1022323968, tf_esi = -1022323968, tf_ebp = -703075232, tf_isp = -703075276, tf_ebx = 0, tf_edx = -703075244, tf_ecx = 4, tf_eax = 0, tf_trapno = 12, tf_err = 2, tf_eip = -1067656580, tf_cs = 32, tf_eflags = 66050, tf_esp = -1068742797, tf_ss = -1022327760}) at /usr/src/sys/i386/i386/trap.c:435
> #6  0xc064287a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> #7  0xc05cda7c in audit_arg_auditon () at /usr/src/sys/security/audit/audit_arg.c:586
> #8  0xc04c470d in fstat (td=0xc3109300, uap=0xd617ec74) at /usr/src/sys/kern/kern_descrip.c:1075
> #9  0xc0654203 in syscall (frame=
>       {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 135488384, tf_esi = -1077948560, tf_ebp = -1077948888, tf_isp = -703074972, tf_ebx = 135432156, tf_edx = -1077948712, tf_ecx = 25, tf_eax = 189, tf_trapno = 0, tf_err = 2, tf_eip = 675755895, tf_cs = 51, tf_eflags = 662, tf_esp = -1077949124, tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:983
> #10 0xc06428cf in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200
> #11 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) up 8
> #8  0xc04c470d in fstat (td=0xc3109300, uap=0xd617ec74) at /usr/src/sys/kern/kern_descrip.c:1075
> 1075            error = kern_fstat(td, uap->fd, &ub);
> (kgdb) p uap->fd
> $1 = -1023449232
> (kgdb)
>
> Ceri
> --
> That must be wonderful!  I don't understand it at all.
>                                                   -- Moliere
>