Date: Mon, 23 Nov 2009 16:28:21 -0500 (EST)
From: Charles Sprickman <spork@bway.net>
To: stable@freebsd.org
Subject: Re: panic in 7.2 (ffs_alloc.c?)
Message-ID: <alpine.OSX.2.00.0911231625220.19128@hotlap.local>
In-Reply-To: <alpine.OSX.2.00.0911220037320.19128@hotlap.local>
References: <alpine.OSX.2.00.0911220037320.19128@hotlap.local>
Just a follow-up...

The machine was waiting for a manual fsck. This crash seemed to scramble
things up pretty good: it hit the jail partition hard and seemed to touch
others that were quiet at the time.

I'm re-running mstone with an even heavier load to see if I can reproduce
this.

Full verbose dmesg: http://pastie.org/711839

Should I bother with a PR or anything on this? It doesn't look like a
hardware issue to me. It seems like there could be a nasty bug waiting in
the UFS2 code somewhere. Does anyone want to pursue this at all? I have the
dump available for anyone who wants it.

Thanks,

Charles

On Sun, 22 Nov 2009, Charles Sprickman wrote:

> Howdy,
>
> I'm no expert at getting info out of a dump, but I'll do my best to provide
> some information.
>
> This is a Dell PE2970 w/PERC6/i RAID running FreeBSD 7.2/amd64. Brand new
> box, has been doing very light work for about two weeks. Last night I
> started a very long mstone run on a jailed mail server and found that quite a
> way into this burn-in, the box panicked. I was going to put it in service
> Monday (after punishing it all weekend). Looking for some input on what the
> root cause is and whether going to a -stable snapshot might be worthwhile.
>
> I can tell you there was a good deal of disk activity at the time in the jail
> - mstone was simulating 100 POP and SMTP clients hitting the machine at once.
> This is qmail+courier. So messages are coming in, hitting the queue, hitting
> a user's maildir, getting read and deleted via the POP "client" over and over
> again. I do see lots of "ffs_*" stuff in the backtrace, which is a little
> scary.
>
> Here's my stab at a kgdb session (also @ pastie for easier reading:
> http://pastie.org/709671):
>
> [root@bigmail /usr/obj/usr/src/sys/BWAY7-64]# kgdb kernel.debug
> /var/crash/vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x12d4b9f5c
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x8:0xffffffff8050382e
> stack pointer           = 0x10:0xffffffff281a75b0
> frame pointer           = 0x10:0xffffff000455f800
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 6324 (vdelivermail)
> trap number             = 12
> panic: page fault
> cpuid = 0
> Uptime: 12d0h32m3s
> Physical memory: 6130 MB
> Dumping 725 MB: 710 694 678 662 646 630 614 598 582 566 550 534 518 502 486
> 470 454 438 422 406 390 374 358 342 326 310 294 278 262 246 230 214 198 182
> 166 150 134 118 102 86 70 54 38 22 6
>
> Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from
> /boot/kernel/nullfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/nullfs.ko
> Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from
> /boot/kernel/fdescfs.ko.symbols...done.
> done.
> Loaded symbols for /boot/kernel/fdescfs.ko
> #0  doadump () at pcpu.h:195
> 195     __asm __volatile("movq %%gs:0,%0" : "=r" (td));
> #3  0xffffffff8034cba2 in panic (fmt=0x104 <Address 0x104 out of bounds>)
>     at /usr/src/sys/kern/kern_shutdown.c:574
> #4  0xffffffff80574823 in trap_fatal (frame=0xffffff00046c8000, eva=Variable
> "eva" is not available.
> )
>     at /usr/src/sys/amd64/amd64/trap.c:757
> #5  0xffffffff80574bf5 in trap_pfault (frame=0xffffffff281a7500, usermode=0)
>     at /usr/src/sys/amd64/amd64/trap.c:673
> #6  0xffffffff80575534 in trap (frame=0xffffffff281a7500)
>     at /usr/src/sys/amd64/amd64/trap.c:444
> #7  0xffffffff8055969e in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:209
> #8  0xffffffff8050382e in ffs_realloccg (ip=0xffffff00267f75c0, lbprev=0,
>     bprev=6288224785898156086, bpref=593305256, osize=0, nsize=2048,
>     flags=33619968, cred=0xffffff00927fe800, bpp=0xffffffff281a7800)
>     at /usr/src/sys/ufs/ffs/ffs_alloc.c:1349
> #9  0xffffffff80506e8e in ffs_balloc_ufs2 (vp=0xffffff0027a64dc8,
>     startoffset=Variable "startoffset" is not available.
> )
>     at /usr/src/sys/ufs/ffs/ffs_balloc.c:692
> #10 0xffffffff805223e5 in ffs_write (ap=0xffffffff281a7a10)
>     at /usr/src/sys/ufs/ffs/ffs_vnops.c:724
> #11 0xffffffff805a0645 in VOP_WRITE_APV (vop=0xffffffff80793d20,
>     a=0xffffffff281a7a10) at vnode_if.c:691
> #12 0xffffffff803dd731 in vn_write (fp=0xffffff001027cd00,
>     uio=0xffffffff281a7b00, active_cred=Variable "active_cred" is not
> available.
> ) at vnode_if.h:373
> #13 0xffffffff80388768 in dofilewrite (td=0xffffff00046c8000, fd=5,
>     fp=0xffffff001027cd00, auio=dwarf2_read_address: Corrupted DWARF
> expression.
> ) at file.h:257
> #14 0xffffffff80388a6e in kern_writev (td=0xffffff00046c8000, fd=5,
>     auio=0xffffffff281a7b00) at /usr/src/sys/kern/sys_generic.c:402
> #15 0xffffffff80388aec in write (td=0x800, uap=0x12d4b9f50)
>     at /usr/src/sys/kern/sys_generic.c:318
> #16 0xffffffff80596a66 in ia32_syscall (frame=0xffffffff281a7c80)
>     at /usr/src/sys/amd64/ia32/ia32_syscall.c:182
> #17 0xffffffff80559ad0 in Xint0x80_syscall () at ia32_exception.S:65
> #18 0x0000000028167928 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
>
> Full dmesg, verbose boot and kernel config at pastie as well. Actually no
> verbose boot... I rebooted the box after setting verbose boot with
> "nextboot" and it didn't come back. Hrmph. No remote console, so I don't
> know what's up, perhaps waiting on some manual fsck action.
>
> Thanks,
>
> Charles
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
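
The remark that the backtrace is full of "ffs_*" frames can be made concrete by matching the trap's instruction pointer against the frame addresses. The sketch below is illustrative only and not part of the original message: it is written in Python, the helper names trap_fields and faulting_frame are hypothetical, and the two input strings are short excerpts copied from the kgdb output quoted above.

```python
import re

# Excerpts copied from the kgdb session quoted above.
TRAP_TEXT = """\
fault virtual address   = 0x12d4b9f5c
instruction pointer     = 0x8:0xffffffff8050382e
current process         = 6324 (vdelivermail)
"""

BACKTRACE = """\
#7  0xffffffff8055969e in calltrap ()
#8  0xffffffff8050382e in ffs_realloccg (ip=0xffffff00267f75c0, lbprev=0,
#9  0xffffffff80506e8e in ffs_balloc_ufs2 (vp=0xffffff0027a64dc8,
"""


def trap_fields(text):
    """Return the 'name = value' pairs of a trap message as a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def faulting_frame(trap_text, backtrace):
    """Find the frame whose address equals the trap's instruction pointer."""
    ip = trap_fields(trap_text)["instruction pointer"]
    address = int(ip.split(":")[1], 16)  # drop the "0x8:" segment selector
    for match in re.finditer(r"^#(\d+)\s+(0x[0-9a-f]+) in (\w+)", backtrace, re.M):
        number, frame_addr, function = match.groups()
        if int(frame_addr, 16) == address:
            return int(number), function
    return None


print(faulting_frame(TRAP_TEXT, BACKTRACE))
```

Run as a script, this prints (8, 'ffs_realloccg'), meaning the fault was taken inside frame #8. It only identifies where the fault occurred and says nothing about the root cause.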