From: Micah
Date: Thu, 26 Oct 2006 13:59:22 -0700
To: Kostik Belousov
Cc: FreeBSD Hackers
Subject: Re: System panic under load (additional information)
Message-ID: <454121AA.9020703@ywave.com>
In-Reply-To: <20061026180455.GG45605@deviant.kiev.zoral.com.ua>

Kostik Belousov wrote:
> On Thu, Oct 26, 2006 at 09:22:43AM -0700, Micah wrote:
>> Kostik Belousov wrote:
>>> I saw several similar reports.
>>>
>>> Please send me the output of "print *mp" in the same frame.
>>> Also, I'm interested in the kernel config.
>>>
>>> Is the problem reproducible?
>> It seems now that any time I compile OpenOffice the system will
>> eventually panic. Other disk-intensive jobs, like my nightly photo-album
>> update, may or may not trigger it. I currently have two dumps for
>> 6.1p10. I thought this might be failing hardware because the system has
>> worked fine for nearly a year, but the fact that it panics on the same
>> line of code every time makes me wonder.
>
> I'm very suspicious of the claim of failing hw, since the trace is the
> same every time (is this true?). This looks like memory corruption.

They're not identical; see below. I didn't compare the traces until now
since the proximate cause was the same line of code.

> First, I would recommend updating to RELENG_6 due to a number of VFS fixes.
> Second, could you set kern.maxvnodes in /boot/loader.conf? Check the
> value chosen by the kernel with "sysctl kern.maxvnodes", then set it
> to 2/3 of the reported number and reboot.

I'll try the update by itself first and see whether it helps.

> If I do not gather any useful info from that, I will most likely provide
> you with a debugging patch.
>
> --
> Kostik Belousov.

Thanks for the help,

trisha# kgdb -q kernel.debug /home/crash/vmcore.3
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x5ccc0028
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0593d34
stack pointer           = 0x28:0xf042c7ac
frame pointer           = 0x28:0xf042c7cc
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 55305 (bash)
trap number             = 12
panic: page fault
Uptime: 1h0m55s
Dumping 1534 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 1534MB (392672 pages) 1518 1502 1486 1470 1454 1438 1422 1406
1390 1374 1358 1342 1326 1310 1294 1278 1262 1246 1230 1214 1198 1182 1166
1150 1134 1118 1102 1086 1070 1054 1038 1022 1006 990 974 958 942 926 910
894 878 862 846 830 814 798 782 766 750 734 718 702 686 670 654 638 622 606
590 574 558 542 526 510 494 478 462 446 430 414 398 382 366 350 334 318 302
286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14

#0  doadump () at pcpu.h:165
165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc0535a3f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:402
#2  0xc0535d66 in panic (fmt=0xc0711f66 "%s")
    at /usr/src/sys/kern/kern_shutdown.c:558
#3  0xc06ed08c in trap_fatal (frame=0xf042c76c, eva=0)
    at /usr/src/sys/i386/i386/trap.c:836
#4  0xc06ecd92 in trap_pfault (frame=0xf042c76c, usermode=0, eva=1556873256)
    at /usr/src/sys/i386/i386/trap.c:744
#5  0xc06ec95d in trap (frame=
      {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = 4, tf_esi = 4,
       tf_ebp = -264058932, tf_isp = -264058984, tf_ebx = 1556873216,
       tf_edx = -980369408, tf_ecx = -979030016, tf_eax = 10436743,
       tf_trapno = 12, tf_err = 0, tf_eip = -1067893452, tf_cs = 32,
       tf_eflags = 66054, tf_esp = -979030016, tf_ss = 10436743})
    at /usr/src/sys/i386/i386/trap.c:434
#6  0xc06d9f8a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc0593d34 in vfs_hash_get (mp=0xc5a53000, hash=10436743, flags=2,
    td=0xc8409180, vpp=0xf042c8b4, fn=0, arg=0x0)
    at /usr/src/sys/kern/vfs_hash.c:73
#8  0xc067b0b9 in ffs_vget (mp=0xc5a53000, ino=10436743, flags=2,
    vpp=0xf042c8b4) at pcpu.h:162
#9  0xc065a9ee in ffs_valloc (pvp=0xc67d0770, mode=33188, cred=0xc643f480,
    vpp=0xf042c8b4) at /usr/src/sys/ufs/ffs/ffs_alloc.c:936
#10 0xc0689b4e in ufs_makeinode (mode=33188, dvp=0xc67d0770, vpp=0xf042cbd4,
    cnp=0xf042cbe8) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2181
#11 0xc0686756 in ufs_create (ap=0xc590c000)
    at /usr/src/sys/ufs/ufs/ufs_vnops.c:171
#12 0xc07010e3 in VOP_CREATE_APV (vop=0x9f4087, a=0xf042ca50)
    at vnode_if.c:204
#13 0xc05acca5 in vn_open_cred (ndp=0xf042cbc0, flagp=0xf042ccc0, cmode=420,
    cred=0xc643f480, fdidx=3) at vnode_if.h:111
#14 0xc05aca93 in vn_open (ndp=0xc5a53000, flagp=0x9f4087, cmode=10436743,
    fdidx=10436743) at /usr/src/sys/kern/vfs_vnops.c:91
#15 0xc05a3bd8 in kern_open (td=0xc8409180, path=0x9f4087 ,
    pathseg=10436743, flags=1538, mode=438)
    at /usr/src/sys/kern/vfs_syscalls.c:1002
#16 0xc05a3ad6 in open (td=0x9f4087, uap=0xf042cd04)
    at /usr/src/sys/kern/vfs_syscalls.c:968
#17 0xc06ed462 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 0, tf_esi = 135315904,
       tf_ebp = -1077955784, tf_isp = -264057500, tf_ebx = 1537,
       tf_edx = 135287968, tf_ecx = 13, tf_eax = 5, tf_trapno = 12,
       tf_err = 2, tf_eip = 1211056535, tf_cs = 51, tf_eflags = 582,
       tf_esp = -1077956084, tf_ss = 59})
---Type <return> to continue, or q <return> to quit---
    at /usr/src/sys/i386/i386/trap.c:981
#18 0xc06d9fdf in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:200
#19 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

trisha# kgdb -q kernel.debug /home/crash/vmcore.4
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x5ccc0028
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0593d34
stack pointer           = 0x28:0xf039389c
frame pointer           = 0x28:0xf03938bc
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 38817 (find)
trap number             = 12
panic: page fault
Uptime: 47m24s
Dumping 1534 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 1534MB (392672 pages) 1518 1502 1486 1470 1454 1438 1422 1406
1390 1374 1358 1342 1326 1310 1294 1278 1262 1246 1230 1214 1198 1182 1166
1150 1134 1118 1102 1086 1070 1054 1038 1022 1006 990 974 958 942 926 910
894 878 862 846 830 814 798 782 766 750 734 718 702 686 670 654 638 622 606
590 574 558 542 526 510 494 478 462 446 430 414 398 382 366 350 334 318 302
286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14

#0  doadump () at pcpu.h:165
165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc0535a3f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:402
#2  0xc0535d66 in panic (fmt=0xc0711f66 "%s")
    at /usr/src/sys/kern/kern_shutdown.c:558
#3  0xc06ed08c in trap_fatal (frame=0xf039385c, eva=0)
    at /usr/src/sys/i386/i386/trap.c:836
#4  0xc06ecd92 in trap_pfault (frame=0xf039385c, usermode=0, eva=1556873256)
    at /usr/src/sys/i386/i386/trap.c:744
#5  0xc06ec95d in trap (frame=
      {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = 4, tf_esi = 4,
       tf_ebp = -264685380, tf_isp = -264685432, tf_ebx = 1556873216,
       tf_edx = -980381696, tf_ecx = -980383744, tf_eax = 9825235,
       tf_trapno = 12, tf_err = 0, tf_eip = -1067893452, tf_cs = 32,
       tf_eflags = 66054, tf_esp = -980383744, tf_ss = 9825235})
    at /usr/src/sys/i386/i386/trap.c:434
#6  0xc06d9f8a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc0593d34 in vfs_hash_get (mp=0xc5908800, hash=9825235, flags=2,
    td=0xc7915480, vpp=0xf039399c, fn=0, arg=0x0)
    at /usr/src/sys/kern/vfs_hash.c:73
#8  0xc067b0b9 in ffs_vget (mp=0xc5908800, ino=9825235, flags=2,
    vpp=0xf039399c) at pcpu.h:162
#9  0xc06834b3 in ufs_lookup (ap=0xf0393a40)
    at /usr/src/sys/ufs/ufs/ufs_lookup.c:572
#10 0xc0701063 in VOP_CACHEDLOOKUP_APV (vop=0x95ebd3, a=0xc5909000)
    at vnode_if.c:150
#11 0xc059006a in vfs_cache_lookup (ap=0x95ebd3) at vnode_if.h:82
#12 0xc0700fd8 in VOP_LOOKUP_APV (vop=0xc0765080, a=0xf0393aec)
    at vnode_if.c:99
#13 0xc059547b in lookup (ndp=0xf0393b94) at vnode_if.h:56
#14 0xc0594c08 in namei (ndp=0xf0393b94)
    at /usr/src/sys/kern/vfs_lookup.c:203
#15 0xc05a68bf in kern_lstat (td=0xc7915480,
    path=0xc5909000 "pW2È\220©|Æ\220Y\204ÆÐMúÆ :KÇÀÜeÆ\200\210ùÈ\220Ù\035Ç :\002ÇP¥ùÆ@t:ÈPu:È R³Æ \032ÀÆ\220y&Æ zæÆ0\2031ÈÐ\215èÆ0ÃæÆ\200H\036ÇpW3È`\206\221ÆÀ\234ýÇ",
    pathseg=3314585600, sbp=0x95ebd3)
    at /usr/src/sys/kern/vfs_syscalls.c:2125
#16 0xc05a683f in lstat (td=0x95ebd3, uap=0xf0393d04)
    at /usr/src/sys/kern/vfs_syscalls.c:2109
#17 0xc06ed462 in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 134582344,
       tf_esi = 134582272, tf_ebp = -1077951640, tf_isp = -264684188,
       tf_ebx = 672453672, tf_edx = 134582272, tf_ecx = 134561792,
       tf_eax = 190, tf_trapno = 0, tf_err = 2, tf_eip = 672342295,
       tf_cs = 51, tf_eflags = 582, tf_esp = -1077951796, tf_ss = 59})
---Type <return> to continue, or q <return> to quit---
    at /usr/src/sys/i386/i386/trap.c:981
#18 0xc06d9fdf in Xint0x80_syscall ()
    at /usr/src/sys/i386/i386/exception.s:200
#19 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
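
A note on comparing the two dumps: kgdb prints the trap frame registers as signed decimals, which makes them hard to line up against the hex addresses in the panic message. The sketch below converts a few values copied from frame #5 of each backtrace. The helper name u32 and the dictionary layout are only illustrative, and reading "eva minus tf_ebx" as an offset from a base pointer is my assumption about what the faulting instruction does, not something the traces prove.

```python
def u32(value):
    """Reinterpret a signed 32-bit integer as its unsigned equivalent."""
    return value & 0xFFFFFFFF


# Register values copied verbatim from frame #5 (trap) and the eva
# argument of frame #4 (trap_pfault) in each backtrace.
dumps = {
    "vmcore.3": {"eva": 1556873256, "tf_ebx": 1556873216,
                 "tf_eip": -1067893452, "tf_eax": 10436743},
    "vmcore.4": {"eva": 1556873256, "tf_ebx": 1556873216,
                 "tf_eip": -1067893452, "tf_eax": 9825235},
}

for name, regs in dumps.items():
    # Assumption: the faulting load used tf_ebx as its base register,
    # so the difference is treated as a structure offset.
    offset = u32(regs["eva"]) - u32(regs["tf_ebx"])
    print(name,
          "eip=%#010x" % u32(regs["tf_eip"]),
          "eva=%#010x" % u32(regs["eva"]),
          "ebx=%#010x" % u32(regs["tf_ebx"]),
          "offset=%#x" % offset,
          "eax=%#x" % u32(regs["tf_eax"]))
```

Converted this way, tf_eip matches the instruction pointer in the panic message and eva matches the fault virtual address, and both come out the same in the two dumps. Only tf_eax differs, and it equals the hash/ino argument passed to vfs_hash_get and ffs_vget in each trace.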