From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Andriy Gapon"
Cc: freebsd-stable@FreeBSD.org
Date: Mon, 15 Aug 2011 17:13:43 +0100
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE

----- Original Message -----
From: "Andriy Gapon"
To: "Steven Hartland"
Cc:
Sent: Monday, August 15, 2011 4:36 PM
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE

> on 15/08/2011 17:56 Steven Hartland said the following:
>>
>> ----- Original Message -----
>> From: "Andriy Gapon"
>> To: "Steven Hartland"
>> Cc:
>> Sent: Monday, August 15, 2011 2:20 PM
>> Subject: Re: debugging frequent kernel panics on 8.2-RELEASE
>>
>>> on 15/08/2011 15:51 Steven Hartland said the following:
>>>> ----- Original Message -----
>>>> From: "Andriy Gapon"
>>>>
>>>>> on 15/08/2011 13:34 Steven Hartland said the following:
>>>>>> (kgdb) list *0xffffffff8053b691
>>>>>> 0xffffffff8053b691 is in vm_fault (/usr/src/sys/vm/vm_fault.c:239).
>>>>>> 234             /*
>>>>>> 235              * Find the backing store object and offset into it to begin the
>>>>>> 236              * search.
>>>>>> 237              */
>>>>>> 238             fs.map = map;
>>>>>> 239             result = vm_map_lookup(&fs.map, vaddr, fault_type, &fs.entry,
>>>>>> 240                 &fs.first_object, &fs.first_pindex, &prot, &wired);
>>>>>> 241             if (result != KERN_SUCCESS) {
>>>>>> 242                     if (result != KERN_PROTECTION_FAILURE ||
>>>>>> 243                         (fault_flags & VM_FAULT_WIRE_MASK) != VM_FAULT_USER_WIRE) {
>>>>>>
>>>>>
>>>>> Interesting... thanks!
> [snip]
>> (kgdb) x/512a 0xffffff8d8f357210
>
> This is not conclusive, but that stack looks like the following recursive chain:
> vm_fault -> {vm_map_lookup, vm_map_growstack} -> trap -> trap_pfault -> vm_fault
> So I suspect that increasing kernel stack size won't help here much.
> Where does this chain come from? I have no answer at the moment, maybe other
> developers could help here. I suspect that we shouldn't be getting that trap in
> vm_map_growstack or should handle it in a different way.

Just in case it's relevant, I've checked other crashes and all rip entries point to:
vm_fault (/usr/src/sys/vm/vm_fault.c:239).

A more typical layout from a selection of machines is:-

Unread portion of the kernel message buffer:
Fatal double fault
rip = 0xffffffff8053b061
rsp = 0xffffff86ccf8ffb0
rbp = 0xffffff86ccf90210
cpuid = 8; apic id = 10
panic: double fault
cpuid = 8
KDB: stack backtrace:
#0 0xffffffff803bb28e at kdb_backtrace+0x5e
#1 0xffffffff80389187 at panic+0x187
#2 0xffffffff8057fc86 at dblfault_handler+0x96
#3 0xffffffff805689dd at Xdblfault+0xad
Uptime: 2d21h25m4s
Physical memory: 24555 MB
Dumping 4184 MB:...

----

Unread portion of the kernel message buffer:
Fatal double fault
rip = 0xffffffff8053b061
rsp = 0xffffff86cc742fb0
rbp = 0xffffff86cc743210
cpuid = 8; apic id = 10
panic: double fault
cpuid = 8
KDB: stack backtrace:
#0 0xffffffff803bb28e at kdb_backtrace+0x5e
#1 0xffffffff80389187 at panic+0x187
#2 0xffffffff8057fc86 at dblfault_handler+0x96
#3 0xffffffff805689dd at Xdblfault+0xad
Uptime: 2d4h30m58s
Physical memory: 24555 MB
Dumping 5088 MB:...

----

Fatal double fault
rip = 0xffffffff8053b061
rsp = 0xffffff86caeabfb0
rbp = 0xffffff86caeac210
cpuid = 8; apic id = 10
panic: double fault
cpuid = 8
KDB: stack backtrace:
#0 0xffffffff803bb28e at kdb_backtrace+0x5e
#1 0xffffffff80389187 at panic+0x187
#2 0xffffffff8057fc86 at dblfault_handler+0x96
#3 0xffffffff805689dd at Xdblfault+0xad
Uptime: 3d1h56m45s
Physical memory: 24555 MB
Dumping 4690 MB:...

----

Fatal double fault
rip = 0xffffffff8053b061
rsp = 0xffffff86cb1c7fb0
rbp = 0xffffff86cb1c8210
cpuid = 4; apic id = 04
panic: double fault
cpuid = 4
KDB: stack backtrace:
#0 0xffffffff803bb28e at kdb_backtrace+0x5e
#1 0xffffffff80389187 at panic+0x187
#2 0xffffffff8057fc86 at dblfault_handler+0x96
#3 0xffffffff805689dd at Xdblfault+0xad
Uptime: 1d13h41m19s
Physical memory: 24555 MB
Dumping 3626 MB:...

And in case any of the changes to loader.conf or sysctl.conf are relevant, here they are:-

[loader.conf]
zfs_load="YES"
vfs.root.mountfrom="zfs:tank/root"

# fix swap zone exhausted, increase kern.maxswzone
kern.maxswzone=67108864

# Reduce the minimum arc level; we want our apps to have the memory
vfs.zfs.arc_min="512M"
[/loader.conf]

[sysctl.conf]
vfs.read_max=32
net.inet.tcp.inflight.enable=0
net.inet.tcp.sendspace=65536
kern.ipc.maxsockbuf=524288
kern.maxfiles=50000
kern.ipc.nmbclusters=51200
[/sysctl.conf]

Regards
Steve
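
PS: for anyone wanting to repeat the check on their own dumps, this is
roughly how the rip values above were resolved; a sketch only, substitute
the path to the debug kernel that matches the running one:

  # kgdb /path/to/kernel.debug /var/crash/vmcore.0
  (kgdb) list *0xffffffff8053b061

Every rip in the panics above resolves to the same place, the
vm_map_lookup() call in vm_fault (/usr/src/sys/vm/vm_fault.c:239).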
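PPS: since increasing the kernel stack size was mentioned, for reference
that would mean rebuilding with a larger KSTACK_PAGES in the kernel
config, e.g. (the amd64 default is 4; 5 here is just an illustrative
value):

  options KSTACK_PAGES=5

though, as Andriy says above, if the fault chain really does recurse
without bound, a bigger stack would only postpone the double fault rather
than prevent it.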