From: gnn@freebsd.org
To: amd64@freebsd.org
Date: Tue, 29 Jan 2008 21:32:16 +0900
Subject: Recent problems with 6-STABLE...

Hi,

I have two boxes running 6-STABLE, post-6.3 release, both of which have
spontaneously rebooted, one under load and one not. I have attached the
dmesg and some traceback information from the one trace that looked
interesting. Any thoughts or hints would be appreciated.

To save you scanning the whole dmesg first: these are dual-processor Xeon
boxes, and each processor has 4 cores.

Best,
George

Trace:

(kgdb) where
#0  doadump () at pcpu.h:172
#1  0x0000000000000004 in ?? ()
#2  0xffffffff802e38b7 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff802e3f51 in panic (fmt=0xffffff01b5318260 "X³-´\001ÿÿÿ°\232óz")
    at /usr/src/sys/kern/kern_shutdown.c:565
#4  0xffffffff8047263f in trap_fatal (frame=0xffffff01b5318260,
    eva=18446742981515785048) at /usr/src/sys/amd64/amd64/trap.c:669
#5  0xffffffff80472b42 in trap (frame=
      {tf_rdi = 0, tf_rsi = -1092176739744, tf_rdx = 32491047111385957, tf_rcx = 56752, tf_r8 = -2140952488, tf_r9 = -2140952488, tf_rax = 1, tf_rbx = 4, tf_rbp = -1092193766568, tf_r10 = 34365284352, tf_r11 = -2140952488, tf_r12 = 0, tf_r13 = -1092193766568, tf_r14 = 0, tf_r15 = -1092176739744, tf_trapno = 9, tf_addr = 0, tf_flags = -1157743792, tf_err = 0, tf_rip = -2144770494, tf_cs = 8, tf_rflags = 66054, tf_rsp = -1157743920, tf_ss = 16})
    at /usr/src/sys/amd64/amd64/trap.c:470
#6  0xffffffff80459d3b in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:168
#7  0xffffffff80296642 in pfs_exit (arg=0x0, p=0xffffff01b42db358)
    at /usr/src/sys/fs/pseudofs/pseudofs_vncache.c:288
#8  0xffffffff802c355f in exit1 (td=0xffffff01b5318260, rv=0)
    at /usr/src/sys/kern/kern_exit.c:230
#9  0xffffffff802c48ee in sys_exit (td=0x0, uap=0xffffff01b5318260)
    at /usr/src/sys/kern/kern_exit.c:99
#10 0xffffffff80473531 in syscall (frame=
      {tf_rdi = 0, tf_rsi = 0, tf_rdx = 4285347, tf_rcx = 34365284352, tf_r8 = 0, tf_r9 = 0, tf_rax = 1, tf_rbx = 140737488342848, tf_rbp = 0, tf_r10 = 0, tf_r11 = 0, tf_r12 = 140737488349113, tf_r13 = 0, tf_r14 = 0, tf_r15 = 140737488343208, tf_trapno = 12, tf_addr = 5368856, tf_flags = 12, tf_err = 2, tf_rip = 34369792716, tf_cs = 43, tf_rflags = 518, tf_rsp = 140737488342376, tf_ss = 35})
    at /usr/src/sys/amd64/amd64/trap.c:807
#11 0xffffffff80459f38 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:287
#12 0x0000000800996acc in ?? ()
Previous frame inner to this frame (corrupt stack?)

Trap/Panic:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 3; apic id = 03
instruction pointer = 0x8:0xffffffff80296642
stack pointer = 0x10:0xffffffffbafe3ac0
frame pointer = 0x10:0xffffff01b42db358
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 56752 (sh)
trap number = 9
panic: general protection fault
cpuid = 3
Uptime: 1d13h26m9s
Dumping 8190 MB (3 chunks)

Dmesg:

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.3-STABLE #1: Wed Jan 23 19:01:23 EST 2008
    root@ntradee56:/usr/obj/usr/src/sys/LOCAL
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU X5365 @ 3.00GHz (3000.12-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x6fb  Stepping = 11
  Features=0xbfebfbff
  Features2=0x4e3bd
  AMD Features=0x20100800
  AMD Features2=0x1
  Cores per package: 4
real memory  = 9395240960 (8960 MB)
avail memory = 8293945344 (7909 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
 cpu4 (AP): APIC ID: 4
 cpu5 (AP): APIC ID: 5
 cpu6 (AP): APIC ID: 6
 cpu7 (AP): APIC ID: 7
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: on acpi0
acpi_throttle0: on cpu0
cpu1: on acpi0
acpi_throttle1: on cpu1
acpi_throttle1: failed to attach P_CNT
device_attach: acpi_throttle1 attach returned 6
cpu2: on acpi0
acpi_throttle2: on cpu2
acpi_throttle2: failed to attach P_CNT
device_attach: acpi_throttle2 attach returned 6
cpu3: on acpi0
acpi_throttle3: on cpu3
acpi_throttle3: failed to attach P_CNT
device_attach: acpi_throttle3 attach returned 6
cpu4: on acpi0
acpi_throttle4: on cpu4
acpi_throttle4: failed to attach P_CNT
device_attach: acpi_throttle4 attach returned 6
cpu5: on acpi0
acpi_throttle5: on cpu5
acpi_throttle5: failed to attach P_CNT
device_attach: acpi_throttle5 attach returned 6
cpu6: on acpi0
acpi_throttle6: on cpu6
acpi_throttle6: failed to attach P_CNT
device_attach: acpi_throttle6 attach returned 6
cpu7: on acpi0
acpi_throttle7: on cpu7
acpi_throttle7: failed to attach P_CNT
device_attach: acpi_throttle7 attach returned 6
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: at device 2.0 on pci0
pci1: on pcib1
pcib2: irq 16 at device 0.0 on pci1
pci2: on pcib2
pcib3: irq 16 at device 0.0 on pci2
pci3: on pcib3
pcib4: irq 18 at device 2.0 on pci2
pci4: on pcib4
em0: port 0x2000-0x201f mem 0xd8020000-0xd803ffff,0xd8000000-0xd801ffff irq 18 at device 0.0 on pci4
em0: Using MSI interrupt
em0: Ethernet address: 00:30:48:62:97:1e
em1: port 0x2020-0x203f mem 0xd8060000-0xd807ffff,0xd8040000-0xd805ffff irq 19 at device 0.1 on pci4
em1: Using MSI interrupt
em1: Ethernet address: 00:30:48:62:97:1f
pcib5: at device 0.3 on pci1
pci5: on pcib5
ahd0: port 0x3400-0x34ff,0x3000-0x30ff mem 0xd8100000-0xd8101fff irq 30 at device 3.0 on pci5
ahd0: [GIANT-LOCKED]
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
ahd1: port 0x3c00-0x3cff,0x3800-0x38ff mem 0xd8102000-0xd8103fff irq 31 at device 3.1 on pci5
ahd1: [GIANT-LOCKED]
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
pcib6: at device 4.0 on pci0
pci6: on pcib6
pcib7: at device 6.0 on pci0
pci7: on pcib7
pci0: at device 8.0 (no driver attached)
uhci0: port 0x1800-0x181f irq 17 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0x1820-0x183f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0x1840-0x185f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: port 0x1860-0x187f irq 16 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
usb3: on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0: mem 0xd8600000-0xd86003ff irq 17 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: on ehci0
usb4: USB revision 2.0
uhub4: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
pcib8: at device 30.0 on pci0
pci8: on pcib8
pci8: at device 1.0 (no driver attached)
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1880-0x188f at device 31.1 on pci0
ata0: on atapci0
ata1: on atapci0
atapci1: port 0x18c0-0x18c7,0x1894-0x1897,0x1898-0x189f,0x1890-0x1893,0x18a0-0x18bf mem 0xd8600400-0xd86007ff irq 19 at device 31.2 on pci0
atapci1: AHCI Version 01.10 controller with 6 ports detected
ata2: on atapci1
ata3: on atapci1
ata4: on atapci1
ata5: on atapci1
ata6: on atapci1
ata7: on atapci1
pci0: at device 31.3 (no driver attached)
acpi_button0: on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
orm0: at iomem 0xc0000-0xcafff,0xcb000-0xcbfff on isa0
ppc0: cannot reserve I/O port range
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: CDRW at ata0-slave UDMA33
Waiting 5 seconds for SCSI devices to settle
ses0 at ahd0 bus 0 target 6 lun 0
ses0: Fixed Processor SCSI-2 device
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #6 Launched!
da0 at ahd0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-3 device
da0: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 140014MB (286749488 512 byte sectors: 255H 63S/T 17849C)
Trying to mount root from ufs:/dev/da0s1a
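
A note for anyone reading the trace: kgdb prints the trap-frame registers as signed decimals, so the tf_rip value in frame #5 does not obviously match the instruction pointer in the panic message. The following is only a minimal sketch, assuming 64-bit two's-complement register values; the helper name frame_value_to_hex is made up for illustration and is not part of kgdb or FreeBSD.

```python
# Reinterpret a signed decimal from a kgdb trap-frame dump as a
# 64-bit address (assumes two's-complement, 64-bit registers).
MASK64 = (1 << 64) - 1


def frame_value_to_hex(value: int) -> str:
    """Return the value as a zero-padded 64-bit hexadecimal string."""
    return f"0x{value & MASK64:016x}"


if __name__ == "__main__":
    # tf_rip from frame #5 of the trace
    print(frame_value_to_hex(-2144770494))
```

Applied to tf_rip = -2144770494, this gives 0xffffffff80296642, which is the faulting instruction pointer reported in the panic and the address kgdb shows for pfs_exit in frame #7.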