From: Dennis Glatting
To: freebsd-amd64@freebsd.org
Date: Sat, 26 Jan 2008 15:58:19 -0700
Message-Id: <1201388299.84900.12.camel@Sylvester.dco.penx.com>
Subject: Multi processor locking problem under 7.0
List-Id: Porting FreeBSD to the AMD64 platform

I have several systems of two different types running 7.0. One type is the IBM 3550 and the other the Dell 2950. The IBMs, more than the Dells, consistently seem to have a kernel locking problem during dump. Specifically, if I execute this command:

    dump 0uaLCf 64 /dev/null /usr

dump consistently stops in Phase IV. However, if I set machdep.hlt_logical_cpus=1, dump does not stop. My boot information is at the end of this message.

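The comparison above (dump stalls with machdep.hlt_logical_cpus=0 but not with 1) could be made repeatable with a small watchdog. What follows is only a sketch: `run_with_deadline` and the deadline values are hypothetical, chosen for illustration and not taken from dump or sysctl, and hitting the deadline merely suggests a stall rather than proving one.

```python
import subprocess


def run_with_deadline(cmd, deadline_s):
    """Run cmd and return (finished, returncode).

    finished is False (and returncode is None) if the deadline expired.
    Note: subprocess.run() kills only the direct child on timeout, so
    helper processes the command forked may need separate cleanup.
    """
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=deadline_s,
        )
    except subprocess.TimeoutExpired:
        return (False, None)
    return (True, done.returncode)


# Hypothetical usage on an affected host (needs root; not executed here):
#   run_with_deadline(["sysctl", "machdep.hlt_logical_cpus=0"], 10)
#   finished, rc = run_with_deadline(
#       ["dump", "0uaLCf", "64", "/dev/null", "/usr"], 1800)
#   print("possible stall" if not finished else f"exit status {rc}")
```

Running the same pair of calls once per sysctl setting would give a simple pass/fail record for each configuration.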
When machdep.hlt_logical_cpus=0, the following is typical of what top displays when dump stops:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
  926 root        1   4    0 75476K 71744K sbwait 0   0:04  0.00% dump
  928 root        1  20    0 75348K 67740K pause  1   0:02  0.00% dump
  929 root        1  20    0 75348K 67740K pause  1   0:02  0.00% dump
  927 root        1  20    0 75348K 67740K pause  1   0:02  0.00% dump
  919 root        1   8    0 75348K 67144K wait   0   0:00  0.00% dump

Fooling around a bit, I have found that if I truss dump, the dump continues. On the Dells, if I force disk activity during the dump, such as by running ls -lR /usr > /dev/null, the dump finishes.

I am unsure how to proceed in debugging this problem. It has been around for a while, but I am now installing the IBMs and the dump problem is a non-starter. Please contact me directly on how to proceed.

Thanks.

Marvin# dmesg
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-PRERELEASE #0: Sat Jan 26 12:31:52 CST 2008
    root@Marvin.pki2.com:/usr/src/sys/amd64/compile/MARVIN
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5335 @ 2.00GHz (1995.01-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x6fb  Stepping = 11
  Features=0xbfebfbff
  Features2=0x4e33d
  AMD Features=0x20100800
  AMD Features2=0x1
  Cores per package: 4
usable memory = 8577040384 (8179 MB)
avail memory  = 8281866240 (7898 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
ioapic0 irqs 0-23 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
hptrr: HPT RocketRAID controller driver v1.1 (Jan 26 2008 12:31:44)
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x588-0x58b on acpi0
acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
cpu0: on acpi0
p4tcc0: on cpu0
cpu1: on acpi0
p4tcc1: on cpu1
cpu2: on acpi0
p4tcc2: on cpu2
cpu3: on acpi0
p4tcc3: on cpu3
pcib0: on acpi0
pci0: on pcib0
pcib1: at device 2.0 on pci0
pci16: on pcib1
pcib2: at device 0.0 on pci16
pci17: on pcib2
pcib3: at device 0.0 on pci17
pci19: on pcib3
pcib4: at device 1.0 on pci17
pci18: on pcib4
pcib5: at device 0.3 on pci16
pci20: on pcib5
pcib6: at device 3.0 on pci0
pci35: on pcib6
pcib7: at device 4.0 on pci0
pci7: on pcib7
pcib8: at device 5.0 on pci0
pci34: on pcib8
pcib9: at device 6.0 on pci0
pci3: on pcib9
pcib10: at device 0.0 on pci3
pci4: on pcib10
bce0: mem 0xc8000000-0xc9ffffff irq 18 at device 0.0 on pci4
miibus0: on bce0
brgphy0: PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bce0: Ethernet address: 00:1a:64:94:3c:30
bce0: [ITHREAD]
bce0: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); F/W (0x03040405); Flags( MFW MSI )
pcib11: at device 7.0 on pci0
pci2: on pcib11
aac0: port 0x4000-0x40ff mem 0xcce00000-0xccffffff,0xcafe0000-0xcaffffff irq 17 at device 0.0 on pci2
aac0: Enabling 64-bit address support
aac0: New comm. interface enabled
aac0: [ITHREAD]
aac0: ServeRAID 8k-l , aac driver 2.0.0-1
pci0: at device 8.0 (no driver attached)
pcib12: irq 16 at device 28.0 on pci0
pci5: on pcib12
pcib13: at device 0.0 on pci5
pci6: on pcib13
bce1: mem 0xce000000-0xcfffffff irq 16 at device 0.0 on pci6
miibus1: on bce1
brgphy1: PHY 1 on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bce1: Ethernet address: 00:1a:64:94:3c:32
bce1: [ITHREAD]
bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); F/W (0x03040405); Flags( MFW MSI )
uhci0: port 0x2200-0x221f irq 23 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0x2600-0x261f irq 22 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0x2a00-0x2a1f irq 23 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: mem 0xf9000000-0xf90003ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: on ehci0
usb3: USB revision 2.0
uhub3: on usb3
uhub3: 6 ports with 6 removable, self powered
pcib14: at device 30.0 on pci0
pci1: on pcib14
vgapci0: port 0x3000-0x30ff mem 0xd0000000-0xd7ffffff,0xdfff0000-0xdfffffff irq 22 at device 1.0 on pci1
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x480-0x48f at device 31.1 on pci0
ata0: on atapci0
ata0: [ITHREAD]
ata1: on atapci0
ata1: [ITHREAD]
pci0: at device 31.3 (no driver attached)
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
orm0: at iomem 0xc0000-0xcafff,0xcb000-0xcffff on isa0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
ppc0: cannot reserve I/O port range
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ukbd0: on uhub1
kbd2 at ukbd0
uhid0: on uhub1
ukbd1: on uhub2
kbd3 at ukbd1
uhid1: on uhub2
uhid2: on uhub2
Timecounters tick every 1.000 msec
ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding disabled, default to accept, logging limited to 4096 packets/entry by default
hptrr: no controller detected.
acd0: CDRW at ata0-master UDMA33
aacd0: on aac0
aacd0: 139890MB (286494720 sectors)
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
Trying to mount root from ufs:/dev/aacd0s1a
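
For anyone collecting the same evidence on other machines, the top listing earlier in this message can be summarized mechanically. This is a minimal sketch that assumes the twelve-column SMP layout shown in that listing (PID through COMMAND); `parse_top_line` and `idle_sleepers` are hypothetical helper names, and the sample rows are copied from the listing.

```python
TOP_FIELDS = ("pid", "username", "thr", "pri", "nice", "size", "res",
              "state", "cpu", "time", "wcpu", "command")


def parse_top_line(line):
    """Split one top(1) process row into a dict of column name -> string."""
    parts = line.strip().split(None, len(TOP_FIELDS) - 1)
    if len(parts) != len(TOP_FIELDS):
        raise ValueError(f"unexpected top line: {line!r}")
    return dict(zip(TOP_FIELDS, parts))


def idle_sleepers(lines, command="dump"):
    """Return (pid, state) for rows of `command` that show 0.00% WCPU."""
    result = []
    for line in lines:
        proc = parse_top_line(line)
        wcpu = float(proc["wcpu"].rstrip("%"))
        if proc["command"] == command and wcpu == 0.0:
            result.append((int(proc["pid"]), proc["state"]))
    return result


# Rows copied from the top listing in this message.
SAMPLE = """\
926 root 1 4 0 75476K 71744K sbwait 0 0:04 0.00% dump
928 root 1 20 0 75348K 67740K pause 1 0:02 0.00% dump
929 root 1 20 0 75348K 67740K pause 1 0:02 0.00% dump
927 root 1 20 0 75348K 67740K pause 1 0:02 0.00% dump
919 root 1 8 0 75348K 67144K wait 0 0:00 0.00% dump
""".splitlines()

for pid, state in idle_sleepers(SAMPLE):
    print(pid, state)
```

Every dump process in the sample is asleep (sbwait, pause, or wait) with no CPU time accruing, which is the pattern described above when dump stops.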