From owner-freebsd-bugs@FreeBSD.ORG Thu Dec 6 12:10:02 2007
Message-Id: <200712061201.lB6C1Lgu007821@www.freebsd.org>
Date: Thu, 6 Dec 2007 12:01:21 GMT
From: Andrey Sudakov
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Subject: misc/118459: Freeze under high-load with SMP until keypressed
List-Id: Bug reports

>Number:         118459
>Category:       misc
>Synopsis:       Freeze under high-load with SMP until keypressed
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Dec 06 12:10:02 UTC 2007
>Closed-Date:
>Last-Modified:
>Originator:     Andrey Sudakov
>Release:        FreeBSD 6.2
>Organization:
Kaspersky labs
>Environment:
>Description:
Under very high load (no CPU %idle time, high %interrupt(network) or %system (context switching) time) the server sometimes stops completely: no processes run (I see no log entries) and the network does not respond (no ping replies). When any key is pressed on the console, the server unfreezes and continues to work until it freezes again.
The problem appears only when SMP is enabled in the kernel. Without the SMP option everything works fine.
The problem persists across various hardware configurations (i386, amd64), at least on IBM servers (x3650, x3250) and some Supermicro Pentium D machines.

==============
amd64 IBM x3650:

Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
  The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RELEASE-p8 #4: Fri Nov 16 13:32:35 MSK 2007
  root@www8.doamin.com:/usr/obj/usr/src/sys/WWW8
WARNING: MPSAFE network stack disabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU X5355 @ 2.66GHz (2660.01-MHz K8-class CPU)
  Origin = "GenuineIntel" Id = 0x6f7 Stepping = 7
  Features=0xbfebfbff
  Features2=0x4e3bd,CX16,,,>
  AMD Features=0x20100800
  AMD Features2=0x1
  Cores per package: 4
real memory = 9663676416 (9216 MB)
avail memory = 8287969280 (7904 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
 cpu4 (AP): APIC ID: 4
 cpu5 (AP): APIC ID: 5
 cpu6 (AP): APIC ID: 6
 cpu7 (AP): APIC ID: 7
ioapic0 irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: on motherboard
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi0: Power Button (fixed)
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x588-0x58b on acpi0
cpu0: on acpi0
acpi_throttle0: on cpu0
cpu1: on acpi0
acpi_throttle1: on cpu1
acpi_throttle1: failed to attach P_CNT
device_attach: acpi_throttle1 attach returned 6
cpu2: on acpi0
acpi_throttle2: on cpu2
acpi_throttle2: failed to attach P_CNT
device_attach: acpi_throttle2 attach returned 6
cpu3: on acpi0
acpi_throttle3: on cpu3
acpi_throttle3: failed to attach P_CNT
device_attach: acpi_throttle3 attach returned 6
cpu4: on acpi0
acpi_throttle4: on cpu4
acpi_throttle4: failed to attach P_CNT
device_attach: acpi_throttle4 attach returned 6
cpu5: on acpi0
acpi_throttle5: on cpu5
acpi_throttle5: failed to attach P_CNT
device_attach: acpi_throttle5 attach returned 6
cpu6: on acpi0
acpi_throttle6: on cpu6
acpi_throttle6: failed to attach P_CNT
device_attach: acpi_throttle6 attach returned 6
cpu7: on acpi0
acpi_throttle7: on cpu7
acpi_throttle7: failed to attach P_CNT
device_attach: acpi_throttle7 attach returned 6
pcib0: on acpi0
pci0: on pcib0
pcib1: at device 2.0 on pci0
pci26: on pcib1
pcib2: at device 0.0 on pci26
pci27: on pcib2
pcib3: at device 0.0 on pci27
pci28: on pcib3
pcib4: irq 17 at device 1.0 on pci27
pci36: on pcib4
pcib5: at device 0.3 on pci26
pci37: on pcib5
pcib6: at device 3.0 on pci0
pci4: on pcib6
aac0: port 0x5000-0x50ff mem 0xc9e00000-0xc9ffffff,0xc7fe0000-0xc7ffffff irq 17 at device 0.0 on pci4
aac0: Enabling 64-bit address support
aac0: New comm. interface enabled
aac0: Adaptec Raid Controller 2.0.0-1
pcib7: at device 4.0 on pci0
pci16: on pcib7
pcib8: at device 5.0 on pci0
pci69: on pcib8
pcib9: at device 6.0 on pci0
pci7: on pcib9
pcib10: at device 7.0 on pci0
pci68: on pcib10
pci0: at device 8.0 (no driver attached)
pcib11: at device 28.0 on pci0
pci2: on pcib11
pcib12: at device 0.0 on pci2
pci3: on pcib12
bce0: mem 0xce000000-0xcfffffff irq 16 at device 0.0 on pci3
bce0: ASIC ID 0x57081020; Revision (B2); PCI-X 64-bit 133MHz
miibus0: on bce0
brgphy0: on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bce0: Ethernet address: 00:1a:64:63:a7:71
bce0: [GIANT-LOCKED]
pcib13: at device 28.1 on pci0
pci5: on pcib13
pcib14: at device 0.0 on pci5
pci6: on pcib14
bce1: mem 0xca000000-0xcbffffff irq 17 at device 0.0 on pci6
bce1: ASIC ID 0x57081020; Revision (B2); PCI-X 64-bit 133MHz
miibus1: on bce1
brgphy1: on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bce1: Ethernet address: 00:1a:64:63:a7:73
bce1: [GIANT-LOCKED]
uhci0: port 0x2200-0x221f irq 23 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0x2600-0x261f irq 22 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0x2a00-0x2a1f irq 23 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: port 0x2e00-0x2e1f irq 22 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
usb3: on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0: mem 0xf9000000-0xf90003ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: on ehci0
usb4: USB revision 2.0
uhub4: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
pcib15: at device 30.0 on pci0
pci1: on pcib15
pci1: at device 6.0 (no driver attached)
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x480-0x48f at device 31.2 on pci0
ata0: on atapci0
ata1: on atapci0
pci0: at device 31.3 (no driver attached)
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ipmi0: on isa0
ipmi0: KCS mode found at io 0xca8 alignment 0x4 on isa
orm0: at iomem 0xc0000-0xcafff,0xcb000-0xcffff on isa0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
ppc0: cannot reserve I/O port range
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ukbd0: Cypress Cypress USB Keyboard / PS2 Mouse, rev 1.00/0.01, addr 2, iclass 3/1
kbd2 at ukbd0
ums0: Cypress Cypress USB Keyboard / PS2 Mouse, rev 1.00/0.01, addr 2, iclass 3/1
ums0: 3 buttons and Z dir.
ukbd1: IBM IBM RSA2, rev 1.10/0.01, addr 2, iclass 3/1
kbd3 at ukbd1
ums1: IBM IBM RSA2, rev 1.10/0.01, addr 2, iclass 3/1
ums1: X report 0x0002 not supported
device_attach: ums1 attach returned 6
Timecounters tick every 1.000 msec
acd0: CDRW at ata1-master UDMA33
aacd0: on aac0
aacd0: 858000MB (1757184000 sectors)
ipmi0: IPMI device rev. 0, firmware rev. 1.2, version 2.0
ipmi0: Number of channels 3
ipmi0: Attached watchdog
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #7 Launched!
Trying to mount root from ufs:/dev/aacd0s1a
>How-To-Repeat:
My test server freezes after 8 days of 'stress' with a load average of ~1000.
>Fix:
Do not use SMP under high load.
>Release-Note:
>Audit-Trail:
>Unformatted:
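
The description says no log entries appear while the server is frozen. One way to measure how long and how often such stalls last is sketched below. It assumes some process on the affected machine records a timestamp at a regular interval (for example once per second), so that a full freeze shows up as a gap between consecutive timestamps. The helper name find_gaps, the 5-second threshold, and the sample timestamps are hypothetical and not part of the original report.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap):
    """Return (previous, current) pairs of consecutive timestamps
    that are further apart than max_gap.

    timestamps must be sorted in ascending order.
    """
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > max_gap:
            gaps.append((prev, cur))
    return gaps


if __name__ == "__main__":
    # Synthetic heartbeat data (not taken from the report): one beat per
    # second, with a 44-second hole standing in for a freeze.
    base = datetime(2007, 12, 6, 12, 0, 0)
    beats = [base + timedelta(seconds=s) for s in (0, 1, 2, 3, 47, 48)]

    for start, end in find_gaps(beats, timedelta(seconds=5)):
        length = (end - start).total_seconds()
        print(f"possible freeze: {start} -> {end} ({length:.0f}s)")
```

Comparing the gaps found this way with the times a console key was pressed could help confirm that the stall ends only on keyboard input, as the report describes.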