From: Martin Schmidt
To: freebsd-stable@freebsd.org
Date: Fri, 24 Apr 2009 20:39:40 +0200
Subject: RE: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...

Hi Marc and list,

I had similar issues with FreeBSD 7.2-PRERELEASE. The server (ZFS, NFS) seems to hang at intervals of about 8 hours. The kernel is still there, but no connections can be made to NFS or SSH, and login on the local console doesn't seem to work because it is incredibly slow. Breaking to the debugger takes a moment but works. (Compiling the kernel with WITNESS didn't help.) The server had been solid before with a 7-STABLE kernel from around 19 October 2008.

I have now added these lines to /boot/loader.conf:

    hw.pci.enable_msi=0
    hw.pci.enable_msix=0

to disable Message Signaled Interrupts, which are used by the 3ware twa driver and the igb network driver on our server. With this the server ran for 3 days with no hangs. I then enabled MSI again and had a hang within 24 hours. I disabled it again, and the server has now been online without an issue for 6 days.

I'm not 100% sure yet whether this really is the sole source of the problems (e.g. workload might be another factor), but I guess it's worth a try to check if it might help you too.

If this is a known problem, or there are any other hints to solve it, or if the server configuration just seems wrong, I would appreciate the feedback.

regards,
Martin

pciconf (with msi):

hostb0@pci0:0:0:0: class=0x060000 card=0xa28015d9 chip=0x40038086 rev=0x20 hdr=0x00
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
pcib1@pci0:0:1:0: class=0x060400 card=0xa28015d9 chip=0x40218086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
pcib2@pci0:0:3:0: class=0x060400 card=0xa28015d9 chip=0x40238086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
pcib3@pci0:0:5:0: class=0x060400 card=0xa28015d9 chip=0x40258086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
pcib4@pci0:0:7:0: class=0x060400 card=0xa28015d9 chip=0x40278086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
pcib8@pci0:0:9:0: class=0x060400 card=0xa28015d9 chip=0x40298086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
none0@pci0:0:15:0: class=0x088000 card=0xa28015d9 chip=0x402f8086 rev=0x20 hdr=0x00
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 11[58] = MSI-X supports 4 messages in map 0x10
    cap 10[6c] = PCI-Express 2 type 0
hostb1@pci0:0:16:0: class=0x060000 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hostb2@pci0:0:16:1: class=0x060000 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hostb3@pci0:0:16:2: class=0x060000 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hostb4@pci0:0:16:3: class=0x060000 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hostb5@pci0:0:16:4: class=0x060000 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hostb6@pci0:0:17:0: class=0x060000 card=0xa28015d9 chip=0x40318086 rev=0x20 hdr=0x00
hostb7@pci0:0:21:0: class=0x060000 card=0xa28015d9 chip=0x40358086 rev=0x20 hdr=0x00
hostb8@pci0:0:21:1: class=0x060000 card=0xa28015d9 chip=0x40358086 rev=0x20 hdr=0x00
hostb9@pci0:0:22:0: class=0x060000 card=0xa28015d9 chip=0x40368086 rev=0x20 hdr=0x00
hostb10@pci0:0:22:1: class=0x060000 card=0xa28015d9 chip=0x40368086 rev=0x20 hdr=0x00
pcib9@pci0:0:28:0: class=0x060400 card=0xa28015d9 chip=0x26908086 rev=0x09 hdr=0x01
    cap 10[40] = PCI-Express 1 root port
    cap 05[80] = MSI supports 1 message
    cap 0d[90] = PCI Bridge card=0xa28015d9
    cap 01[a0] = powerspec 2 supports D0 D3 current D0
uhci0@pci0:0:29:0: class=0x0c0300 card=0xa28015d9 chip=0x26888086 rev=0x09 hdr=0x00
uhci1@pci0:0:29:1: class=0x0c0300 card=0xa28015d9 chip=0x26898086 rev=0x09 hdr=0x00
uhci2@pci0:0:29:2: class=0x0c0300 card=0xa28015d9 chip=0x268a8086 rev=0x09 hdr=0x00
ehci0@pci0:0:29:7: class=0x0c0320 card=0xa28015d9 chip=0x268c8086 rev=0x09 hdr=0x00
    cap 01[50] = powerspec 2 supports D0 D3 current D0
    cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
pcib10@pci0:0:30:0: class=0x060401 card=0xa28015d9 chip=0x244e8086 rev=0xd9 hdr=0x01
    cap 0d[50] = PCI Bridge card=0xa28015d9
isab0@pci0:0:31:0: class=0x060100 card=0xa28015d9 chip=0x26708086 rev=0x09 hdr=0x00
atapci0@pci0:0:31:1: class=0x01018a card=0xa28015d9 chip=0x269e8086 rev=0x09 hdr=0x00
atapci1@pci0:0:31:2: class=0x010601 card=0xa28015d9 chip=0x26818086 rev=0x09 hdr=0x00
    cap 01[70] = powerspec 2 supports D0 D3 current D0
    cap 12[a8] = unknown
none1@pci0:0:31:3: class=0x0c0500 card=0xa28015d9 chip=0x269b8086 rev=0x09 hdr=0x00
twa0@pci0:1:0:0: class=0x010400 card=0x100413c1 chip=0x100413c1 rev=0x01 hdr=0x00
    cap 01[40] = powerspec 2 supports D0 D1 D2 D3 current D0
    cap 05[50] = MSI supports 32 messages, 64 bit
    cap 10[70] = PCI-Express 1 legacy endpoint
pcib5@pci0:4:0:0: class=0x060400 card=0xa28015d9 chip=0x35008086 rev=0x01 hdr=0x01
    cap 10[44] = PCI-Express 1 upstream port
    cap 01[70] = powerspec 2 supports D0 D3 current D0
    cap 0d[80] = PCI Bridge card=0xa28015d9
pcib7@pci0:4:0:3: class=0x060400 card=0xa28015d9 chip=0x350c8086 rev=0x01 hdr=0x01
    cap 10[44] = PCI-Express 1 PCI bridge
    cap 01[6c] = powerspec 2 supports D0 D3 current D0
    cap 0d[80] = PCI Bridge card=0xa28015d9
    cap 07[d8] = PCI-X bridge supports
pcib6@pci0:5:0:0: class=0x060400 card=0xa28015d9 chip=0x35108086 rev=0x01 hdr=0x01
    cap 10[44] = PCI-Express 1 downstream port
    cap 05[60] = MSI supports 1 message, 64 bit
    cap 01[70] = powerspec 2 supports D0 D3 current D0
    cap 0d[80] = PCI Bridge card=0xa28015d9
twa1@pci0:6:0:0: class=0x010400 card=0x100413c1 chip=0x100413c1 rev=0x01 hdr=0x00
    cap 01[40] = powerspec 2 supports D0 D1 D2 D3 current D0
    cap 05[50] = MSI supports 32 messages, 64 bit
    cap 10[70] = PCI-Express 1 legacy endpoint
igb0@pci0:8:0:0: class=0x020000 card=0x10a715d9 chip=0x10a78086 rev=0x02 hdr=0x00
    cap 01[40] = powerspec 2 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit
    cap 11[60] = MSI-X supports 10 messages in map 0x1c enabled
    cap 10[a0] = PCI-Express 2 endpoint
igb1@pci0:8:0:1: class=0x020000 card=0x10a715d9 chip=0x10a78086 rev=0x02 hdr=0x00
    cap 01[40] = powerspec 2 supports D0 D3 current D0
    cap 05[50] = MSI supports 1 message, 64 bit
    cap 11[60] = MSI-X supports 10 messages in map 0x1c enabled
    cap 10[a0] = PCI-Express 2 endpoint
vgapci0@pci0:10:1:0: class=0x030000 card=0xa28015d9 chip=0x515e1002 rev=0x02 hdr=0x00
    cap 01[50] = powerspec 2 supports D0 D1 D2 D3 current D0

vmstat -i (with msi):

interrupt                total    rate
irq1: atkbd0                 2       0
irq14: ata0                216       0
irq17: atapci1          172855     200
irq23: ehci0                12       0
irq48: twa0               1472       1
irq54: twa1               1895       2
cpu0: timer            1722548    1998
irq256: igb0               772       0
irq257: igb0              2673       3
irq258: igb0               485       0
irq259: igb0              2121       2
irq260: igb0              1319       1
irq261: igb0                 2       0
cpu1: timer            1714417    1988
cpu2: timer            1713997    1988
cpu3: timer            1714220    1988
Total                  7049006    8177

vmstat -i (without msi):

interrupt                total    rate
irq1: atkbd0                 2       0
irq14: ata0                216       0
irq17: atapci1          210359     536
irq23: ehci0                11       0
irq48: twa0               1331       3
irq54: twa1               1751       4
irq56: igb0               3733       9
cpu0: timer             783575    1998
cpu1: timer             775435    1978
cpu2: timer             775251    1977
cpu3: timer             775364    1977
Total                  3327028    8487

dmesg (without msi):

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.2-PRERELEASE #6: Mon Apr 13 13:30:07 CEST 2009
    adm...@space.neurobiopsychologie.Uni-Osnabrueck.DE:/usr/obj/usr/src/sys/SPACE
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5410 @ 2.33GHz (2327.51-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x10676  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xce3bd
  AMD Features=0x20100800
  AMD Features2=0x1
  Cores per package: 4
usable memory = 4280475648 (4082 MB)
avail memory = 4107509760 (3917 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
ioapic2 irqs 48-71 on motherboard
kbd1 at kbdmux0
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: irq 48 at device 1.0 on pci0
pci1: on pcib1
3ware device driver for 9000 series storage controllers, version: 3.70.05.001
twa0: <3ware 9000 series Storage Controller> port 0x2000-0x20ff mem 0xd8000000-0xd9ffffff,0xdc100000-0xdc100fff irq 48 at device 0.0 on pci1
twa0: [ITHREAD]
twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=3
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9650SE-8LPML, 8 ports, Firmware FE9X 4.06.00.004, BIOS BE9X 4.05.00.015
pcib2: irq 50 at device 3.0 on pci0
pci2: on pcib2
pcib3: irq 52 at device 5.0 on pci0
pci3: on pcib3
pcib4: irq 54 at device 7.0 on pci0
pci4: on pcib4
pcib5: irq 54 at device 0.0 on pci4
pci5: on pcib5
pcib6: irq 54 at device 0.0 on pci5
pci6: on pcib6
twa1: <3ware 9000 series Storage Controller> port 0x3000-0x30ff mem 0xda000000-0xdbffffff,0xdc400000-0xdc400fff irq 54 at device 0.0 on pci6
twa1: [ITHREAD]
twa1: INFO: (0x04: 0x0001): Controller reset occurred: resets=3
twa1: INFO: (0x15: 0x1300): Controller details:: Model 9650SE-8LPML, 8 ports, Firmware FE9X 4.06.00.004, BIOS BE9X 4.05.00.015
pcib7: at device 0.3 on pci4
pci7: on pcib7
pcib8: irq 56 at device 9.0 on pci0
pci8: on pcib8
igb0: port 0x4000-0x401f mem 0xdc020000-0xdc03ffff,0xdc000000-0xdc01ffff,0xdc080000-0xdc083fff irq 56 at device 0.0 on pci8
igb0: [FILTER]
igb0: Ethernet address: 00:30:48:c2:35:76
igb1: port 0x4020-0x403f mem 0xdc060000-0xdc07ffff,0xdc040000-0xdc05ffff,0xdc084000-0xdc087fff irq 70 at device 0.1 on pci8
igb1: [FILTER]
igb1: Ethernet address: 00:30:48:c2:35:77
pci0: at device 15.0 (no driver attached)
pcib9: irq 16 at device 28.0 on pci0
pci9: on pcib9
uhci0: port 0x1800-0x181f irq 20 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0x1820-0x183f irq 21 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0x1840-0x185f irq 22 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: on usb2
uhub2: 2 ports with 2 removable, self powered
ehci0: mem 0xdc704000-0xdc7043ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: on ehci0
usb3: USB revision 2.0
uhub3: on usb3
uhub3: 6 ports with 6 removable, self powered
ums0: on uhub3
ums0: 3 buttons and Z dir.
ukbd0: on uhub3
kbd2 at ukbd0
pcib10: at device 30.0 on pci0
pci10: on pcib10
vgapci0: port 0x5000-0x50ff mem 0xd0000000-0xd7ffffff,0xdc200000-0xdc20ffff irq 18 at device 1.0 on pci10
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1860-0x186f at device 31.1 on pci0
ata0: on atapci0
ata0: [ITHREAD]
atapci1: port 0x18b0-0x18b7,0x18a8-0x18ab,0x18a0-0x18a7,0x1874-0x1877,0x1880-0x189f mem 0xdc704400-0xdc7047ff irq 17 at device 31.2 on pci0
atapci1: [ITHREAD]
atapci1: AHCI Version 01.10 controller with 6 ports detected
ata2: on atapci1
ata2: [ITHREAD]
ata3: on atapci1
ata3: [ITHREAD]
ata4: on atapci1
ata4: [ITHREAD]
ata5: on atapci1
ata5: [ITHREAD]
ata6: on atapci1
ata6: [ITHREAD]
ata7: on atapci1
ata7: [ITHREAD]
pci0: at device 31.3 (no driver attached)
acpi_button0: on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse, device ID 3
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
sio1: [FILTER]
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
cpu0: on acpi0
ACPI Error (psargs-0459): [\\_SB_.BCMD] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (psparse-0626): Method parse/execution failed [\\_PR_.CPU0._OSC] (Node 0xffffff0001608c20), AE_NOT_FOUND
ACPI Error (psparse-0626): Method parse/execution failed [\\_PR_.CPU0._PDC] (Node 0xffffff0001608c40), AE_NOT_FOUND
ACPI Error (psargs-0459): [\\_SB_.BCMD] Namespace lookup failure, AE_NOT_FOUND
ACPI Error (psparse-0626): Method parse/execution failed [\\_PR_.CPU0._OSC] (Node 0xffffff0001608c20), AE_NOT_FOUND
coretemp0: on cpu0
est0: on cpu0
p4tcc0: on cpu0
cpu1: on acpi0
coretemp1: on cpu1
est1: on cpu1
p4tcc1: on cpu1
cpu2: on acpi0
coretemp2: on cpu2
est2: on cpu2
p4tcc2: on cpu2
cpu3: on acpi0
coretemp3: on cpu3
est3: on cpu3
p4tcc3: on cpu3
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
ipmi0: on isa0
ipmi0: KCS mode found at io 0xca2 alignment 0x1 on isa
orm0: at iomem 0xc0000-0xcafff,0xcb000-0xcd7ff,0xcd800-0xcf7ff,0xcf800-0xcffff on isa0
ppc0: cannot reserve I/O port range
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: DVDROM at ata0-slave UDMA33
ad4: 238475MB at ata2-master SATA150
ad6: 238475MB at ata3-master SATA300
ipmi0: IPMI device rev. 1, firmware rev. 1.2, version 2.0
ipmi0: Number of channels 8
ipmi0: Attached watchdog
da0 at twa0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 100.000MB/s transfers
da0: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da1 at twa0 bus 0 target 1 lun 0
da1: Fixed Direct Access SCSI-5 device
da1: 100.000MB/s transfers
da1: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da2 at twa0 bus 0 target 2 lun 0
da2: Fixed Direct Access SCSI-5 device
da2: 100.000MB/s transfers
da2: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da3 at twa0 bus 0 target 3 lun 0
da3: Fixed Direct Access SCSI-5 device
da3: 100.000MB/s transfers
da3: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da4 at twa0 bus 0 target 4 lun 0
da4: Fixed Direct Access SCSI-5 device
da4: 100.000MB/s transfers
da4: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da5 at twa0 bus 0 target 5 lun 0
da5: Fixed Direct Access SCSI-5 device
da5: 100.000MB/s transfers
da5: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da6 at twa0 bus 0 target 6 lun 0
da6: Fixed Direct Access SCSI-5 device
da6: 100.000MB/s transfers
da6: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da7 at twa0 bus 0 target 7 lun 0
da7: Fixed Direct Access SCSI-5 device
da7: 100.000MB/s transfers
da7: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da8 at twa1 bus 0 target 0 lun 0
da8: Fixed Direct Access SCSI-5 device
da8: 100.000MB/s transfers
da8: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da9 at twa1 bus 0 target 1 lun 0
da9: Fixed Direct Access SCSI-5 device
da9: 100.000MB/s transfers
da9: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da10 at twa1 bus 0 target 2 lun 0
da10: Fixed Direct Access SCSI-5 device
da10: 100.000MB/s transfers
da10: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da11 at twa1 bus 0 target 3 lun 0
da11: Fixed Direct Access SCSI-5 device
da11: 100.000MB/s transfers
da11: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da12 at twa1 bus 0 target 4 lun 0
da12: Fixed Direct Access SCSI-5 device
da12: 100.000MB/s transfers
da12: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da13 at twa1 bus 0 target 5 lun 0
da13: Fixed Direct Access SCSI-5 device
da13: 100.000MB/s transfers
da13: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da14 at twa1 bus 0 target 6 lun 0
da14: Fixed Direct Access SCSI-5 device
da14: 100.000MB/s transfers
da14: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
da15 at twa1 bus 0 target 7 lun 0
da15: Fixed Direct Access SCSI-5 device
da15: 100.000MB/s transfers
da15: 715245MB (1464821760 512 byte sectors: 255H 63S/T 91180C)
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!

On Apr 15, 5:15 am, free...@hub.org ("Marc G. Fournier") wrote:
> Hi ...
> Over the past little while, two of my servers have suddenly started to hang
> ... servers that up until this started, have been reasonably rock solid ...
> they are generally within a day of each other for source code, and the hardware
> on both are pretty much identical (HP Proliant DL360 Servers) ...
> I have serial console configured on both so that I can do CR ~ ^b to get to
> DDB ... except, when it hangs, all I get is:
> "KDB: enter: Break sequence on console"
> And it hangs there, no prompt.
> I setup a simple script (see attached) to run every 5 minutes that gathers
> various pieces of info that I think are pertinent, but most likely don't cover
> everything ...
> Whenever this happens, on either machine, vmstat show data *like* (notice the
> high procs -> w values?):
>
>    procs        memory                   page                disks       faults       cpu
>   r   b  w      avm    fre   flt   re pi  po    fr      sr da0 pa0   in    sy    cs us sy id
> 165 106  2 12699168  33840  3080   38  2   2  3082    1623   0   0  337 36961  4731 18  7 75
>  64  75  4 12761744  23084 46809  623 65  43 19307     116 334   0 1189 83674 11708 70 20 10
>   1  68 25 12773980  23068 11036 3003  9  36  4055     116 282   0 1336 78346 14869 56 16 28
>   0  71 25 12774236  23084   186  769  1   5    18      80 249   0  609  9298  5894  5  5 91
>   5  90 31 12747296  23352   626 2546  5 104  1147     368 281   0 1536 40945 19980  6  5 90
>
> Where procs -> w just seems to keep rising ... note that the output for
> vmstat *5 minutes before* shows:
>
>    procs        memory                   page                disks       faults       cpu
>   r   b  w      avm    fre   flt   re pi  po    fr      sr da0 pa0   in    sy    cs us sy id
>  35 121  0 12414692  90552  3080   32  2   1  3090    1403   0   0  337 37022  4730 18  7 75
>  31  93  0 12314408  62024 36550  414 46   6 34285      27 563   0  916 94851  8813 67 33  0
>  43 179  0 12270932  23080 24035  101 41  12 13887      36 375   0  766 61969  6945 69 23  7
>  92  44  0 12265524 119804  2122 2028  1  32 13051 1096092 205   0  558 19460  4561 19 50 32
>  38  34  0 12330068  89140 30758  103 39 119 37037 2837365 165   0  773 92041  7111 47 53  0
>
> I have one QEMU VPS running on this box, with kqemu running the latest kernel
> module ... but the other machine experiencing the same issue is only running
> FreeBSD jails ...
> Both servers are running SCHED_4BSD, if that matters any ... ?
> I'm at a loss as to what to look at / for next ... pointers would be greatly
> appreciated ...
> I have the various output files that the script generates available if anyone
> thinks they would be useful ...
> thank you ...
> Marc G. Fournier Hub.Org Hosting Solutions S.A. (http://www.hub.org)
> Email . scra...@hub.org MSN . scra...@hub.org
> Yahoo . yscrappy Skype: hub.org ICQ . 7615664
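
For anyone repeating the MSI on/off comparison from the message, one quick check is to scan `vmstat -i` output for devices sitting on high IRQ numbers. The sketch below is only an illustration under stated assumptions: the function name `msi_devices` and the threshold `MSI_IRQ_BASE = 256` are hypothetical, chosen because igb0 appears as irq256 to irq261 in the listing taken with MSI enabled and as irq56 in the one taken without it. The two sample strings are excerpts of those listings, not new data.

```python
import re

# Assumption: MSI/MSI-X vectors show up as irq256 and above in `vmstat -i`,
# which matches the igb0 entries in the listings from the message.
MSI_IRQ_BASE = 256

# Matches rows such as "irq256: igb0   772   0"; the header line,
# the "cpuN: timer" rows and the "Total" row do not match and are skipped.
LINE_RE = re.compile(r"^irq(\d+):\s+(\S+)\s+(\d+)\s+(\d+)\s*$")


def msi_devices(vmstat_output):
    """Return {device: [irq, ...]} for rows at or above MSI_IRQ_BASE."""
    found = {}
    for line in vmstat_output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        irq, device = int(match.group(1)), match.group(2)
        if irq >= MSI_IRQ_BASE:
            found.setdefault(device, []).append(irq)
    return found


# Excerpt of the "with msi" listing.
WITH_MSI = """\
interrupt                total    rate
irq48: twa0               1472       1
irq54: twa1               1895       2
cpu0: timer            1722548    1998
irq256: igb0               772       0
irq257: igb0              2673       3
"""

# Excerpt of the "without msi" listing.
WITHOUT_MSI = """\
interrupt                total    rate
irq48: twa0               1331       3
irq54: twa1               1751       4
irq56: igb0               3733       9
cpu0: timer             783575    1998
"""

if __name__ == "__main__":
    print(msi_devices(WITH_MSI))
    print(msi_devices(WITHOUT_MSI))
```

On these excerpts the function reports igb0 on irq256 and irq257 for the first listing and nothing for the second. Running it against the live output of `vmstat -i` after a reboot would be one way to confirm that the loader.conf tunables took effect.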