From: Laurence Sanford
To: stable@freebsd.org
Date: Fri, 20 Oct 2006 18:05:34 -0500
Subject: Hard lock on 6.1-STABLE
Message-ID: <4539563E.7010803@wilderness.homeip.net>
List-Id: Production branch of FreeBSD source code

It's taken me a while to narrow down what this is, but today I finally narrowed it all the way down. High network load on the system causes it to hard lock; nothing but pulling the plug will get any response. The network interface is nve on an Asus A8N-SLI.

The magic bullet appears to be:

- BitTorrent downloading/seeding at least two torrents. It doesn't matter what client you're using; I've done this using both Azureus and KTorrent.
- FTP'ing something (either direction) from the box. I've gone so far as to throttle the FTP client to 300K/s, and it will still do it.

Things worth noting:

- I've narrowed this down by doing stupid things to try to make it crash, such as building world plus 3 or 4 other large things at once, moving large files between disks, etc. Many things have triggered this (NFS activity, etc.), but the only common thread I found was network activity, since it has done this with and without NFS running while doing a multitude of tasks. (I wanted to eliminate NFS since it seems to be a bit unstable at the moment.)
- The network cable connecting this system to the switch is perfect. The switch rarely shows any collisions unless network load is high on this box; then the collision light will come on nearly constantly.

dmesg:

root@colossus(~)# dmesg
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-STABLE #6: Sat Sep 2 04:56:20 CDT 2006
lauasanf@colossus.cotharyus.net:/usr/obj/usr/src/sys/Colossus
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2010.31-MHz 686-class CPU)
Origin = "AuthenticAMD" Id = 0x20fb1 Stepping = 1
Features=0x178bfbff
Features2=0x1
AMD Features=0xe2500800
AMD Features2=0x3
Cores per package: 2
real memory = 1073676288 (1023 MB)
avail memory = 1037369344 (989 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
ioapic0: Changing APIC ID to 2
ioapic0 irqs 0-23 on motherboard
acpi0: on motherboard
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi0: Power Button (fixed)
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
cpu0: on acpi0
cpu1: on acpi0
acpi_button0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci0: at device 0.0 (no driver attached)
isab0: at device 1.0 on pci0
isa0: on isab0
pci0: at device 1.1 (no driver attached)
ohci0: mem 0xdb102000-0xdb102fff irq 21 at device 2.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: on ohci0
usb0: USB revision 1.0
uhub0: nVidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 10 ports with 10 removable, self powered
ehci0: mem 0xfeb00000-0xfeb000ff irq 22 at device 2.1 on pci0
ehci0: [GIANT-LOCKED]
usb1: EHCI version 1.0
usb1: companion controller, 4 ports each: usb0
usb1: on ehci0
usb1: USB revision 2.0
uhub1: nVidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 10 ports with 10 removable, self powered
pcm0: port 0xd400-0xd4ff,0xd800-0xd8ff mem 0xdb101000-0xdb101fff irq 23 at device 4.0 on pci0
pcm0:
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 6.0 on pci0
ata0: on atapci0
ata1: on atapci0
pcib1: at device 9.0 on pci0
pci5: on pcib1
fwohci0: mem 0xdb004000-0xdb0047ff,0xdb000000-0xdb003fff irq 16 at device 11.0 on pci5
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:11:d8:00:00:86:18:47
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: on fwohci0
sbp0: on firewire0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
nve0: port 0xd000-0xd007 mem 0xdb100000-0xdb100fff irq 21 at device 10.0 on pci0
nve0: Ethernet address 00:15:f2:7f:80:86
miibus0: on nve0
ukphy0: on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
nve0: Ethernet address: 00:15:f2:7f:80:86
pcib2: at device 11.0 on pci0
pci4: on pcib2
pcib3: at device 12.0 on pci0
pci3: on pcib3
pcib4: at device 13.0 on pci0
pci2: on pcib4
pcib5: at device 14.0 on pci0
pci1: on pcib5
nvidia0: mem 0xd8000000-0xd8ffffff,0xd0000000-0xd7ffffff,0xd9000000-0xd9ffffff irq 18 at device 0.0 on pci1
nvidia0: [GIANT-LOCKED]
acpi_tz0: on acpi0
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ppc0: port 0x378-0x37f,0x778-0x77b irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
pmtimer0 on isa0
orm0: at iomem 0xc0000-0xcefff,0xd0000-0xd3fff on isa0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
uhid0: Logitech Inc. WingMan Force 3D, rev 1.00/1.06, addr 2, iclass 3/0
ukbd0: Microsoft Natural\M-. Ergonomic Keyboard 4000, rev 2.00/1.73, addr 3, iclass 3/1
kbd1 at ukbd0
uhid1: Microsoft Natural\M-. Ergonomic Keyboard 4000, rev 2.00/1.73, addr 3, iclass 3/1
umass0: Generic USB Storage Device, rev 1.10/1.00, addr 4
ums0: Microsoft Microsoft Trackball Optical\M-., rev 1.10/1.21, addr 5, iclass 3/1
ums0: 5 buttons and Z dir.
uhid2: Jess Tech GGE909 PC Recoil Pad, rev 1.10/1.01, addr 6, iclass 3/0
Timecounters tick every 1.000 msec
ad0: 43979MB at ata0-master UDMA100
ad1: 190782MB at ata0-slave UDMA100
ad2: 38166MB at ata1-master UDMA100
acd0: DVDR at ata1-slave UDMA33
SMP: AP CPU #1 Launched!
cd0 at ata1 bus 0 target 1 lun 0
cd0: Removable CD-ROM SCSI-0 device
cd0: 33.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
da0 at umass-sim0 bus 0 target 0 lun 0
da0: Removable Direct Access SCSI-0 device
da0: 1.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
da1 at umass-sim0 bus 0 target 0 lun 1
da1: Removable Direct Access SCSI-0 device
da1: 1.000MB/s transfers
da1: Attempt to query device size failed: NOT READY, Medium not present
da2 at umass-sim0 bus 0 target 0 lun 2
da2: Removable Direct Access SCSI-0 device
da2: 1.000MB/s transfers
da2: 489MB (1002497 512 byte sectors: 64H 32S/T 489C)
da3 at umass-sim0 bus 0 target 0 lun 3
da3: Removable Direct Access SCSI-0 device
da3: 1.000MB/s transfers
da3: Attempt to query device size failed: NOT READY, Medium not present
(da0:umass-sim0:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
Opened disk da0 -> 6
(da0:umass-sim0:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
Opened disk da0 -> 6
(da0:umass-sim0:0:0:0): READ CAPACITY. CDB: 25 0 0 0 0 0 0 0 0 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
Opened disk da0 -> 6
(da1:umass-sim0:0:0:1): READ CAPACITY. CDB: 25 20 0 0 0 0 0 0 0 0
(da1:umass-sim0:0:0:1): CAM Status: SCSI Status Error
(da1:umass-sim0:0:0:1): SCSI Status: Check Condition
(da1:umass-sim0:0:0:1): NOT READY asc:3a,0
(da1:umass-sim0:0:0:1): Medium not present
(da1:umass-sim0:0:0:1): Unretryable error
Opened disk da1 -> 6
(da1:umass-sim0:0:0:1): READ CAPACITY. CDB: 25 20 0 0 0 0 0 0 0 0
(da1:umass-sim0:0:0:1): CAM Status: SCSI Status Error
(da1:umass-sim0:0:0:1): SCSI Status: Check Condition
(da1:umass-sim0:0:0:1): NOT READY asc:3a,0
(da1:umass-sim0:0:0:1): Medium not present
(da1:umass-sim0:0:0:1): Unretryable error
Opened disk da1 -> 6
(da1:umass-sim0:0:0:1): READ CAPACITY. CDB: 25 20 0 0 0 0 0 0 0 0
(da1:umass-sim0:0:0:1): CAM Status: SCSI Status Error
(da1:umass-sim0:0:0:1): SCSI Status: Check Condition
(da1:umass-sim0:0:0:1): NOT READY asc:3a,0
(da1:umass-sim0:0:0:1): Medium not present
(da1:umass-sim0:0:0:1): Unretryable error
Opened disk da1 -> 6
(da3:umass-sim0:0:0:3): READ CAPACITY. CDB: 25 60 0 0 0 0 0 0 0 0
(da3:umass-sim0:0:0:3): CAM Status: SCSI Status Error
(da3:umass-sim0:0:0:3): SCSI Status: Check Condition
(da3:umass-sim0:0:0:3): NOT READY asc:3a,0
(da3:umass-sim0:0:0:3): Medium not present
(da3:umass-sim0:0:0:3): Unretryable error
Opened disk da3 -> 6
(da3:umass-sim0:0:0:3): READ CAPACITY. CDB: 25 60 0 0 0 0 0 0 0 0
(da3:umass-sim0:0:0:3): CAM Status: SCSI Status Error
(da3:umass-sim0:0:0:3): SCSI Status: Check Condition
(da3:umass-sim0:0:0:3): NOT READY asc:3a,0
(da3:umass-sim0:0:0:3): Medium not present
(da3:umass-sim0:0:0:3): Unretryable error
Opened disk da3 -> 6
(da3:umass-sim0:0:0:3): READ CAPACITY. CDB: 25 60 0 0 0 0 0 0 0 0
(da3:umass-sim0:0:0:3): CAM Status: SCSI Status Error
(da3:umass-sim0:0:0:3): SCSI Status: Check Condition
(da3:umass-sim0:0:0:3): NOT READY asc:3a,0
(da3:umass-sim0:0:0:3): Medium not present
(da3:umass-sim0:0:0:3): Unretryable error
Opened disk da3 -> 6
Trying to mount root from ufs:/dev/ad0s1a

I'll answer any questions I can about this, or provide further information if needed. I'm interested in seeing if anyone else can confirm or reproduce this, or if it's a known problem.
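
For anyone who wants to try reproducing this without setting up torrents and an FTP server, the load pattern described in this report (a couple of unthrottled streams plus one bulk transfer capped at about 300K/s) can be roughly approximated with a small script. This is only a sketch under stated assumptions, not what was actually run on the affected box: the names start_sink, send_stream and run_load are made up for illustration, it uses plain TCP streams, and real BitTorrent traffic (many peers, many short-lived connections) may stress the nve driver quite differently.

```python
import socket
import threading
import time

CHUNK = 16 * 1024  # bytes handed to each send call


def start_sink(host="127.0.0.1", port=0):
    """Listen on host:port and discard everything received.

    Returns (listening_socket, bound_port); port 0 lets the OS pick one.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(16)

    def drain(conn):
        # Read until the peer closes the connection.
        with conn:
            while conn.recv(CHUNK):
                pass

    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:  # listening socket was closed
                return
            threading.Thread(target=drain, args=(conn,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv, srv.getsockname()[1]


def send_stream(host, port, total_bytes, rate=None):
    """Send total_bytes of zeros; if rate is given, stay near rate bytes/s."""
    payload = bytes(CHUNK)
    sent = 0
    start = time.monotonic()
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
            if rate:
                # Sleep whenever we are ahead of the target rate.
                ahead = sent / rate - (time.monotonic() - start)
                if ahead > 0:
                    time.sleep(ahead)
    return sent


def run_load(host, port, streams=2, stream_bytes=1 << 20,
             throttled_bytes=1 << 20, throttled_rate=300 * 1024):
    """Run `streams` unthrottled senders plus one throttled sender in parallel.

    Returns the total number of bytes sent by all senders.
    """
    results = []
    lock = threading.Lock()

    def worker(nbytes, rate):
        n = send_stream(host, port, nbytes, rate)
        with lock:
            results.append(n)

    threads = [threading.Thread(target=worker, args=(stream_bytes, None))
               for _ in range(streams)]
    threads.append(threading.Thread(target=worker,
                                    args=(throttled_bytes, throttled_rate)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

To use it across a real link, one would run start_sink with a routable address on one machine and call run_load against that address and port from the other, with much larger byte counts than the defaults, so that the traffic actually crosses the nve interface rather than loopback.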