From owner-freebsd-current@FreeBSD.ORG Mon Nov 17 09:50:59 2003
Date: Mon, 17 Nov 2003 09:50:58 -0800 (PST)
From: Doug White
To: Mike Durian
cc: current@freebsd.org
Subject: Re: hard lock-up writing to tape
In-Reply-To: <200311161028.48407.durian@boogie.com>
Message-ID: <20031117094946.A21453@carver.gumbysoft.com>
References: <200311161028.48407.durian@boogie.com>

On Sun, 16 Nov 2003, Mike Durian wrote:

> I'm using -current cvsup'd as of Nov 15, 2003. When I try to do a
> dump or run the btape (fill command) program from bacula, my machine
> will lock up hard. Doesn't respond to ping. No access to kernel
> debugger. Num lock doesn't come on.

Sounds like a Giant deadlock.

dwhite's Form Letter on Debugging Giant Deadlocks

If you are experiencing problems with CURRENT locking up hard, it may be
due to a deadlock against the Giant mutex, which controls large parts of
the kernel. Symptoms include:

. No response to any input
  . System video console
  . Network (ping)

To debug this, you will need to set up a serial console with some special
kernel options. Instructions for booting with serial console are in the
Handbook, but you will have to compile with the following kernel options:

    options DDB
    options BREAK_TO_DEBUGGER
    options WITNESS
    options INVARIANTS
    options INVARIANTS_SUPPORT

Make sure your serial console is capable of sending a Break signal. If
not, use "ALT_BREAK_TO_DEBUGGER" instead of "BREAK_TO_DEBUGGER".

Enable the serial console and boot the system. Turn on terminal logging.
In loader, stop the boot and type "boot -v" at the OK prompt to get
additional info during the boot process. Once the system is up, trigger
the hang.

When the system hangs, issue the Break signal (or if you have used
ALT_BREAK_TO_DEBUGGER, press Enter ~ ^E b (tilde, Ctrl-E, b)). If you get
the db> prompt, then your hang is probably due to a Giant deadlock. If
not, then something else may be at fault.

Once in db>, run the following two commands and capture their output
using your terminal's logging capability:

    show locks
    tr

Take these and the boot -v output, put them on a webpage, and send a
message to current@freebsd.org carefully explaining what you did to
trigger the hang.

Good luck!

>
> I can perform a dump or run the btape fill program when in single
> user mode, but in multi-user the machine will only stay up for
> a short while before locking.
>
> This has been happening since I got the tape system (Sparcstorage
> Library) about 3-4 weeks ago. I don't know how long the problem
> existed before then as I didn't have a tape system to use.
>
> I've tried two types of SCSI cards: Adaptec 2930 and ASUS PCI-SC200
> (sym(4) device). Both behave the same.
>
> I wonder if it could be network or interrupt related. In single
> user mode, the network interface is not up.
>
> Dmesg from my system follows:
> Copyright (c) 1992-2003 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD 5.1-CURRENT #57: Sat Nov 15 15:50:50 MST 2003
> root@man.boogie.com:/disk2/obj/disk2/src/sys/BOOGIE
> Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a93000.
> Preloaded elf module "/boot/kernel/linux.ko" at 0xc0a931f4.
> Preloaded elf module "/boot/kernel/snd_pcm.ko" at 0xc0a932a0.
> Preloaded elf module "/boot/kernel/snd_via82c686.ko" at 0xc0a9334c.
> Preloaded elf module "/boot/kernel/sym.ko" at 0xc0a93400.
> Preloaded elf module "/boot/kernel/nvidia.ko" at 0xc0a934a8.
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: AMD Athlon(tm) processor (1002.28-MHz 686-class CPU)
> Origin = "AuthenticAMD" Id = 0x642 Stepping = 2
> Features=0x183f9ff
> AMD Features=0xc0440000
> real memory = 1073676288 (1023 MB)
> avail memory = 1033502720 (985 MB)
> Pentium Pro MTRR support enabled
> npx0: [FAST]
> npx0: on motherboard
> npx0: INT 16 interface
> acpi0: on motherboard
> pcibios: BIOS version 2.10
> Using $PIR table, 8 entries at 0xc00fde30
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
> acpi_cpu0: on acpi0
> acpi_button0: on acpi0
> pcib0: port
> 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
> pci0: on pcib0
> pcib0: slot 7 INTD is routed to irq 11
> pcib0: slot 7 INTD is routed to irq 11
> pcib0: slot 7 INTC is routed to irq 10
> pcib0: slot 9 INTA is routed to irq 9
> pcib0: slot 9 INTA is routed to irq 9
> pcib0: slot 9 INTA is routed to irq 9
> pcib0: slot 9 INTA is routed to irq 9
> pcib0: slot 10 INTA is routed to irq 10
> pcib0: slot 11 INTA is routed to irq 11
> pcib0: slot 12 INTA is routed to irq 10
> pcib0: slot 13 INTA is routed to irq 11
> agp0: mem
> 0xd0000000-0xd7ffffff at device 0.0 on pci0
> pcib1: at device 1.0 on pci0
> pci1: on pcib1
> pcib0: slot 1 INTA is routed to irq 5
> pcib1: slot 0 INTA is routed to irq 5
> nvidia0: mem
> 0xd8000000-0xdfffffff,0xe0000000-0xe0ffffff irq 5 at device 0.0 on pci1
> isab0: at device 7.0 on pci0
> isa0: on isab0
> atapci0: port 0xa000-0xa00f at device 7.1 on
> pci0
> atapci0: Correcting VIA config for southbridge data corruption bug
> ata0: at 0x1f0 irq 14 on atapci0
> ata0: [MPSAFE]
> ata1: at 0x170 irq 15 on atapci0
> ata1: [MPSAFE]
> uhci0: port 0xa400-0xa41f irq 11 at device 7.2 on
> pci0
> usb0: on uhci0
> usb0: USB revision 1.0
> uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> uhci1: port 0xa800-0xa81f irq 11 at device 7.3 on
> pci0
> usb1: on uhci1
> usb1: USB revision 1.0
> uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub1: 2 ports with 2 removable, self powered
> viapropm0: SMBus I/O base at 0x5000
> viapropm0: port 0x5000-0x500f at device
> 7.4 on pci0
> viapropm0: SMBus revision code 0x40
> smbus0: on viapropm0
> smb0: on smbus0
> pcm0: port 0xb400-0xb403,0xb000-0xb003,0xac00-0xacff irq 10 at
> device 7.5 on pci0
> pcm0:
> ohci0: mem 0xe3006000-0xe3006fff irq 9 at
> device 9.0 on pci0
> usb2: OHCI version 1.0, legacy support
> usb2: on ohci0
> usb2: USB revision 1.0
> uhub2: (0x11c1) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub2: 1 port with 1 removable, self powered
> ohci1: mem 0xe3007000-0xe3007fff irq 9 at
> device 9.1 on pci0
> usb3: OHCI version 1.0, legacy support
> usb3: on ohci1
> usb3: USB revision 1.0
> uhub3: (0x11c1) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub3: 1 port with 1 removable, self powered
> ohci2: mem 0xe3004000-0xe3004fff irq 9 at
> device 9.2 on pci0
> usb4: OHCI version 1.0, legacy support
> usb4: on ohci2
> usb4: USB revision 1.0
> uhub4: (0x11c1) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub4: 1 port with 1 removable, self powered
> ohci3: mem 0xe3005000-0xe3005fff irq 9 at
> device 9.3 on pci0
> usb5: OHCI version 1.0, legacy support
> usb5: on ohci3
> usb5: USB revision 1.0
> uhub5: (0x11c1) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub5: 1 port with 1 removable, self powered
> puc0: port
> 0xcc00-0xcc0f,0xc800-0xc807,0xc400-0xc407,0xc000-0xc007,0xbc00-0xbc07,0xb800-0xb807
> irq 10 at device 10.0 on pci0
> sio4: on puc0
> sio4: type 16550A
> sio4: unable to activate interrupt in fast mode - using normal mode
> sio5: on puc0
> sio5: type 16550A
> sio5: unable to activate interrupt in fast mode - using normal mode
> ppc0: on puc0
> ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
> ppc0: FIFO with 16/16/12 bytes threshold
> ppbus0: on ppc0
> plip0: on ppbus0
> lpt0: on ppbus0
> lpt0: Interrupt-driven port
> ppi0: on ppbus0
> atapci1: port
> 0xe000-0xe00f,0xdc00-0xdc03,0xd800-0xd807,0xd400-0xd403,0xd000-0xd007 mem
> 0xe3000000-0xe3003fff irq 11 at device 11.0 on pci0
> atapci1: [MPSAFE]
> ata2: at 0xd000 on atapci1
> ata2: [MPSAFE]
> ata3: at 0xd800 on atapci1
> ata3: [MPSAFE]
> dc0: port 0xe400-0xe4ff mem 0xe3008000-0xe30083ff
> irq 10 at device 12.0 on pci0
> dc0: Ethernet address: 00:03:6d:1d:fa:e6
> miibus0: on dc0
> ukphy0: on miibus0
> ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> ahc0: port 0xe800-0xe8ff mem
> 0xe3009000-0xe3009fff irq 11 at device 13.0 on pci0
> aic7860: Ultra Single Channel A, SCSI Id=7, 3/253 SCBs
> fdc0: port
> 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
> fdc0: FIFO enabled, 8 bytes threshold
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> sio0 port 0x3f8-0x3ff irq 4 on acpi0
> sio0: type 16550A
> sio1 port 0x2e8-0x2ef irq 3 on acpi0
> sio1: type 16550A
> ppc1 port 0x378-0x37f irq 7 on acpi0
> ppc1: Generic chipset (EPP/NIBBLE) in COMPATIBLE mode
> ppbus1: on ppc1
> ppbus1: IEEE1284 device found /NIBBLE/ECP
> Probing for PnP devices on ppbus1:
> ppbus1: MLC,PCL,PML,SCL
> plip1: on ppbus1
> lpt1: on ppbus1
> lpt1: Interrupt-driven port
> ppi1: on ppbus1
> atkbdc0: port 0x64,0x60 irq 1 on acpi0
> atkbd0: flags 0x1 irq 1 on atkbdc0
> kbd0 at atkbd0
> psm0: irq 12 on atkbdc0
> psm0: model IntelliMouse Explorer, device ID 4
> orm0: