Date: Tue, 22 Dec 2009 12:16:43 +0000
From: Pete French <petefrench@ticketswitch.com>
To: freebsd-stable@FreeBSD.org
Subject: Disc lock up on 8.0-STABLE
Message-ID: <E1NN3fP-00048P-Tb@dilbert.ticketswitch.com>
I've been gradually testing 8.0 on several machines prior to deploying it live, but I currently have a machine which appears to lock up at 3am every day. The symptoms are that the machine is still pingable, but anything which requires access to the disc just freezes (so you can't log in, for example). I've seen similar behaviour before on machines when the disc system has locked up for some reason, so I am tentatively guessing that this is the cause.

The machine is an HP DL360 G5 with a ciss0 controller for the drives. I have upgraded to the latest STABLE but the freeze still happens. I'm including a dmesg below, and will compile a kernel with KDB and DDB to see what happens.

The machine is booting from a UFS partition, but is using ZFS for everything else. The fact that it deadlocks at 3am makes me think this is something to do with scheduled jobs, maybe? Then again, I have an almost identical DL360 which is running 8.0 and is rock solid.

-pete.

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #0: Mon Dec 21 15:42:31 GMT 2009 webadmin@florentine.rattatosk:/usr/obj/usr/src/sys/GENERIC amd64 Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Xeon(R) CPU E5345 @ 2.33GHz (2333.43-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x6fb Stepping = 11 Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> Features2=0x4e3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA> AMD Features=0x20000800<SYSCALL,LM> AMD Features2=0x1<LAHF> TSC: P-state invariant real memory = 4294967296 (4096 MB) avail memory = 4104138752 (3914 MB) ACPI APIC Table: <HP ProLiant> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs FreeBSD/SMP: 1 package(s) x 4 core(s) cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 cpu2 (AP): APIC ID: 2 cpu3 (AP): APIC ID: 3 ACPI Warning: Invalid length for Pm1aControlBlock: 32, using default 16 20090521 tbfadt-707 ioapic0 <Version 2.0> irqs 0-23 on motherboard ioapic1 <Version 2.0> irqs 24-47 on motherboard kbd1 at kbdmux0 acpi0: <HP ProLiant> on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x908-0x90b on acpi0 acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0 Timecounter "HPET" frequency 14318180 Hz quality 900 pcib0: <ACPI Host-PCI bridge> on acpi0 pci0: <ACPI PCI bus> on pcib0 pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0 ACPI Warning: \\_SB_.PCI0.PT02._PRT: Return Package has no elements (empty) 20090521 nspredef-545 pci9: <ACPI PCI bus> on pcib1 pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci9 pci10: <ACPI PCI bus> on pcib2 pcib3: <ACPI PCI-PCI bridge> at device 0.0 on pci10 pci11: <ACPI PCI bus> on pcib3 pcib4: <PCI-PCI bridge> at device 1.0 on pci10 pci14: <PCI bus> on pcib4 pcib5: <PCI-PCI bridge> at device 2.0 on pci10 pci15: <PCI bus> on pcib5 pcib6: <ACPI PCI-PCI 
bridge> at device 0.3 on pci9 pci16: <ACPI PCI bus> on pcib6 pcib7: <ACPI PCI-PCI bridge> at device 3.0 on pci0 pci6: <ACPI PCI bus> on pcib7 pcib8: <PCI-PCI bridge> at device 0.0 on pci6 pci7: <PCI bus> on pcib8 pcib9: <PCI-PCI bridge> at device 4.0 on pci7 pci8: <PCI bus> on pcib9 ciss0: <HP Smart Array E200i> port 0x4000-0x40ff mem 0xfde80000-0xfdefffff,0xfde70000-0xfde77fff irq 16 at device 8.0 on pci7 ciss0: PERFORMANT Transport ciss0: got 2 MSI messages] ciss0: [ITHREAD] pcib10: <ACPI PCI-PCI bridge> at device 4.0 on pci0 pci19: <ACPI PCI bus> on pcib10 pcib11: <PCI-PCI bridge> at device 5.0 on pci0 pci22: <PCI bus> on pcib11 pcib12: <ACPI PCI-PCI bridge> at device 6.0 on pci0 pci2: <ACPI PCI bus> on pcib12 pcib13: <ACPI PCI-PCI bridge> at device 0.0 on pci2 pci3: <ACPI PCI bus> on pcib13 bce0: <HP NC373i Multifunction Gigabit Server Adapter (B2)> mem 0xf8000000-0xf9ffffff irq 18 at device 0.0 on pci3 miibus0: <MII bus> on bce0 brgphy0: <BCM5708C 10/100/1000baseTX PHY> PHY 1 on miibus0 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto bce0: Ethernet address: 00:1e:0b:5f:1f:76 bce0: [ITHREAD] bce0: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C (1.9.6); Flags (MSI|MFW); MFW () pcib14: <ACPI PCI-PCI bridge> at device 7.0 on pci0 pci4: <ACPI PCI bus> on pcib14 pcib15: <ACPI PCI-PCI bridge> at device 0.0 on pci4 pci5: <ACPI PCI bus> on pcib15 bce1: <HP NC373i Multifunction Gigabit Server Adapter (B2)> mem 0xfa000000-0xfbffffff irq 19 at device 0.0 on pci5 miibus1: <MII bus> on bce1 brgphy1: <BCM5708C 10/100/1000baseTX PHY> PHY 1 on miibus1 brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto bce1: Ethernet address: 00:1e:0b:5f:fd:d8 bce1: [ITHREAD] bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); B/C (1.9.6); Flags (MSI|MFW); MFW () uhci0: <Intel 631XESB/632XESB/3100 USB controller USB-1> port 0x1000-0x101f irq 16 at device 29.0 on pci0 uhci0: [ITHREAD] usbus0: 
<Intel 631XESB/632XESB/3100 USB controller USB-1> on uhci0 uhci1: <Intel 631XESB/632XESB/3100 USB controller USB-2> port 0x1020-0x103f irq 17 at device 29.1 on pci0 uhci1: [ITHREAD] usbus1: <Intel 631XESB/632XESB/3100 USB controller USB-2> on uhci1 uhci2: <Intel 631XESB/632XESB/3100 USB controller USB-3> port 0x1040-0x105f irq 18 at device 29.2 on pci0 uhci2: [ITHREAD] usbus2: <Intel 631XESB/632XESB/3100 USB controller USB-3> on uhci2 uhci3: <Intel 631XESB/632XESB/3100 USB controller USB-4> port 0x1060-0x107f irq 19 at device 29.3 on pci0 uhci3: [ITHREAD] usbus3: <Intel 631XESB/632XESB/3100 USB controller USB-4> on uhci3 ehci0: <Intel 63XXESB USB 2.0 controller> mem 0xf7df0000-0xf7df03ff irq 16 at device 29.7 on pci0 ehci0: [ITHREAD] usbus4: EHCI version 1.0 usbus4: <Intel 63XXESB USB 2.0 controller> on ehci0 pcib16: <ACPI PCI-PCI bridge> at device 30.0 on pci0 pci1: <ACPI PCI bus> on pcib16 vgapci0: <VGA-compatible display> port 0x3000-0x30ff mem 0xd8000000-0xdfffffff,0xf7ff0000-0xf7ffffff irq 23 at device 3.0 on pci1 pci1: <base peripheral> at device 4.0 (no driver attached) pci1: <base peripheral> at device 4.2 (no driver attached) uhci4: <UHCI (generic) USB controller> port 0x3800-0x381f irq 22 at device 4.4 on pci1 uhci4: [ITHREAD] usbus5: <UHCI (generic) USB controller> on uhci4 pci1: <serial bus> at device 4.6 (no driver attached) isab0: <PCI-ISA bridge> at device 31.0 on pci0 isa0: <ISA bus> on isab0 atapci0: <Intel 63XXESB2 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x500-0x50f irq 17 at device 31.1 on pci0 ata0: <ATA channel 0> on atapci0 ata0: [ITHREAD] acpi_tz0: <Thermal Zone> on acpi0 atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0 atkbd0: <AT Keyboard> irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 uart0: [FILTER] cpu0: <ACPI CPU> on acpi0 est0: <Enhanced SpeedStep Frequency Control> on cpu0 est: CPU supports 
Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 725072506000725 device_attach: est0 attach returned 6 p4tcc0: <CPU Frequency Thermal Control> on cpu0 cpu1: <ACPI CPU> on acpi0 est1: <Enhanced SpeedStep Frequency Control> on cpu1 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 725072506000725 device_attach: est1 attach returned 6 p4tcc1: <CPU Frequency Thermal Control> on cpu1 cpu2: <ACPI CPU> on acpi0 est2: <Enhanced SpeedStep Frequency Control> on cpu2 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 725072506000725 device_attach: est2 attach returned 6 p4tcc2: <CPU Frequency Thermal Control> on cpu2 cpu3: <ACPI CPU> on acpi0 est3: <Enhanced SpeedStep Frequency Control> on cpu3 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 725072506000725 device_attach: est3 attach returned 6 p4tcc3: <CPU Frequency Thermal Control> on cpu3 orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xe6000-0xe7fff on isa0 sc0: <System console> at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 atrtc0: <AT Real Time Clock> at port 0x70 irq 8 on isa0 ppc0: cannot reserve I/O port range uart1: <Non-standard ns8250 class UART with FIFOs> at port 0x2f8-0x2ff irq 3 on isa0 uart1: [FILTER] ZFS filesystem version 13 ZFS storage pool version 13 Timecounters tick every 1.000 msec usbus0: 12Mbps Full Speed USB v1.0 usbus1: 12Mbps Full Speed USB v1.0 usbus2: 12Mbps Full Speed USB v1.0 usbus3: 12Mbps Full Speed USB v1.0 usbus4: 480Mbps High Speed USB v2.0 usbus5: 12Mbps Full Speed USB v1.0 ugen0.1: <Intel> at usbus0 uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0 ugen1.1: <Intel> at usbus1 uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1 ugen2.1: <Intel> at usbus2 uhub2: <Intel UHCI root HUB, class 
9/0, rev 1.00/1.00, addr 1> on usbus2 ugen3.1: <Intel> at usbus3 uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3 ugen4.1: <Intel> at usbus4 uhub4: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus4 ugen5.1: <(0x103c)> at usbus5 uhub5: <(0x103c) UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5 uhub0: 2 ports with 2 removable, self powered uhub1: 2 ports with 2 removable, self powered uhub2: 2 ports with 2 removable, self powered uhub3: 2 ports with 2 removable, self powered uhub5: 2 ports with 2 removable, self powered ugen5.2: <HP> at usbus5 ukbd0: <Virtual Keyboard> on usbus5 kbd2 at ukbd0 ums0: <Virtual Mouse> on usbus5 ugen5.3: <HP> at usbus5 uhub6: <Virtual Hub> on usbus5 uhub4: 8 ports with 8 removable, self powered uhub6: 7 ports with 7 removable, self powered da0 at ciss0 bus 0 scbus0 target 0 lun 0 da0: <COMPAQ RAID 1 VOLUME OK> Fixed Direct Access SCSI-5 device da0: 135.168MB/s transfers da0: Command Queueing enabled da0: 69970MB (143299800 512 byte sectors: 255H 63S/T 8920C) SMP: AP CPU #1 Launched! SMP: AP CPU #2 Launched! SMP: AP CPU #3 Launched! Trying to mount root from ufs:/dev/da0s1a WARNING: / was not properly dismounted Setting hostuuid: 33393935-3234-5553-4538-30344e37364e. Setting hostid: 0xa1c8b883. Entropy harvesting: interrupts ethernet point_to_point kickstart . Starting file system checks: /dev/da0s1a: UNREF FILE I=308432 OWNER=root MODE=100644 /dev/da0s1a: SIZE=0 MTIME=Dec 21 18:07 2009 (CLEARED) /dev/da0s1a: UNREF FILE I=308546 OWNER=root MODE=140666 /dev/da0s1a: SIZE=0 MTIME=Dec 21 18:06 2009 (CLEARED) /dev/da0s1a: FREE BLK COUNT(S) WRONG IN SUPERBLK (SALVAGED) /dev/da0s1a: SUMMARY INFORMATION BAD (SALVAGED) /dev/da0s1a: BLK(S) MISSING IN BIT MAPS (SALVAGED) /dev/da0s1a: 24964 files, 642961 used, 1386070 free (32630 frags, 169180 blocks, 1.6% fragmentation) Mounting local file systems: . Setting hostname: florentine.rattatosk . Starting Network: lo0 bce0 bce1 lagg0. 
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384 options=3<RXCSUM,TXCSUM> inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3 inet6 ::1 prefixlen 128 inet 127.0.0.1 netmask 0xff000000 nd6 options=3<PERFORMNUD,ACCEPT_RTADV> bce0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500 options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4> ether 00:1e:0b:5f:1f:76 media: Ethernet autoselect (none) status: no carrier bce1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500 options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4> ether 00:1e:0b:5f:1f:76 media: Ethernet autoselect (none) status: no carrier lagg0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500 options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4> ether 00:1e:0b:5f:1f:76 inet 10.48.19.0 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.229 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.223 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.226 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.224 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.228 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.227 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.239 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.230 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.232 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.235 netmask 0xffff0000 broadcast 10.48.255.255 inet 10.48.19.245 netmask 0xffff0000 broadcast 10.48.255.255 media: Ethernet autoselect status: no carrier laggproto lacp laggport: bce1 flags=0<> laggport: bce0 flags=0<> add net default: gateway 10.48.0.9 Starting devd. Starting ums0 moused . Creating and/or trimming log files . Starting syslogd. 
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/compat /usr/local/lib/mysql 32-bit compatibility ldconfig path: /usr/lib32 /usr/local/lib32/compat Starting tomcat60. Starting named. Dec 22 12:06:32 florentine named[827]: the working directory is not writable Starting rpcbind. NFS access cache time=60 Clearing /tmp (X related). bce0: link state changed to UP lagg0: link state changed to UP Starting mountd. Starting nfsd. Updating motd: . Starting ntpd. bce1: link state changed to UP Starting dhcpd. Starting pdns. Dec 22 12:06:33 Reading random entropy from '/dev/urandom' Dec 22 12:06:33 florentine pdns[1211]: UDP server bound to 0.0.0.0:53 Dec 22 12:06:33 florentine pdns[1211]: UDPv6 server bound to [::]:53 Dec 22 12:06:33 florentine pdns[1211]: TCP server bound to 0.0.0.0:53 Dec 22 12:06:33 florentine pdns[1211]: TCPv6 server bound to [::]:53 Dec 22 12:06:33 florentine pdns[1211]: DNS Proxy launched, local port 26807, remote 127.0.0.1:5300 Dec 22 12:06:33 florentine pdns[1211]: Creating backend connection for TCP Starting mysql. Starting exim. Dec 22 12:06:33 florentine pdns[1211]: Backend launched with banner: OK TicketSwitch Mode 1 Performing sanity check on apache22 configuration: Syntax OK Starting apache22. Configuring syscons: blanktime . Starting sshd. Starting cron. Starting inetd. Starting background file system checks in 60 seconds.
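
Since the freeze is suspected to line up with scheduled jobs at 3am, one way to narrow it down is to list every system crontab entry that can fire during that hour. The following is only a minimal sketch: the helper names `hour_matches` and `jobs_in_hour` are hypothetical, and `SAMPLE_CRONTAB` is an assumed example resembling a stock FreeBSD /etc/crontab, not the file from the machine described above.

```python
# Sketch: find system-crontab entries that can run during a given hour.
# SAMPLE_CRONTAB is an illustrative assumption, not the poster's real file.

SAMPLE_CRONTAB = """\
SHELL=/bin/sh
PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin
# minute hour mday month wday who command
*/5 * * * * root /usr/libexec/atrun
1 3 * * * root periodic daily
15 4 * * 6 root periodic weekly
30 5 1 * * root periodic monthly
"""


def hour_matches(field, hour):
    """Return True if a cron hour field (*, N, A-B, lists, /step) covers `hour`."""
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            lo, hi = 0, 23
        elif "-" in rng:
            lo, hi = (int(x) for x in rng.split("-"))
        else:
            lo = hi = int(rng)
        if lo <= hour <= hi and (hour - lo) % step == 0:
            return True
    return False


def jobs_in_hour(crontab_text, hour):
    """Return (minute_field, command) pairs for entries that fire in `hour`.

    Expects the system crontab layout, which has a user column:
    minute hour mday month wday user command
    """
    matches = []
    for line in crontab_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, @-shortcuts and VAR=value settings.
        if not line or line[0] in "#@" or "=" in line.split()[0]:
            continue
        fields = line.split(None, 6)
        if len(fields) < 7:
            continue
        minute, hour_field, _mday, _month, _wday, _user, command = fields
        if hour_matches(hour_field, hour):
            matches.append((minute, command))
    return matches


if __name__ == "__main__":
    for minute, command in jobs_in_hour(SAMPLE_CRONTAB, 3):
        print(f"minute {minute}: {command}")
```

With the sample input this reports the every-five-minutes `atrun` entry and `periodic daily`. On a real system the text would come from /etc/crontab, and per-user crontabs (which have no user column) would need a slightly different parser.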