From: Derek Ragona
To: "Rob Connon (Info)", freebsd-questions@freebsd.org
Date: Tue, 25 Jul 2006 14:44:44 -0500
Subject: Re: FreeBSD 6 Hard Lock no logs
Message-Id: <6.0.0.22.2.20060725140403.02682e98@mail.computinginnovations.com>
In-Reply-To: <44C660AF.5040206@vfs.com>

First look for the obvious problems, such as low disk or swap space.
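As a rough sketch of those first checks, assuming a Bourne-style shell: the function name check_resources is only illustrative, and swapinfo is FreeBSD-specific, so the sketch skips it where it is not installed.

```shell
#!/bin/sh
# Quick look at disk and swap usage.
# check_resources is a hypothetical helper name, not an existing tool.
check_resources() {
    echo "== disk space =="
    df -h                      # per-filesystem usage, human-readable sizes
    if command -v swapinfo >/dev/null 2>&1; then
        echo "== swap =="
        swapinfo               # FreeBSD: swap devices and how full they are
    fi
}

check_resources
```

A filesystem at or near 100% capacity, or swap that is almost fully used, would be the first thing to rule out.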
If these are OK, you may need to run a script that logs system state at regular intervals, then sift through its output. I would suggest a shell script that sleeps for 30 to 60 seconds, opens a log file, writes a separator line with echo followed by the output of ps -ax so you can see what is running, then closes the file and sleeps again. You will only be interested in the last couple of entries in this file, the ones written just before the lockup.

-Derek

At 01:19 PM 7/25/2006, Rob Connon (Info) wrote:
>Hi,
>
>I have a web/mail server that's running the latest version of FreeBSD
>6.0-RELEASE-p9 #4. In the last month or so it has started hard locking.
>When the machine locks up I can still ping it and can connect by
>telnetting to ports 80, 22, etc. Sometimes I get a banner and sometimes
>I don't, but there are no errors on the console or in the logs.
>
>The odd thing is that the locking seems to happen within a certain time
>window (Mon, Tues) and never at the end of the week or on the weekend. I
>suspected it could have been a bad cron job, but nothing falls into that
>time frame.
>
>As a test I've been rebooting the server every night to see if that
>would help the machine get past the beginning of the week without a
>hang. Again this morning, even though I rebooted last night at 10pm, it
>hung around 9:47am.
>
>The machine is a Dell PowerEdge 2550. I've had Dell come and replace the
>MB and have run all their diagnostics as well, with no errors reported.
>I've been reading a lot about APIC and ACPI and people having similar
>issues, but nothing that fits the bill. Below is the dmesg and the
>output of vmstat -i. Another odd thing is that the rate for the CPU
>timer is extremely high compared to other machines with similar or
>faster hardware.
>
>Any help on where to look next would be awesome.
>
>interrupt            total    rate
>irq1: atkbd0           107       0
>irq6: fdc0              10       0
>irq13: npx0              1       0
>irq14: ata0             74       0
>irq16: fxp0          27110      12
>irq20: amr0         105950      48
>cpu0: timer        4385477    1999
>cpu1: timer        4369967    1992
>Total              8888696    4053
>
>
>Copyright (c) 1992-2005 The FreeBSD Project.
>Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>        The Regents of the University of California. All rights reserved.
>FreeBSD 6.0-RELEASE-p9 #4: Thu Jun 22 14:54:15 PDT 2006
>    root@taurus.packetsafe.net:/usr/src/sys/i386/compile/SMP
>Timecounter "i8254" frequency 1193182 Hz quality 0
>CPU: Intel(R) Pentium(R) III CPU family 1266MHz (1258.22-MHz 686-class CPU)
>  Origin = "GenuineIntel" Id = 0x6b1 Stepping = 1
>  Features=0x383fbff
>real memory = 1073676288 (1023 MB)
>avail memory = 1041612800 (993 MB)
>ACPI APIC Table:
>FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>cpu0 (BSP): APIC ID: 1
>cpu1 (AP): APIC ID: 0
>ioapic0: Changing APIC ID to 2
>ioapic1: Changing APIC ID to 3
>MADT: Forcing active-low polarity and level trigger for SCI
>ioapic0 irqs 0-15 on motherboard
>ioapic1 irqs 16-31 on motherboard
>npx0: [FAST]
>npx0: on motherboard
>npx0: INT 16 interface
>acpi0: on motherboard
>acpi0: Power Button (fixed)
>pci_link0: irq 5 on acpi0
>pci_link1: irq 10 on acpi0
>pci_link2: on acpi0
>pci_link3: on acpi0
>pci_link4: irq 5 on acpi0
>pci_link5: irq 10 on acpi0
>pci_link6: on acpi0
>pci_link7: on acpi0
>pci_link8: on acpi0
>pci_link9: on acpi0
>pci_link10: on acpi0
>pci_link11: on acpi0
>pci_link12: on acpi0
>pci_link13: on acpi0
>pci_link14: on acpi0
>pci_link15: irq 10 on acpi0
>pci_link16: irq 11 on acpi0
>Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
>acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
>cpu0: on acpi0
>cpu1: on acpi0
>pcib0: on acpi0
>pci0: on pcib0
>pcib1: at device 2.0 on pci0
>pci1: on pcib1
>pcib2: at device 0.0 on pci1
>pci2: on pcib2
>amr0: mem 0xf0000000-0xf7ffffff irq 20 at device 0.0 on pci2
>amr0: Firmware 197O, BIOS 3.35, 128MB RAM
>pci1: at device 1.0 (no driver attached)
>pci0: at device 14.0 (no driver attached)
>isab0: port 0x8a0-0x8af at device 15.0 on pci0
>isa0: on isab0
>atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x8b0-0x8bf at device 15.1 on pci0
>ata0: on atapci0
>ata1: on atapci0
>ohci0: mem 0xfe400000-0xfe400fff irq 11 at device 15.2 on pci0
>ohci0: [GIANT-LOCKED]
>usb0: OHCI version 1.0, legacy support
>usb0: on ohci0
>usb0: USB revision 1.0
>uhub0: (0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
>uhub0: 2 ports with 2 removable, self powered
>pcib3: on acpi0
>pci3: on pcib3
>bge0: mem 0xfeb00000-0xfeb0ffff irq 17 at device 8.0 on pci3
>miibus0: on bge0
>brgphy0: on miibus0
>brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
>bge0: Ethernet address: 00:06:5b:f5:87:93
>pcib4: on acpi0
>pci4: on pcib4
>pcib5: at device 2.0 on pci4
>pci5: on pcib5
>fxp0: port 0xbcc0-0xbcff mem 0xfe900000-0xfe900fff,0xfe700000-0xfe7fffff irq 16 at device 4.0 on pci4
>miibus1: on fxp0
>inphy0: on miibus1
>inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>fxp0: Ethernet address: 00:06:5b:f5:87:92
>fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
>fdc0: [FAST]
>fd0: <1440-KB 3.5" drive> on fdc0 drive 0
>atkbdc0: port 0x60,0x64 irq 1 on acpi0
>atkbd0: irq 1 on atkbdc0
>kbd0 at atkbd0
>atkbd0: [GIANT-LOCKED]
>sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
>sio0: type 16550A
>sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
>sio1: type 16550A
>ppc0: port 0x378-0x37f,0x778-0x77f irq 7 drq 1 on acpi0
>ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
>ppc0: FIFO with 16/16/8 bytes threshold
>ppbus0: on ppc0
>plip0: on ppbus0
>lpt0: on ppbus0
>lpt0: Interrupt-driven port
>ppi0: on ppbus0
>pmtimer0 on isa0
>orm0: at iomem 0xc0000-0xc7fff,0xec000-0xeffff on isa0
>sc0: at flags 0x100 on isa0
>sc0: VGA <16 virtual consoles, flags=0x300>
>vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
>Timecounters tick every 1.000 msec
>acd0: CDROM at ata0-master PIO4
>amrd0: on amr0
>amrd0: 104040MB (213073920 sectors) RAID 5 (optimal)
>ses0 at amr0 bus 0 target 6 lun 0
>ses0: Fixed Processor SCSI-2 device
>ses0: SAF-TE Compliant Device
>SMP: AP CPU #1 Launched!
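A minimal sketch of the logging script suggested in the reply, assuming a POSIX sh. The log path, the 30-second default interval, the function names, the "run" argument, and the sync call are illustrative assumptions and not part of the original suggestion.

```shell
#!/bin/sh
# Periodic process logger for capturing state just before a lockup.
# LOGFILE, INTERVAL and the function names are hypothetical choices.
LOGFILE="${LOGFILE:-/var/tmp/lockup-watch.log}"
INTERVAL="${INTERVAL:-30}"    # seconds between snapshots (30 to 60 was suggested)

# Append one timestamped snapshot to the file named in $1.
# The >> redirection opens the file, appends, and closes it on every call.
log_snapshot() {
    {
        echo "===== $(date '+%Y-%m-%d %H:%M:%S') ====="   # separator line
        ps -ax 2>&1                                        # what is running now
        echo
    } >> "$1"
    # Assumption: flush buffers so the last entry has a better chance of
    # reaching disk before a hard lock.
    sync
}

# Snapshot, sleep, repeat until the machine hangs or the script is killed.
watch_loop() {
    while :; do
        log_snapshot "$LOGFILE"
        sleep "$INTERVAL"
    done
}

# Start the loop only when invoked with the argument "run",
# for example: sh lockup-watch.sh run &
if [ "${1:-}" = "run" ]; then
    watch_loop
fi
```

After a hang and reboot, something like `tail -n 200 /var/tmp/lockup-watch.log` would show the final snapshots, which are the entries of interest.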