From: Christopher Smith
To: freebsd-hackers@freebsd.org
Date: Tue, 8 Apr 2003 09:52:00 +1000
Subject: Regular kernel panics on 4.7-RELEASE system

Apologies if I've forgotten anything, this is the first time I've had a problem like this...

I have a 4.7-RELEASE box that is suffering regular kernel panics. Unfortunately, they happen at about 3:15 AM every day, so I've yet to actually witness one in person :). I have, however, followed the directions at http://www.onlamp.com/pub/a/bsd/2002/04/04/Big_Scary_Daemons.html to hopefully provide someone here with enough useful information to address the problem.

The machine is a Dell 2650 running primarily as a file/print server to a number of computer labs of about 400 machines (although it also functions as a rembo image server and squid proxy).
It mainly stores applications, which are run off a samba share, and user home directories (again, accessed via samba). Most of the data is stored on a largish filesystem (~200G) on a Powervault 220 attached via a PERC3/DC controller (amr). The OS is on a pair of internal 18G drives attached to the internal PERC3/Di controller (aac). It is attached to the network with a Netgear GA620 fibre NIC (ti).

I'm guessing the panic is being caused by "something" fired off by the /etc/periodic/daily scripts, since they start running at 0300, although running each one manually does not cause a panic (or hasn't the times I've tried it). It appears to be something specific to this system, as we have quite a few 2650s here running various things from squid proxies to development servers and none of them have exhibited similar problems.

All the data on this system was recently migrated off a now-decommissioned Dell 6300, which was also exhibiting identical mysterious early-morning reboots. At the time, due to workload, I didn't bother chasing it up (since it was effectively not impacting on services) and was half-hoping the new machine would fix it. At least the continuing problems minimise the chance it's hardware-related :).

Here is the relevant system info. If I've forgotten anything, or there is anything more anyone needs to help fix the problem, please let me know.

leela# uname -a
FreeBSD leela.lab.bel.uq.edu.au 4.7-RELEASE-p10 FreeBSD 4.7-RELEASE-p10 #0: Mon Apr 7 10:34:08 EST 2003 root@leela.lab.bel.uq.edu.au:/usr/src/sys/compile/LEELA i386
leela#
leela# cat /var/run/dmesg.boot
Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 4.7-RELEASE-p10 #0: Mon Apr 7 10:34:08 EST 2003
    root@leela.lab.bel.uq.edu.au:/usr/src/sys/compile/LEELA
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium 4 (2392.26-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf27 Stepping = 7
  Features=0xbfebfbff,ACC,>
real memory = 2147418112 (2097088K bytes)
avail memory = 2088574976 (2039624K bytes)
Changing APIC ID for IO APIC #0 from 0 to 4 on chip
Changing APIC ID for IO APIC #1 from 0 to 5 on chip
Changing APIC ID for IO APIC #2 from 0 to 6 on chip
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 16 pins in IOAPIC #1
Programming 16 pins in IOAPIC #2
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 0, version: 0x00050014, at 0xfee00000
 cpu1 (AP): apic id: 2, version: 0x00050014, at 0xfee00000
 io0 (APIC): apic id: 4, version: 0x000f0011, at 0xfec00000
 io1 (APIC): apic id: 5, version: 0x000f0011, at 0xfec01000
 io2 (APIC): apic id: 6, version: 0x000f0011, at 0xfec02000
Preloaded elf kernel "kernel" at 0xc030d000.
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 9 entries at 0xc00fc480
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
IOAPIC #1 intpin 3 -> irq 2
IOAPIC #1 intpin 7 -> irq 7
IOAPIC #1 intpin 11 -> irq 10
pci0: on pcib0
pci0: (vendor=0x1028, dev=0x000c) at 4.0 irq 2
pci0: (vendor=0x1028, dev=0x0008) at 4.1 irq 7
pci0: (vendor=0x1028, dev=0x000d) at 4.2 irq 10
pci0: at 14.0
atapci0: port 0x8b0-0x8bf,0x8d8-0x8db,0x8d0-0x8d7,0x8c8-0x8cb,0x8c0-0x8c7 at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: at 15.2 irq 5
isab0: at device 15.3 on pci0
isa0: on isab0
pcib1: on motherboard
IOAPIC #1 intpin 0 -> irq 11
pci1: on pcib1
ti0: mem 0xfcf00000-0xfcf03fff irq 11 at device 6.0 on pci1
ti0: Ethernet address: 00:02:e3:00:0d:c6
pcib2: on motherboard
pci2: on pcib2
pcib8: at device 6.0 on pci2
IOAPIC #1 intpin 9 -> irq 13
pci3: on pcib8
pcib9: at device 0.0 on pci3
IOAPIC #1 intpin 8 -> irq 16
pci4: on pcib9
amr0: mem 0xf0000000-0xf7ffffff irq 16 at device 0.0 on pci4
amr0: Firmware 1.74, BIOS 3.27, 128MB RAM
pci3: (vendor=0x1077, dev=0x1216) at 1.0 irq 13
pcib3: on motherboard
IOAPIC #1 intpin 12 -> irq 17
IOAPIC #1 intpin 13 -> irq 18
pci5: on pcib3
bge0: mem 0xeff10000-0xeff1ffff irq 17 at device 6.0 on pci5
bge0: Ethernet address: 00:06:5b:f3:09:7d
miibus0: on bge0
brgphy0: on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: mem 0xeff00000-0xeff0ffff irq 18 at device 8.0 on pci5
bge1: Ethernet address: 00:06:5b:f3:09:7e
miibus1: on bge1
brgphy1: on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
pcib4: on motherboard
IOAPIC #1 intpin 14 -> irq 19
pci6: on pcib4
pcib10: at device 8.0 on pci6
IOAPIC #1 intpin 15 -> irq 20
pci7: on pcib10
pci7: (vendor=0x9005, dev=0x00c5) at 6.0 irq 19
pci7: (vendor=0x9005, dev=0x00c5) at 6.1 irq 20
aac0: mem 0xe0000000-0xe7ffffff irq 19 at device 8.1 on pci6
aac0: i960RX 100MHz, 118MB cache memory, optional battery present
aac0: Kernel 2.7-1, Build 3170, S/N 9c38d3
pcib5: on motherboard
pci8: on pcib5
pcib6: on motherboard
pci9: on pcib6
pcib7: on motherboard
pci10: on pcib7
orm0: