From owner-freebsd-questions@FreeBSD.ORG Wed Jan 28 13:44:26 2004
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id AFE8916A4CF
    for ; Wed, 28 Jan 2004 13:44:26 -0800 (PST)
Received: from jorn.servebeer.com (node-c-0ab6.a2000.nl [62.194.10.182])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 142AE43D2F
    for ; Wed, 28 Jan 2004 13:44:20 -0800 (PST)
    (envelope-from jorn@wcborstel.nl)
Received: from sauron.emea.middle-earth.org (unknown [172.16.1.2])
    by jorn.servebeer.com (Postfix) with ESMTP id 92820170A3;
    Wed, 28 Jan 2004 22:42:30 +0100 (CET)
To: juhlig@parc.com
From: Jorn Argelo
Content-Type: text/plain; format=flowed; charset=iso-8859-1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Date: Wed, 28 Jan 2004 22:44:19 +0100
User-Agent: Opera7.23/FreeBSD M2 build 518
cc: "questions@freebsd.org"
Subject: Re: system crash tickled by 450.status-security (fwd)
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: User questions
X-List-Received-Date: Wed, 28 Jan 2004 21:44:26 -0000

Well, I recall FreeBSD 5.1 having problems with the RAID controller used
by the PE 2650 (a Dell PERC 3/Di or something, wasn't it?). I don't know
how it is with 4.9 though; I never tried that.

We were running Nagios and MRTG, both monitoring tools, on that box. It
had to handle about 5 or 6 SNMP checks plus several port checks from
about 175 servers, so it was under quite a load, and that frequently
ended in a complete system crash.

Unfortunately I can't give you a real solution. The funny thing was, I
tried upgrading it to FreeBSD 5.1-CURRENT, but that didn't work at all.
So I reinstalled it from RELEASE, recompiled the kernel with the same
configuration file I had used before, and suddenly it was all fine. It
has an uptime of 31 days now.

I know this message isn't going to help you much, but I thought it might
be handy to know that you are not the only one having problems with the
Dell PowerEdge 2650.

Cheers,

Jorn

On Wed, 28 Jan 2004 13:27:29 PST, John Uhlig wrote:

>
> We are running FreeBSD 4.9 on 2 Dell poweredge 2650's as fileservers
> each with 1 TB of RAID disk file space. Both crash and reboot every few
> days at approx. 3:15AM. It appears that the systems are running
> /etc/periodic/daily/450.status-security script when the crash occurs.
> Running the daily cronjobs more frequently induces the crash more often.
>
> We have a kernel core dump and have included some of the gdb output
> below. I would appreciate any pointers or suggestions that can help
> us resolve this problem.
>
> thanks,
> John Uhlig
>
> ===================================================================
> uname output
> ====================================================================
> platoon# uname -a
> FreeBSD platoon.parc.xerox.com 4.9-RELEASE-p1 FreeBSD 4.9-RELEASE-p1 #0:
>     Wed Jan 28 08:45:33 PST 2004
>     juhlig@platoon.parc.xerox.com:/usr/obj/usr/src/sys/PARCGBNIC.debg i386
>
> =================================================================
> initial gdb output
> ==================================================================
> SMP 4 cpus
> IdlePTD at phsyical address 0x0051f000
> initial pcb at physical address 0x0044e560
> panicstr: page fault
> panic messages:
> ---
> Fatal trap 12: page fault while in kernel mode
> mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
> fault virtual address   = 0xbfc00000
> fault code              = supervisor write, page not present
> instruction pointer     = 0x8:0xc0356149
> stack pointer           = 0x10:0xffbe1e04
> frame pointer           = 0x10:0xffbe1e10
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 701 (sed)
> interrupt mask          = none <- SMP: XXX
> trap number             = 12
> panic: page fault
> mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
> boot() called on cpu#0
>
> syncing disks... 52
> done
> Uptime: 20m43s
> amr0: flushing cache...done
>
> dumping to dev #aacd/0x40001, offset 5243136
>
> ===================================================================
> List code at instruction pointer address
> ====================================================================
> (kgdb) list *0xc0356149
> 0xc0356149 is in pmap_qenter (/usr/src/sys/i386/i386/pmap.c:848).
> 843     void
> 844     pmap_qenter(vm_offset_t va, vm_page_t *m, int count)
> 845     {
> 846             while (count-- > 0) {
> 847                     pt_entry_t *pte = vtopte(va);
> 848                     *pte = VM_PAGE_TO_PHYS(*m) | PG_RW | PG_V | pgeflag;
> 849     #ifdef SMP
> 850                     cpu_invlpg((void *)va);
> 851     #else
> 852                     invltlb_1pg(va);
>
> =====================================================================
> backtrace
> =====================================================================
> (kgdb) backtrace
> #0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
> #1  0xc01d85c3 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
> #2  0xc01d8a1c in poweroff_wait (junk=0xc03d3819, howto=-1069731121)
>     at /usr/src/sys/kern/kern_shutdown.c:595
> #3  0xc035a4d8 in trap_fatal (frame=0xffbe1dc4, eva=3217031168)
>     at /usr/src/sys/i386/i386/trap.c:974
> #4  0xc035a169 in trap_pfault (frame=0xffbe1dc4, usermode=0, eva=3217031168)
>     at /usr/src/sys/i386/i386/trap.c:867
> #5  0xc0359cdb in trap (frame={tf_fs = 24, tf_es = -67108848, tf_ds = 134545424,
>       tf_edi = -67978584, tf_esi = 0, tf_ebp = -4317680, tf_isp = -4317712, tf_ebx = 3,
>       tf_edx = -1043397044, tf_ecx = 0, tf_eax = 1122230275, tf_trapno = 12, tf_err = 2,
>       tf_eip = -1070243511, tf_cs = 8, tf_eflags = 66054, tf_esp = 134606848,
>       tf_ss = 134606848}) at /usr/src/sys/i386/i386/trap.c:466
> #6  0xc0356149 in pmap_qenter (va=0, m=0xfbf2baa8, count=4)
>     at /usr/src/sys/i386/i386/pmap.c:848
> #7  0xc01e91fe in pipe_build_write_buffer (wpipe=0xfbf2ba80, uio=0xffbe1ed0)
>     at /usr/src/sys/kern/sys_pipe.c:594
> #8  0xc01e93c4 in pipe_direct_write (wpipe=0xfbf2ba80, uio=0xffbe1ed0)
>     at /usr/src/sys/kern/sys_pipe.c:709
> #9  0xc01e9766 in pipe_write (fp=0xcb801000, uio=0xffbe1ed0, cred=0xc875cc00, flags=0,
>       p=0xfc001080) at /usr/src/sys/kern/sys_pipe.c:827
> #10 0xc01e7ae9 in dofilewrite (p=0xfc001080, fp=0xcb801000, fd=1, buf=0x805b000,
>       nbyte=16384, offset=-1, flags=0) at /usr/src/sys/sys/file.h:163
> #11 0xc01e79a2 in write (p=0xfc001080, uap=0xffbe1f80)
>     at /usr/src/sys/kern/sys_generic.c:329
> #12 0xc035a809 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 134590464,
>       tf_esi = 672071960, tf_ebp = -1077937248, tf_isp = -4317228, tf_ebx = 672072428,
>       tf_edx = 672071960, tf_ecx = 0, tf_eax = 4, tf_trapno = 7, tf_err = 2,
>       tf_eip = 672025636, tf_cs = 31, tf_eflags = 663, tf_esp = -1077937292, tf_ss = 47})
>     at /usr/src/sys/i386/i386/trap.c:1175
> #13 0xc034517b in Xint0x80_syscall ()
> #14 0x280e2902 in ?? ()
> #15 0x280e2871 in ?? ()
> #16 0x280df756 in ?? ()
> #17 0x28088fb5 in ?? ()
> #18 0x804b81f in ?? ()
> #19 0x804a926 in ?? ()
> #20 0x8048f96 in ?? ()
>
> ================================================================================
>
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to
> "freebsd-questions-unsubscribe@freebsd.org"