From owner-freebsd-bugs Mon Sep 9 8:40:20 2002
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id B831637B401
    for ; Mon, 9 Sep 2002 08:40:03 -0700 (PDT)
Received: from freefall.freebsd.org (freefall.FreeBSD.org [216.136.204.21])
    by mx1.FreeBSD.org (Postfix) with ESMTP id F13A643E42
    for ; Mon, 9 Sep 2002 08:40:02 -0700 (PDT)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
    by freefall.freebsd.org (8.12.4/8.12.4) with ESMTP id g89Fe2JU023289
    for ; Mon, 9 Sep 2002 08:40:02 -0700 (PDT)
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.12.4/8.12.4/Submit) id g89Fe2VR023288;
    Mon, 9 Sep 2002 08:40:02 -0700 (PDT)
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id CB5D337B400
    for ; Mon, 9 Sep 2002 08:36:33 -0700 (PDT)
Received: from www.freebsd.org (www.FreeBSD.org [216.136.204.117])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 77EF643E4A
    for ; Mon, 9 Sep 2002 08:36:33 -0700 (PDT)
    (envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
    by www.freebsd.org (8.12.4/8.12.4) with ESMTP id g89FaXOT093977
    for ; Mon, 9 Sep 2002 08:36:33 -0700 (PDT)
    (envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
    by www.freebsd.org (8.12.4/8.12.4/Submit) id g89FaXau093976;
    Mon, 9 Sep 2002 08:36:33 -0700 (PDT)
Message-Id: <200209091536.g89FaXau093976@www.freebsd.org>
Date: Mon, 9 Sep 2002 08:36:33 -0700 (PDT)
From: Pawel Malachowski
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-1.0
Subject: kern/42597: kernel panic, xl and bpf related
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

>Number: 42597
>Category: kern
>Synopsis: kernel panic, xl and
bpf related
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Mon Sep 09 08:40:02 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator: Pawel Malachowski
>Release: 4.6.2-RELEASE
>Organization:
ZiN
>Environment:
FreeBSD gargantua.zin.ask 4.6.2-RELEASE FreeBSD 4.6.2-RELEASE #0: Sat Sep 7 16:47:11 CEST 2002 root@gargantua.zin.ask:/usr/obj/usr/src/sys/PM-UX-AUTO i386
>Description:
Looks similar to kern/30952, kern/31710 -- but I'm not sure.

My machine crashes from time to time (typical uptime is 1-4 days).
This is a Celeron 950MHz on ASUS TUV4X with 1 RealTek 8139C and
two 3Com 905B NICs.

Kernel config is a GENERIC with additional options:

pseudo-device ccd 4
device apm
pseudo-device speaker
pseudo-device snp 3
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_FORWARD
options IPFIREWALL_VERBOSE_LIMIT=100
options IPFIREWALL_DEFAULT_TO_ACCEPT
options IPFILTER
options IPFILTER_LOG
options IPFILTER_DEFAULT_BLOCK
options QUOTA
options DUMMYNET
options SHMMAXPGS=65536
options SEMMNI=40
options SEMMNS=240
options SEMUME=40
options SEMMNU=120
options IPX
options NCP
options NWFS
options ETHER_8023
pseudo-device tap
options HZ=1000

All NMB* and other options are auto-sized, this machine has 256MB of memory.

panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xa111351
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc02f5ccc
stack pointer           = 0x10:0xcdf7edec
frame pointer           = 0x10:0xcdf7edf8
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 14899 (trafshow)
interrupt mask          = net tty
trap number             = 12
panic: page fault

syncing disks... 4
done
Uptime: 1d18h3m58s

dumping to dev #ad/0x30001, offset 1573024
dump ata0: resetting devices ..
done
// cut
---
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
487             if (dumping++) {
#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
#1  0xc01fb2c3 in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
#2  0xc01fb6e8 in poweroff_wait (junk=0xc03eb16c, howto=-1069634417)
    at /usr/src/sys/kern/kern_shutdown.c:595
#3  0xc0379d92 in trap_fatal (frame=0xcdf7edac, eva=168891217)
    at /usr/src/sys/i386/i386/trap.c:966
#4  0xc0379a65 in trap_pfault (frame=0xcdf7edac, usermode=0, eva=168891217)
    at /usr/src/sys/i386/i386/trap.c:859
#5  0xc0379623 in trap (frame={tf_fs = -841351152, tf_es = 16,
      tf_ds = -839450608, tf_edi = -1058681600, tf_esi = 6687744,
      tf_ebp = -839389704, tf_isp = -839389736, tf_ebx = -1058681600,
      tf_edx = -1058799616, tf_ecx = 168891217, tf_eax = 599456,
      tf_trapno = 12, tf_err = 0, tf_eip = -1070637876, tf_cs = 8,
      tf_eflags = 66050, tf_esp = -1058205184, tf_ss = -1051000152})
    at /usr/src/sys/i386/i386/trap.c:458
#6  0xc02f5ccc in xl_newbuf (sc=0xc15b0000, c=0xc15b02a8)
    at /usr/src/sys/pci/if_xl.c:1727
#7  0xc02f5e82 in xl_rxeof (sc=0xc15b0000) at /usr/src/sys/pci/if_xl.c:1826
#8  0xc02f65a4 in xl_intr (arg=0xc15b0000) at /usr/src/sys/pci/if_xl.c:2061
#9  0xc03845f9 in intr_mux (arg=0xc0e35160)
    at /usr/src/sys/i386/isa/intr_machdep.c:582
#10 0xc036c646 in vec11 ()
#11 0xc0201145 in softclock () at /usr/src/sys/kern/kern_timeout.c:131
#12 0xc036c553 in doreti_swi ()
#13 0xc036b135 in Xint0x80_syscall ()
#14 0x2809c51b in ?? ()
#15 0x280a90ab in ?? ()
#16 0x804a7c0 in ?? ()
#17 0x804ac73 in ?? ()
#18 0x804ba06 in ?? ()
#19 0x280796b9 in ?? ()
#20 0x2807932f in ?? ()
#21 0x8049aaf in ?? ()
#22 0x8049659 in ??
()
(kgdb) up 6
#6  0xc02f5ccc in xl_newbuf (sc=0xc15b0000, c=0xc15b02a8)
    at /usr/src/sys/pci/if_xl.c:1727
1727            MCLGET(m_new, M_DONTWAIT);
(kgdb) list
1722
1723            MGETHDR(m_new, M_DONTWAIT, MT_DATA);
1724            if (m_new == NULL)
1725                    return(ENOBUFS);
1726
1727            MCLGET(m_new, M_DONTWAIT);
1728            if (!(m_new->m_flags & M_EXT)) {
1729                    m_freem(m_new);
1730                    return(ENOBUFS);
1731            }

I can provide more info from gdb upon request.
>How-To-Repeat:
Put a heavy load on a 3Com 905B NIC in a machine configured as described above.
>Fix:
Don't know.
>Release-Note:
>Audit-Trail:
>Unformatted:
>netstat -m -M vmcore.5 -N /usr/obj/usr/src/sys/PM-UX-AUTO/kernel.debug
10/416/10048 mbufs in use (current/peak/max):
        10 mbufs allocated to data
9/356/2512 mbuf clusters in use (current/peak/max)
816 Kbytes allocated to network (10% of mb_map in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

There are Ierrors:

>netstat -i -M vmcore.5 -N /usr/obj/usr/src/sys/PM-UX-AUTO/kernel.debug | grep Link#2
xl0 1500 00:a0:24:aa:1b:95 103961268 970 94181546 0 0

This NIC was replaced with another one; the Ierrors are still there.
This NIC is connected to an NWay store-and-forward switch. The switch
was replaced with another store-and-forward one. Even if this is a cable
fault, that should not cause a kernel panic. ;)
It's easy to reach 9-10MBytes/s in both directions on this interface,
even while the Ierrors counter is increasing.

There were 5 kernel panics in the recent past; 4 of them were similar to
this one and 1 was acquire_lock() related.

Let's look at these 4:
The current process in the panic message always points to a trafd or
trafshow process (note, both are bpf processes). Trafshow was recompiled
with an increased #define MAX_PAGES; there are usually about 30-40
trafshow pages while the program is running.
The instruction pointer in the panic message always points to xl_newbuf().

The machine acts as a router. Most of the traffic goes from the xl0
interface to the rl0 interface.
xl0 runs at 100Mbit full-duplex; rl0 is forced to 10Mbit full-duplex.
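
The trap frame shown in frame #5 of the backtrace prints its registers as signed decimals, which makes them hard to compare with the hexadecimal values in the panic message. The following is a minimal sketch, not part of the original report, that reinterprets a few of those values as unsigned 32-bit numbers and decodes tf_err using the architectural x86 page-fault error-code bits. The helper names (frame_u32, describe_fault, print_frame_summary) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* x86 page-fault error code, low three bits (architectural definition). */
#define PF_PRESENT 0x1u /* 0 = page not present, 1 = protection violation */
#define PF_WRITE   0x2u /* 0 = read access,      1 = write access */
#define PF_USER    0x4u /* 0 = supervisor mode,  1 = user mode */

/* Reinterpret a signed 32-bit value, as kgdb prints it, as unsigned. */
uint32_t
frame_u32(int32_t v)
{
	return (uint32_t)v;
}

/* Describe a page-fault error code in the wording the panic message uses. */
void
describe_fault(uint32_t err, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%s %s, %s",
	    (err & PF_USER) ? "user" : "supervisor",
	    (err & PF_WRITE) ? "write" : "read",
	    (err & PF_PRESENT) ? "protection violation" : "page not present");
}

/* Print selected trap-frame values from this report in hexadecimal. */
void
print_frame_summary(void)
{
	char fault[64];

	describe_fault(0, fault, sizeof(fault)); /* tf_err = 0 */
	printf("tf_eip = 0x%08x\n", (unsigned int)frame_u32(-1070637876));
	printf("tf_ecx = 0x%08x\n", (unsigned int)frame_u32(168891217));
	printf("tf_ebp = 0x%08x\n", (unsigned int)frame_u32(-839389704));
	printf("tf_err = %s\n", fault);
}
```

Calling print_frame_summary() from a small main() shows that, under these assumptions, tf_eip converts to 0xc02f5ccc (the instruction pointer inside xl_newbuf()), tf_ecx to 0x0a111351 (the fault virtual address), tf_ebp to 0xcdf7edf8 (the frame pointer), and tf_err = 0 decodes to "supervisor read, page not present", all consistent with the panic message.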