From owner-freebsd-hackers@FreeBSD.ORG Wed Sep 7 12:53:18 2005 Return-Path: X-Original-To: freebsd-hackers@freebsd.org Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 04AB816A426 for ; Wed, 7 Sep 2005 12:53:17 +0000 (GMT) (envelope-from vd@datamax.bg) Received: from jengal.datamax.bg (jengal.datamax.bg [82.103.104.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3389B43D45 for ; Wed, 7 Sep 2005 12:53:16 +0000 (GMT) (envelope-from vd@datamax.bg) Received: from sinanica.bg.datamax (sinanica.bg.datamax [192.168.10.1]) by jengal.datamax.bg (Postfix) with QMQP id 0909F87C8 for ; Wed, 7 Sep 2005 15:53:16 +0300 (EEST) Received: (nullmailer pid 56248 invoked by uid 1004); Wed, 07 Sep 2005 12:53:15 -0000 Date: Wed, 7 Sep 2005 15:53:15 +0300 From: Vasil Dimov To: freebsd-hackers@freebsd.org Message-ID: <20050907125315.GA56206@sinanica.bg.datamax> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-OS: FreeBSD 5.4-STABLE User-Agent: Mutt/1.5.10i Subject: panic: mutex "ipf state entry" already initialized X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: vd@datamax.bg List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Sep 2005 12:53:18 -0000 Hi, all I have submitted a PR for this (kern/85809), but I guess someone on this list may have some explanation: After adding: options WITNESS options INVARIANTS options INVARIANT_SUPPORT options MUTEX_DEBUG to my kernel config (for bug hunting, related to disk io->crash) I got 100% reproduceable panic at boot time, just after "starting ipmon": panic: mutex "ipf state entry" 0xc14f18fc already initialized So I found that the problem is that the `isn' structure, passed as arg to fr_stinsert() at ip_state.c:531 has the INITIALIZED flag set in its is_lock.mtx_object.lo_flags member. The problem comes from: ip_state.c:467: error = IRCOPYPTR(data, (caddr_t)&ips, sizeof(ips)); after the "copy" (int)(((ipstate_save_t *)data)->ips_is.is_lock.mtx_object.lo_flags) is zero, while (int)ips.ips_is.is_lock.mtx_object.lo_flags is not (it is 0x30000)! This is 100% verified, I compared all the 628 bytes in `data' and `&ips' before and after IRCOPYPTR - `data' was the same before and after (ofcourse), while the 628 bytes in `&ips' were changed after the copy, but had nothing similar with the ones in `data'. I was able to use kgdb, and get the backtrace and all, but after some recompilation/crash I got: # kgdb kernel.debug /var/crash/vmcore.16 kgdb: cannot read PTD # So I cannot send the backtrace (any suggestions?). Btw I think this is strange: # grep -ri memcpy_fromfs /usr/src/ 2>/dev/null /usr/src/sys/contrib/ipfilter/netinet/ip_compat.h: memcpy_fromfs(b, a, c); \ /usr/src/sys/contrib/ipfilter/netinet/ip_compat.h: memcpy_fromfs(ca, a, sizeof(ca)); /usr/src/sys/contrib/ipfilter/netinet/ip_compat.h: memcpy_fromfs(b, ca, c); /usr/src/sys/contrib/ipfilter/netinet/ip_compat.h: memcpy_fromfs(ca, b, sizeof(ca)); # Where does memcpy_fromfs come from? The blind fix I have attached seems to work on my system... uname -a FreeBSD kutelo1.bg.datamax 5.4-STABLE FreeBSD 5.4-STABLE #15: Tue Sep 6 21:31:35 EEST 2005 root@kutelo1.bg.datamax:/usr/src/sys/i386/compile/KUTELO1 i386 --- info.16 begins here --- Dump header from device /dev/ad0s1b Architecture: i386 Architecture Version: 16777216 Dump Length: 134152192B (127 MB) Blocksize: 512 Dumptime: Tue Sep 6 20:36:12 2005 Hostname: kutelo1.bg.datamax Magic: FreeBSD Kernel Dump Version String: FreeBSD 5.4-STABLE #12: Fri Sep 2 22:23:43 EEST 2005 root@kutelo1.bg.datamax:/usr/src/sys/i386/compile/KUTELO1 Panic String: mutex "ipf state entry" 0xc14f18fc already initialized Dump Parity: 3483033950 Bounds: 16 Dump Status: good --- info.16 ends here --- --- dmesg begins here --- Copyright (c) 1992-2005 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD 5.4-STABLE #15: Tue Sep 6 21:31:35 EEST 2005 root@kutelo1.bg.datamax:/usr/src/sys/i386/compile/KUTELO1 WARNING: WITNESS option enabled, expect reduced performance. Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: AMD Duron(tm) processor (800.04-MHz 686-class CPU) Origin = "AuthenticAMD" Id = 0x631 Stepping = 1 Features=0x183f9ff AMD Features=0xc0440000 real memory = 134152192 (127 MB) avail memory = 121413632 (115 MB) npx0: on motherboard npx0: INT 16 interface acpi0: on motherboard acpi0: Power Button (fixed) Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0 cpu0: on acpi0 acpi_throttle0: on cpu0 acpi_button0: on acpi0 acpi_button1: on acpi0 pcib0: port 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0 pci0: on pcib0 agp0: mem 0xe0000000-0xe3ffffff at device 0.0 on pci0 pcib1: at device 1.0 on pci0 pci1: on pcib1 pci1: at device 0.0 (no driver attached) isab0: at device 7.0 on pci0 isa0: on isab0 atapci0: port 0xe000-0xe00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 7.1 on pci0 atapci0: Correcting VIA config for southbridge data corruption bug ata0: channel #0 on atapci0 ata1: channel #1 on atapci0 uhci0: port 0xe400-0xe41f irq 12 at device 7.2 on pci0 usb0: on uhci0 usb0: USB revision 1.0 uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered uhci1: port 0xe800-0xe81f irq 12 at device 7.3 on pci0 usb1: on uhci1 usb1: USB revision 1.0 uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub1: 2 ports with 2 removable, self powered pci0: at device 7.4 (no driver attached) rl0: port 0xec00-0xecff mem 0xe7000000-0xe70000ff irq 11 at device 13.0 on pci0 miibus0: on rl0 rlphy0: on miibus0 rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto rl0: Ethernet address: 00:c0:26:5e:87:f2 fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0 atkbdc0: port 0x64,0x60 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 orm0: at iomem 0xc0000-0xc7fff on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> sio0: configured irq 4 not in bitmap of probed irqs 0 sio0: port may not be enabled sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0 sio0: type 8250 or not responding sio1: configured irq 3 not in bitmap of probed irqs 0 sio1: port may not be enabled vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 ums0: Logitech USB-PS/2 Optical Mouse, rev 2.00/11.10, addr 2, iclass 3/1 ums0: 3 buttons and Z dir. Timecounter "TSC" frequency 800036114 Hz quality 800 Timecounters tick every 20.000 msec IP Filter: v3.4.35 initialized. Default = block all, Logging = enabled witness_get: witness exhausted ipfw2 initialized, divert enabled, rule-based forwarding disabled, default to accept, logging unlimited ad0: 38166MB [77545/16/63] at ata0-master UDMA100 acd0: CDROM at ata1-master PIO4 Mounting root from ufs:/dev/ad0s1a --- dmesg ends here --- --- ip_state.c.diff begins here --- --- sys/contrib/ipfilter/netinet/ip_state.c.orig Tue Sep 6 22:13:21 2005 +++ sys/contrib/ipfilter/netinet/ip_state.c Tue Sep 6 22:13:44 2005 @@ -464,9 +464,7 @@ frentry_t *fr; char *name; - error = IRCOPYPTR(data, (caddr_t)&ips, sizeof(ips)); - if (error) - return error; + bcopy(data, &ips, sizeof(ips)); KMALLOC(isn, ipstate_t *); if (isn == NULL) --- ip_state.c.diff ends here ---