Date: Wed, 7 Jul 2004 23:47:30 -0400 (EDT)
From: Robert Watson <rwatson@freebsd.org>
To: Wiktor Niesiobedzki <bsd@w.evip.pl>
Cc: current@freebsd.org
Subject: Re: LORs with ipfw
Message-ID: <Pine.NEB.3.96L.1040707234245.56368C-100000@fledge.watson.org>
In-Reply-To: <20040707214417.GF26768@mail.evip.pl>

On Wed, 7 Jul 2004, Wiktor Niesiobedzki wrote:

> lock order reversal
>  1st 0xc07287c8 IPFW static rules (IPFW static rules) @ /usr/src/sys/netinet/ip_fw2.c:1828
>  2nd 0xc065cfcc tcp (tcp) @ /usr/src/sys/netinet/ip_fw2.c:1574
> Stack backtrace:
> backtrace(c05ec5a7,c065cfcc,c05ec12e,c05ec12e,c0726a3c) at backtrace+0x17
> witness_checkorder(c065cfcc,9,c0726a3c,626,806) at witness_checkorder+0x678
> _mtx_lock_flags(c065cfcc,0,c0726a3c,626,0) at _mtx_lock_flags+0x80
> check_uidgid(c15610a4,6,0,e08d1f53,1bd) at check_uidgid+0xd3
> ipfw_chk(cb9b6bf4,cb9b6c48,c1189014,1,0) at ipfw_chk+0x9e2
> ip_input(c1395c00,0,c071c576,1d0,0) at ip_input+0x375
> transmit_event(c1510c00,0,c071c576,300,2) at transmit_event+0x14b
> dummynet(0,0,c05ea27a,f6,1) at dummynet+0x1a9
> softclock(0,0,c05e6b67,263,c0631d40) at softclock+0x1aa
> ithread_loop(c10dd500,cb9b6d48,c05e695e,327,c10dd500) at ithread_loop+0x172
> fork_exit(c04a5b80,c10dd500,cb9b6d48) at fork_exit+0xbc
> fork_trampoline() at fork_trampoline+0x8
>
> This is from yesterdays CURRENT. I have compiled kernel with
> CPUTYPE=athlon-xp and CFLAGS=-O2. Currently I'm not able to reproduce
> this messages with CPUTYPE=i686 and empty CFLAGS.
>
> Does anyone has an clue, where the problem may lie here (or is it just
> harmless?)

This is a warning about a potentially harmful, but somewhat harder to fix
issue. Basically, we currently have what amounts to a subsystem or giant
lock over the ipfw rule set and its evaluation. Normally, the ipfw lock
will fall "after" most other locks, including protocol control block (pcb)
locks, as it will be called from other protocol code during processing.
However, when using a uid/gid rule, the protocol control block for the
packet is looked up by the ipfw code, which acquires pcb locks after the
ipfw lock.
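
To make the two acquisition orders concrete, here is a toy order checker,
loosely in the spirit of the WITNESS report quoted above. It is a sketch
and not kernel code: the lock classes, toy_acquire(), toy_release(), and
demo_reversal() are hypothetical names invented for illustration, and
everything runs single-threaded in userland.

```c
#include <assert.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { LC_IPFW, LC_TCP_PCB, LC_COUNT };

#define MAX_HELD 8

/* order_seen[a][b] != 0 means "a was held while b was acquired". */
static int order_seen[LC_COUNT][LC_COUNT];

/* Toy stack of lock classes currently held (one thread only). */
static enum lock_class held[MAX_HELD];
static int nheld;

/* Record an acquisition; return 1 if it reverses an order seen earlier. */
static int toy_acquire(enum lock_class lc)
{
    int reversal = 0;

    assert(nheld < MAX_HELD);
    for (int i = 0; i < nheld; i++) {
        if (order_seen[lc][held[i]])
            reversal = 1;               /* opposite order was recorded */
        order_seen[held[i]][lc] = 1;    /* remember this order */
    }
    held[nheld++] = lc;
    return reversal;
}

/* Drop the most recently acquired lock class. */
static void toy_release(void)
{
    if (nheld > 0)
        nheld--;
}

/* Walk both paths described above; return 1 if a reversal was flagged. */
static int demo_reversal(void)
{
    int r = 0;

    /* Usual path: protocol code holds the pcb lock, then enters ipfw. */
    r |= toy_acquire(LC_TCP_PCB);
    r |= toy_acquire(LC_IPFW);
    toy_release();
    toy_release();

    /* uid/gid rule path: ipfw lock held, then the pcb lookup locks tcp. */
    r |= toy_acquire(LC_IPFW);
    r |= toy_acquire(LC_TCP_PCB);
    toy_release();
    toy_release();

    return r;
}
```

Only the second path trips the check, because by then the opposite order
(pcb before ipfw) has already been recorded.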

There are a few things to think about here:

(1) This lock order reversal is really a result of a layering violation --
    the ipfw code is acting on packets at the IP layer, and looking up the
    connection from the IP layer results in cross-layer transitions that
    don't fit the general model.

(2) The lock order reversal occurs in a situation where a race condition
    also occurs -- the pcb may actually be looked up twice for inbound
    packets, once in ipfw, and then again for delivery. While it's
    somewhat unlikely, the pcb could change in that window. The window is
    stretched out through the use of functionality like dummynet.

(3) One way to think about fixing this is to avoid the need to hold the
    ipfw lock across the entire execution of ipfw. I've been thinking
    about reference-counting the rule set, such that each instance of a
    thread entering the ipfw code sees the rule set as read-only and can
    access it lock-free once it has acquired a reference, releasing the
    reference on exit. For long rule sets, this would help reduce
    contention. You can imagine various variations on the model, such as
    per-cpu rule set instances, etc. There are some interesting
    challenges in dynamic state management, however.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research
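
A minimal userland sketch of the reference-counting idea in point (3),
using C11 atomics. This is not code from the FreeBSD tree: struct ruleset,
ruleset_new(), ruleset_acquire(), ruleset_release(), ruleset_publish(),
and the rulesets_freed counter are all hypothetical. It assumes a rule set
is immutable once published, and it uses a small spin flag that is held
only while the current pointer is read or swapped, never during rule
evaluation. It does not address dynamic state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical rule set; treated as read-only once published. */
struct ruleset {
    atomic_int refs;    /* one for "current", plus one per active reader */
    int        nrules;  /* stand-in for the actual rules */
};

static atomic_flag swap_lock = ATOMIC_FLAG_INIT;  /* held only briefly */
static struct ruleset *current_rs;                /* guarded by swap_lock */
static atomic_int rulesets_freed;                 /* for demonstration */

static void lock_swap(void)
{
    while (atomic_flag_test_and_set(&swap_lock))
        ;   /* spin; the critical sections below are a few instructions */
}

static void unlock_swap(void)
{
    atomic_flag_clear(&swap_lock);
}

static struct ruleset *ruleset_new(int nrules)
{
    struct ruleset *rs = malloc(sizeof(*rs));

    assert(rs != NULL);
    atomic_init(&rs->refs, 1);   /* reference owned by the publisher */
    rs->nrules = nrules;
    return rs;
}

/* Drop one reference; the last one out frees the rule set. */
static void ruleset_release(struct ruleset *rs)
{
    if (atomic_fetch_sub(&rs->refs, 1) == 1) {
        free(rs);
        atomic_fetch_add(&rulesets_freed, 1);
    }
}

/* Reader entry: take a reference, then evaluate without any lock held. */
static struct ruleset *ruleset_acquire(void)
{
    struct ruleset *rs;

    lock_swap();
    rs = current_rs;
    if (rs != NULL)
        atomic_fetch_add(&rs->refs, 1);
    unlock_swap();
    return rs;
}

/* Writer: install a new rule set and drop the old one's reference. */
static void ruleset_publish(struct ruleset *new_rs)
{
    struct ruleset *old;

    lock_swap();
    old = current_rs;
    current_rs = new_rs;
    unlock_swap();
    if (old != NULL)
        ruleset_release(old);
}
```

A reader that called ruleset_acquire() before a ruleset_publish() keeps
seeing the old rules until it calls ruleset_release(), at which point the
old rule set is freed. Because no lock is held while the rules are
evaluated, a pcb lock taken during evaluation would not nest under one in
this sketch.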