From: Lawrence Stewart
Date: Mon, 21 Jun 2010 11:32:41 +1000
To: Fabian Keil
Cc: freebsd-current@freebsd.org
Subject: Re: [CFT] SIFTR - Statistical Information For TCP Research: Uncle Lawrence needs YOU!
Message-ID: <4C1EC139.3090903@freebsd.org>
In-Reply-To: <20100620161242.59381341@r500.local>
List-Id: Discussions about the use of FreeBSD-current

On 06/21/10 00:12, Fabian Keil wrote:
> Fabian Keil wrote:
>
>> Lawrence Stewart wrote:
>>
>>> On 06/20/10 22:28, Fabian Keil wrote:
>
>>>> Taking pf (and altq) out of the picture doesn't seem to make
>>>> a difference.
>>>
>>> Wouldn't have expected it to. Will be very curious to know if the panic
>>> is triggered in GENERIC.
>>
>> It's not. I, too, get pfil.c related LORs though:
>>
>> lock order reversal:
>> 1st 0xffffffff80e5c568 PFil hook read/write mutex (PFil hook read/write mutex) @ /usr/src/sys/net/pfil.c:77
>> 2nd 0xffffffff80e5dd68 udp (udp) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:3035
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>> _witness_debugger() at _witness_debugger+0x2e
>> witness_checkorder() at witness_checkorder+0x81e
>> _rw_rlock() at _rw_rlock+0x5f
>> pf_socket_lookup() at pf_socket_lookup+0x1c5
>> pf_test_udp() at pf_test_udp+0x8b0
>> pf_test() at pf_test+0x1089
>> pf_check_in() at pf_check_in+0x39
>> pfil_run_hooks() at pfil_run_hooks+0xcf
>> ip_input() at ip_input+0x2ae
>> swi_net() at swi_net+0x151
>> intr_event_execute_handlers() at intr_event_execute_handlers+0x66
>> ithread_loop() at ithread_loop+0xb2
>> fork_exit() at fork_exit+0x12a
>> fork_trampoline() at fork_trampoline+0xe
>> --- trap 0, rip = 0, rsp = 0xffffff8000044d30, rbp = 0 ---
>> lock order reversal:
>> 1st 0xffffffff80e5c568 PFil hook read/write mutex (PFil hook read/write mutex) @ /usr/src/sys/net/pfil.c:77
>> 2nd 0xffffffff80e5d788 tcp (tcp) @ /usr/src/sys/modules/siftr/../../netinet/siftr.c:698
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>> _witness_debugger() at _witness_debugger+0x2e
>> witness_checkorder() at witness_checkorder+0x81e
>> _rw_rlock() at _rw_rlock+0x5f
>> siftr_chkpkt() at siftr_chkpkt+0x3c4
>> pfil_run_hooks() at pfil_run_hooks+0xcf
>> ip_input() at ip_input+0x2ae
>> swi_net() at swi_net+0x151
>> intr_event_execute_handlers() at intr_event_execute_handlers+0x66
>> ithread_loop() at ithread_loop+0xb2
>> fork_exit() at fork_exit+0x12a
>> fork_trampoline() at fork_trampoline+0xe
>> --- trap 0, rip = 0, rsp = 0xffffff8000044d30, rbp = 0 ---
>>
>> My custom kernel normally doesn't have INVARIANTS and WITNESS
>> enabled, so I'll try to enable them next.
>
> The culprit seem to be non-default KTR settings in the kernel
> while loading alq as a module. With the following change siftr
> works with my non-GENERIC kernel, too:
>
> commit f43b8b5171c858df7b419f6a695e9e3b53531a8e
> Author: Fabian Keil
> Date: Sun Jun 20 15:43:01 2010 +0200
>
>     Disable KTR changes.
>
> diff --git a/sys/amd64/conf/ZOEY b/sys/amd64/conf/ZOEY
> index 6fb3480..c584317 100644
> --- a/sys/amd64/conf/ZOEY
> +++ b/sys/amd64/conf/ZOEY
> @@ -16,11 +16,11 @@ options ATA_CAM
>  device atapicam
>  options SC_KERNEL_CONS_ATTR=(FG_GREEN|BG_BLACK)
>
> -options KTR
> -options KTR_ENTRIES=262144
> -options KTR_COMPILE=(KTR_SCHED)
> -options KTR_MASK=(KTR_SCHED)
> -options KTR_CPUMASK=0x3
> +#options KTR
> +#options KTR_ENTRIES=262144
> +#options KTR_COMPILE=(KTR_SCHED)
> +#options KTR_MASK=(KTR_SCHED)
> +#options KTR_CPUMASK=0x3
>
>  options ACCEPT_FILTER_HTTP
>  makeoptions WITH_CTF=yes

This smells very fishy. Without "options KTR_ALQ", KTR shouldn't even
care if ALQ exists or not. Not only that, but ALQ isn't even used in
siftr_chkpkt and you clearly manage to successfully use ALQ to write
the module load message to the log file. Hmmmm...

Thanks for taking the time to find the culprit though - I'll see if I
can reproduce here.

Could you try another thing for me and see if reducing "options
KTR_ENTRIES=262144" down to a smaller number (maybe 4096?)
and leaving all the other KTR options as they are above (but
uncommented) makes any difference? The ktr(4) man page indicates the
default is 8192 entries and I'm curious if your allocation of so many
additional entries is making something unhappy.

Thanks again for your time helping with this, I really appreciate it.

Cheers,
Lawrence
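
For reference, the test requested above would correspond to a config
fragment roughly like the following: the KTR block from the quoted ZOEY
diff, uncommented again, with only KTR_ENTRIES changed. This is a sketch,
not a verified config; 4096 is just the tentative value floated in the
message, and the rest of ZOEY is assumed to stay as it was.

```
# Sketch only: KTR options from the ZOEY diff, re-enabled.
# KTR_ENTRIES is reduced from 262144 to 4096 (tentative value);
# all other KTR options are left exactly as in the original config.
options KTR
options KTR_ENTRIES=4096
options KTR_COMPILE=(KTR_SCHED)
options KTR_MASK=(KTR_SCHED)
options KTR_CPUMASK=0x3
```

If siftr loads cleanly on a kernel built with this fragment, that would
point at the size of the KTR buffer rather than at KTR being enabled at
all.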