Date: Sun, 20 Jun 2010 20:27:34 +1000
From: Lawrence Stewart <lstewart@freebsd.org>
To: Fabian Keil <freebsd-listen@fabiankeil.de>
Cc: freebsd-current@freebsd.org
Subject: Re: [CFT] SIFTR - Statistical Information For TCP Research: Uncle Lawrence needs YOU!
Message-ID: <4C1DED16.8020209@freebsd.org>
In-Reply-To: <20100619195823.53a7baaa@r500.local>
References: <4C1492D0.6020704@freebsd.org> <4C1C3922.2050102@freebsd.org> <20100619195823.53a7baaa@r500.local>

Hi Fabian,

On 06/20/10 03:58, Fabian Keil wrote:
> Lawrence Stewart<lstewart@freebsd.org> wrote:
>
>> On 06/13/10 18:12, Lawrence Stewart wrote:
>
>>> The time has come to solicit some external testing for my SIFTR tool.
>>> I'm hoping to commit it within a week or so unless problems are discovered.
>
>>> I'm interested in all feedback and reports of success/failure, along
>>> with details of the architecture tested and number of CPUs if you would
>>> be so kind.
>
> I got the following hand-transcribed panic maybe a second after
> sysctl net.inet.siftr.enabled=1
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> [...]
> current process = 12 (swi4: clock)
> [ thread pid 12 tid 100006 ]
> Stopped at siftr_chkpkt+0xd0: addq $0x1,0x8(%r14)
> db> where
> Tracing pid 12 tid 100006 td 0xffffff00034037e0
> siftr_chkpt() at siftr_chkpkt+0xd0
> pfil_run_hooks() at pfil_run_hooks+0xb4
> ip_output() at ip_output+0x382
> tcp_output() tcp_output+0xa41
> tcp_timer_rexmt() at tcp_timer_rexmt+0x251
> softclock() at softclock+0x291
> intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> ithread_loop at ithread_loop+0x8e
> fork_exit() at fork_exit+0x112
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff800003ad30, rbp = 0 ---

So I've tracked down the line of code where the page fault is occurring:

	if (dir == PFIL_IN)
		ss->n_in++;
	else
		ss->n_out++;

ss is a DPCPU (dynamic per-cpu) variable used to keep a set of stats
per-cpu and is initialised at the start of the function like so:

	ss = DPCPU_PTR(ss);

So for ss to be NULL, that implies DPCPU_PTR() is returning NULL on your
machine. I know very little about the inner workings of the DPCPU_*
macros, but I'm pretty sure the way I use them in SIFTR is correct or at
least as intended.

Could you please go ahead and retest using a GENERIC kernel and see if
you can reproduce?
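
For readers unfamiliar with the pattern, here is a rough userspace sketch of per-CPU stats counters of the kind described above. It is not the SIFTR source and does not use the real DPCPU_* macros: every sketch_* and SKETCH_* name is hypothetical, the plain array stands in for the kernel's per-CPU storage, and the struct layout (two 64-bit counters at the start) is an assumption. Under that assumption n_out sits at offset 8, which would be consistent with the 0x8(%r14) displacement in the faulting instruction and with the outbound ip_output() path in the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NCPU    4	/* hypothetical CPU count for this sketch */
#define SKETCH_DIR_IN  1	/* stand-in for PFIL_IN */
#define SKETCH_DIR_OUT 2	/* stand-in for PFIL_OUT */

/* Assumed layout: two 64-bit packet counters at the start of the struct. */
struct sketch_stats {
	uint64_t n_in;
	uint64_t n_out;
};

/* One private copy per CPU, so the packet path needs no lock. */
struct sketch_stats sketch_percpu[SKETCH_NCPU];

/* Userspace stand-in for DPCPU_PTR(): return the given CPU's copy. */
struct sketch_stats *
sketch_stats_ptr(unsigned int cpu)
{
	assert(cpu < SKETCH_NCPU);
	return (&sketch_percpu[cpu]);
}

/* Same shape as the faulting statement: bump one counter per packet. */
void
sketch_count_pkt(unsigned int cpu, int dir)
{
	struct sketch_stats *ss = sketch_stats_ptr(cpu);

	if (dir == SKETCH_DIR_IN)
		ss->n_in++;
	else
		ss->n_out++;
}

/* Sum the per-CPU copies when a system-wide total is wanted. */
uint64_t
sketch_total_out(void)
{
	uint64_t total = 0;

	for (unsigned int i = 0; i < SKETCH_NCPU; i++)
		total += sketch_percpu[i].n_out;
	return (total);
}
```

Calling sketch_count_pkt(1, SKETCH_DIR_OUT) increments only CPU 1's copy. In the kernel the pointer is derived from per-CPU base offsets rather than a simple array index, so a wrong offset would hand back a bad pointer and the increment would fault in the way the trace shows.
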
There could be something in your custom kernel causing the offsets or
linker set magic used by the DPCPU bits to break, which in turn is
triggering this panic in SIFTR. Whether it's your custom changes breaking
DPCPU or DPCPU being fragile remains to be seen, but the good news for me
is that it looks like SIFTR is off the hook :)

Cheers,
Lawrence