Date: Sun, 20 Jun 2010 12:42:33 +1000
From: Lawrence Stewart <lstewart@freebsd.org>
To: Fabian Keil <freebsd-listen@fabiankeil.de>
Cc: freebsd-current@freebsd.org
Subject: Re: [CFT] SIFTR - Statistical Information For TCP Research: Uncle Lawrence needs YOU!
Message-ID: <4C1D8019.5060206@freebsd.org>
In-Reply-To: <20100619195823.53a7baaa@r500.local>
References: <4C1492D0.6020704@freebsd.org> <4C1C3922.2050102@freebsd.org> <20100619195823.53a7baaa@r500.local>

Hi Fabian,

Thank you for the report. This is indeed an issue I've never seen before
and exactly the sort of thing I wanted to uncover.

On 06/20/10 03:58, Fabian Keil wrote:
> Lawrence Stewart<lstewart@freebsd.org> wrote:
>
>> On 06/13/10 18:12, Lawrence Stewart wrote:
>
>>> The time has come to solicit some external testing for my SIFTR tool.
>>> I'm hoping to commit it within a week or so unless problems are discovered.
>
>>> I'm interested in all feedback and reports of success/failure, along
>>> with details of the architecture tested and number of CPUs if you would
>>> be so kind.
>
> I got the following hand-transcribed panic maybe a second after
> sysctl net.inet.siftr.enabled=1
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> [...]
> current process = 12 (swi4: clock)
> [ thread pid 12 tid 100006 ]
> Stopped at siftr_chkpkt+0xd0: addq $0x1,0x8(%r14)
> db> where
> Tracing pid 12 tid 100006 td 0xffffff00034037e0
> siftr_chkpt() at siftr_chkpkt+0xd0
> pfil_run_hooks() at pfil_run_hooks+0xb4
> ip_output() at ip_output+0x382
> tcp_output() tcp_output+0xa41
> tcp_timer_rexmt() at tcp_timer_rexmt+0x251
> softclock() at softclock+0x291
> intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> ithread_loop at ithread_loop+0x8e
> fork_exit() at fork_exit+0x112
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff800003ad30, rbp = 0 ---

Hmm, I'd love to know which line of code siftr_chkpkt+0xd0 maps to. Let me
read through the function carefully and see if I can spot an obvious null
ptr deref. The hook function has received some major rototilling of late
to get it ready for the import, so I must have missed something.

> This is from the third attempt, the second time I got a different
> backtrace that also contained some *_iwn_* functions, the first
> time I had X running, so I didn't get anything. Unfortunately
> at that point the system seems to be too busted to dump core.

Typically, packets are direct dispatched into the stack from the driver so
it is normal to see driver functions in a thread's stack trace when it's
executing in the siftr pfil hook.

> I'm using:
> FreeBSD 9.0-CURRENT #99 r+b768fe1: Sat Jun 19 15:01:37 CEST 2010
> fk@r500.local:/usr/obj/usr/src/sys/ZOEY amd64
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Duo CPU T5870 @ 2.00GHz (1995.01-MHz K8-class CPU)
> Origin = "GenuineIntel" Id = 0x6fd Family = 6 Model = f Stepping = 13
> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> Features2=0xe39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM>
> AMD Features=0x20100800<SYSCALL,NX,LM>
> AMD Features2=0x1<LAHF>
> TSC: P-state invariant
> real memory = 2147483648 (2048 MB)
> avail memory = 1976610816 (1885 MB)
> ACPI APIC Table:<LENOVO TP-7Y>
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
> ioapic0: Changing APIC ID to 1
> ioapic0<Version 2.0> irqs 0-23 on motherboard
>
> I'm not using vanilla sources, but none of the modifications
> should matter here.

Yes this does not look like an issue with your sources but with the siftr
code itself. Don't bother testing with GENERIC yet as I'm confident you've
given me enough info to track this down.

> I have powerd running and did not yet try without it.
>
> The system has bge0 and iwn0, but bge0 is mainly down.
>
> pf is compiled into the kernel, siftr is loaded as a module.
>
> The panic seems to occur without logging a single packet first:
> fk@r500 ~ $cat /var/log/siftr.log
> enable_time_secs=1276966161 enable_time_usecs=945080 siftrver=1.2.3 hz=100 tcp_rtt_scale=32 sysname=FreeBSD sysver=900014 ipmode=4
> enable_time_secs=1276966586 enable_time_usecs=314023 siftrver=1.2.3 hz=100 tcp_rtt_scale=32 sysname=FreeBSD sysver=900014 ipmode=4
>
> I get the impression that this is reproducible, but only tried
> three times (the last time with everything mounted read-only).

Thanks again for the report and I'll be in touch as soon as I get a chance
to look at it some more (hopefully later today).

Cheers,
Lawrence
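
For anyone else sending in a report, a quick way to confirm the "no packets
logged before the panic" observation is to count the enable lines against
everything else in the log. The following is only a rough sketch, not part
of SIFTR: the function names are made up for illustration, and it assumes
nothing about the log beyond what the quoted lines show, namely that an
enable line starts with "enable_time_secs=" and consists of space-separated
key=value pairs. Every other non-blank line is simply counted as "other".

```python
def parse_enable_line(line):
    """Split one enable line of space-separated key=value pairs into a dict.

    Hypothetical helper; the field names come from the log line itself.
    """
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore any token that is not key=value
            fields[key] = value
    return fields


def summarize_log(lines):
    """Return (enable_lines, other_lines), skipping blank lines.

    Assumption: an enable line starts with "enable_time_secs=".
    """
    enables = 0
    others = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("enable_time_secs="):
            enables += 1
        else:
            others += 1
    return enables, others


# The two lines quoted in the report above.
log_lines = [
    "enable_time_secs=1276966161 enable_time_usecs=945080 siftrver=1.2.3 "
    "hz=100 tcp_rtt_scale=32 sysname=FreeBSD sysver=900014 ipmode=4",
    "enable_time_secs=1276966586 enable_time_usecs=314023 siftrver=1.2.3 "
    "hz=100 tcp_rtt_scale=32 sysname=FreeBSD sysver=900014 ipmode=4",
]

fields = parse_enable_line(log_lines[-1])
print(fields["siftrver"], fields["hz"], fields["ipmode"])  # 1.2.3 100 4
print(summarize_log(log_lines))  # (2, 0)
```

A result of (2, 0) for the quoted log means two enable events and nothing
after either of them, which matches the description of the panic hitting
before the first packet record was written.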