Date:      Sun, 20 Jun 2010 12:42:33 +1000
From:      Lawrence Stewart <lstewart@freebsd.org>
To:        Fabian Keil <freebsd-listen@fabiankeil.de>
Cc:        freebsd-current@freebsd.org
Subject:   Re: [CFT] SIFTR - Statistical Information For TCP Research: Uncle Lawrence needs YOU!
Message-ID:  <4C1D8019.5060206@freebsd.org>
In-Reply-To: <20100619195823.53a7baaa@r500.local>
References:  <4C1492D0.6020704@freebsd.org> <4C1C3922.2050102@freebsd.org> <20100619195823.53a7baaa@r500.local>

Hi Fabian,

Thank you for the report. This is indeed an issue I've never seen 
before and exactly the sort of thing I wanted to uncover.

On 06/20/10 03:58, Fabian Keil wrote:
> Lawrence Stewart<lstewart@freebsd.org>  wrote:
>
>> On 06/13/10 18:12, Lawrence Stewart wrote:
>
>>> The time has come to solicit some external testing for my SIFTR tool.
>>> I'm hoping to commit it within a week or so unless problems are discovered.
>
>>> I'm interested in all feedback and reports of success/failure, along
>>> with details of the architecture tested and number of CPUs if you would
>>> be so kind.
>
> I got the following hand-transcribed panic maybe a second after
> sysctl net.inet.siftr.enabled=1
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> [...]
> current process = 12 (swi4: clock)
> [ thread pid 12 tid 100006 ]
> Stopped at	siftr_chkpkt+0xd0:	addq	$0x1,0x8(%r14)
> db>  where
> Tracing pid 12 tid 100006 td 0xffffff00034037e0
> siftr_chkpt() at siftr_chkpkt+0xd0
> pfil_run_hooks() at pfil_run_hooks+0xb4
> ip_output() at ip_output+0x382
> tcp_output() tcp_output+0xa41
> tcp_timer_rexmt() at tcp_timer_rexmt+0x251
> softclock() at softclock+0x291
> intr_event_execute_handlers() at intr_event_execute_handlers+0x66
> ithread_loop at ithread_loop+0x8e
> fork_exit() at fork_exit+0x112
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff800003ad30, rbp = 0 ---

Hmm, I'd love to know which line of code siftr_chkpkt+0xd0 maps to. Let 
me read through the function carefully and see if I can spot an obvious 
NULL pointer dereference. The hook function has received some major 
rototilling of late to get it ready for the import, so I must have 
missed something.
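
The faulting instruction, addq $0x1,0x8(%r14), is a 64-bit increment of
a field 8 bytes into whatever %r14 points at, which is the shape a
counter bump through a NULL or stale struct pointer would take. A
minimal sketch of that pattern, assuming a hypothetical struct layout
(the struct, field and function names below are illustrative only and
are not taken from the SIFTR source):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout for illustration only; NOT the real SIFTR
 * structure. The only property borrowed from the panic is that a
 * 64-bit field sitting at offset 8 is being incremented.
 */
struct hypothetical_stats {
	uint64_t first_field;	/* offset 0 */
	uint64_t pkt_counter;	/* offset 8 */
};

/* Check the assumed layout at compile time. */
static_assert(offsetof(struct hypothetical_stats, pkt_counter) == 8,
    "pkt_counter is expected at offset 8");

/*
 * Increment the counter only if the pointer is valid.
 * Returns 0 on success, -1 if ss is NULL. Without the NULL check, the
 * increment below would typically become a single
 * "addq $0x1,0x8(%reg)" on amd64 and would page fault when ss is NULL.
 */
static int
bump_counter(struct hypothetical_stats *ss)
{
	if (ss == NULL)
		return (-1);
	ss->pkt_counter++;
	return (0);
}
```

If the real code follows this shape, the thing to look for in
siftr_chkpkt() is a counter increment through a pointer that can be
NULL or not yet initialised at the time the hook runs.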

> This is from the third attempt, the second time I got a different
> backtrace that also contained some *_iwn_* functions, the first
> time I had X running, so I didn't get anything. Unfortunately
> at that point the system seems to be too busted to dump core.

Typically, packets are direct dispatched into the stack from the driver 
so it is normal to see driver functions in a thread's stack trace when 
it's executing in the siftr pfil hook.

> I'm using:
> FreeBSD 9.0-CURRENT #99 r+b768fe1: Sat Jun 19 15:01:37 CEST 2010
>      fk@r500.local:/usr/obj/usr/src/sys/ZOEY amd64
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Duo CPU     T5870  @ 2.00GHz (1995.01-MHz K8-class CPU)
>    Origin = "GenuineIntel"  Id = 0x6fd  Family = 6  Model = f  Stepping = 13
>    Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>    Features2=0xe39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM>
>    AMD Features=0x20100800<SYSCALL,NX,LM>
>    AMD Features2=0x1<LAHF>
>    TSC: P-state invariant
> real memory  = 2147483648 (2048 MB)
> avail memory = 1976610816 (1885 MB)
> ACPI APIC Table:<LENOVO TP-7Y>
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
>   cpu0 (BSP): APIC ID:  0
>   cpu1 (AP): APIC ID:  1
> ioapic0: Changing APIC ID to 1
> ioapic0<Version 2.0>  irqs 0-23 on motherboard
>
> I'm not using vanilla sources, but none of the modifications
> should matter here.

Yes, this looks like an issue with the siftr code itself rather than 
with your sources. Don't bother testing with GENERIC yet, as I'm 
confident you've given me enough info to track this down.

> I have powerd running and did not yet try without it.
>
> The system has bge0 and iwn0, but bge0 is mainly down.
>
> pf is compiled into the kernel, siftr is loaded as a module.
>
> The panic seems to occur without logging a single packet first:
> fk@r500 ~ $cat /var/log/siftr.log
> enable_time_secs=1276966161     enable_time_usecs=945080        siftrver=1.2.3  hz=100  tcp_rtt_scale=32        sysname=FreeBSD sysver=900014   ipmode=4
> enable_time_secs=1276966586     enable_time_usecs=314023        siftrver=1.2.3  hz=100  tcp_rtt_scale=32        sysname=FreeBSD sysver=900014   ipmode=4
>
> I get the impression that this is reproducible, but only tried
> three times (the last time with everything mounted read-only).

Thanks again for the report and I'll be in touch as soon as I get a 
chance to look at it some more (hopefully later today).

Cheers,
Lawrence