From owner-freebsd-net@FreeBSD.ORG Mon Oct 13 22:30:55 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 365689C5; Mon, 13 Oct 2014 22:30:55 +0000 (UTC) Received: from mail-wg0-x22f.google.com (mail-wg0-x22f.google.com [IPv6:2a00:1450:400c:c00::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9148DA00; Mon, 13 Oct 2014 22:30:54 +0000 (UTC) Received: by mail-wg0-f47.google.com with SMTP id x13so9694349wgg.30 for ; Mon, 13 Oct 2014 15:30:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=yNW2IzjEXw2GOqFP1+6ldI8TdghfExZgsSW4wMYwvAg=; b=cMffUNiKUuuYbMj69VJwrFlGIupnk6quAy9HMovyMZ5zEdLz+oJ4istkIAguF20dV7 ccnR/7rjOJDOtQihwjhAcXXafxyvOho+9Cf2sMoC2mh0V0/x6EGaZXx+J88PyhlfzA06 lP/FrahPnJU9zkASKpOxZhD+J9qyvlrTqkMVhIMjFzwbmAlTVaHxXphudEqisJUDv4oF WAm+00I9BZKbHenvXoZPlJxq98ND0kx4Zc8t+wXiLNM6gqYh6wIh+HYUlGfM83k6eeko YDC/taiGrZ/NpDyMcoJSYw5iQ0WNPLlmdkZsjJZCEjdi5NXjOo7k9JJFVRMsgPEv7AdE 2ldQ== MIME-Version: 1.0 X-Received: by 10.181.27.161 with SMTP id jh1mr1529783wid.75.1413239452673; Mon, 13 Oct 2014 15:30:52 -0700 (PDT) Received: by 10.217.67.201 with HTTP; Mon, 13 Oct 2014 15:30:52 -0700 (PDT) In-Reply-To: References: <1410203348.1343.1.camel@bruno> <2077446.sYcZo9xEXb@ralph.baldwin.cx> <3567780.Mf6fMnzmYG@ralph.baldwin.cx> Date: Mon, 13 Oct 2014 15:30:52 -0700 Message-ID: Subject: Re: ixgbe(4) spin lock held too long From: Jason Wolfe To: John Baldwin Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: Sean Bruno , freebsd-net@freebsd.org X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Oct 2014 22:30:55 -0000 On Fri, Oct 10, 2014 at 11:19 PM, Jason Wolfe wrote: > On Fri, Oct 10, 2014 at 8:53 AM, John Baldwin wrote: > >> On Thursday, October 09, 2014 02:31:32 PM Jason Wolfe wrote: >> > On Wed, Oct 8, 2014 at 12:29 PM, John Baldwin wrote: >> > > My only other thought is if a direct timeout routine ran for a long >> time. >> > > >> > > I just committed a change to current that can let you capture KTR >> traces >> > > of >> > > callout routines for use with schedgraph (r272757). Unfortunately, >> > > enabling KTR_SCHED can be slow. If you are up for trying it, I do >> think >> > > that >> > > building a kernel with KTR and KTR_SCHED enabled >> (KTR_COMPILE=KTR_SCHED >> > > and >> > > KTR_MASK=KTR_SCHED) with the kernel part of the commit I referenced >> above >> > > applied is the best bet to figure out why it is spinning so long. If >> you >> > > can >> > > try that (and if it doesn't slow things down too much) and get a panic >> > > with >> > > those options enabled, then capture the output of >> > > 'ktrdump -e /path/to/kernel -m /var/crash/vmcore.X -ct', we can use >> > > Src/tools/sched/schedgraph.py to look at that output to get a picture >> of >> > > what >> > > was going on just prior to the crash. >> > > >> > > -- >> > > John Baldwin >> > >> > As luck would have it.. caught one of the boxes with the new KTR >> > code/options after only an hour. Crashed in the same way w tid 100003 >> and >> > looking the same in callout_process >> > >> > spin lock 0xffffffff81262d00 (callout) held by 0xfffff800151fe000 (tid >> > 100003) too long >> > spin lock 0xffffffff81262d00 (callout) held by 0xfffff800151fe000 (tid >> > 100003) too long >> > >> > #4 0xffffffff8070d6fa in callout_process (now=7915682202423) at >> > /usr/src/sys/kern/kern_ >> > timeout.c:490 >> > >> > The ktrdump oddly only seems to have the last 702, though >> debug.ktr.entries >> > is set to 1024. It appears the buffer may also start after 100003 had >> > already hung? I've bumped debug.ktr.entries up in case we don't have >> > enough history here. >> > >> > http://nitrology.com/ktrdump-spinlock.txt >> >> Hmm, schedgraph.py chokes on this, but it's a bit interesting. It seems >> that >> in this time sample, CPUs 1, 2, 4, and 5 were constantly running ixgbe >> interrupt handlers. No actual thread state changes are logged (which is >> why >> schedgraph choked). >> >> Try setting the sysctl 'net.inet.tcp.per_cpu_timers' to 1 and see if that >> makes a difference. I'm guessing they are all contending on the default >> callout lock which is causing your headache. >> >> -- >> John Baldwin >> > > net.inet.tcp.per_cpu_timers=1 triggered some other fun :) Saw this same > panic across a handful of machines: > > panic: tcp_setpersist: retransmit pending > cpuid = 3 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > 0xfffffe1f9e9ab800 > panic() at panic+0x155/frame 0xfffffe1f9e9ab880 > tcp_setpersist() at tcp_setpersist+0xa3/frame 0xfffffe1f9e9ab8b0 > tcp_timer_persist() at tcp_timer_persist+0x176/frame 0xfffffe1f9e9ab8e0 > softclock_call_cc() at softclock_call_cc+0x1ce/frame 0xfffffe1f9e9ab9e0 > softclock() at softclock+0x44/frame 0xfffffe1f9e9aba00 > intr_event_execute_handlers() at intr_event_execute_handlers+0x83/frame > 0xfffffe1f9e9aba30 > ithread_loop() at ithread_loop+0x96/frame 0xfffffe1f9e9aba70 > fork_exit() at fork_exit+0x71/frame 0xfffffe1f9e9abab0 > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe1f9e9abab0 > --- trap 0, rip = 0, rsp = 0xfffffe1f9e9abb70, rbp = 0 --- > > (kgdb) up 3 > #3 0xffffffff808467d3 in tcp_setpersist (tp=) at > /usr/src/sys/netinet/tcp_output.c:1619 > 1619 panic("tcp_setpersist: retransmit pending"); > (kgdb) list > 1614 int t = ((tp->t_srtt >> 2) + tp->t_rttvar) >> 1; > 1615 int tt; > 1616 > 1617 tp->t_flags &= ~TF_PREVVALID; > 1618 if (tcp_timer_active(tp, TT_REXMT)) > 1619 panic("tcp_setpersist: retransmit pending"); > 1620 /* > 1621 * Start/restart persistance timer. > 1622 */ > 1623 TCPT_RANGESET(tt, t * tcp_backoff[tp->t_rxtshift], > > I have debug.ktr.entries set to 204800 on the machines compiled with KTR > options, maybe a bit more history can provide some extra context. > > Jason > Picked up a KTR dump with 52k entries spin lock 0xffffffff81262d00 (callout) held by 0xfffff81074dde000 (tid 100289) too long (kgdb) tid 100289 [Switching to thread 241 (Thread 100289)]#0 0xffffffff80a71dc8 in cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1432 1432 savectx(&stoppcbs[cpu]); (kgdb) bt #0 0xffffffff80a71dc8 in cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1432 #1 0xffffffff80a71d8f in ipi_nmi_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1417 #2 0xffffffff80a8038a in trap (frame=0xffffffff811ef130) at /usr/src/sys/amd64/amd64/trap.c:190 #3 0xffffffff80a67ae3 in nmi_calltrap () at /usr/src/sys/amd64/amd64/exception.S:505 #4 0xffffffff8070d6fe in callout_process (now=1461472573876317) at /usr/src/sys/kern/kern_timeout.c:490 http://nitrology.com/ktrdump-spinlock2.txt Jason