From owner-freebsd-net@FreeBSD.ORG  Mon Oct 13 22:30:55 2014
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 365689C5;
 Mon, 13 Oct 2014 22:30:55 +0000 (UTC)
Received: from mail-wg0-x22f.google.com (mail-wg0-x22f.google.com
 [IPv6:2a00:1450:400c:c00::22f])
 (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 9148DA00;
 Mon, 13 Oct 2014 22:30:54 +0000 (UTC)
Received: by mail-wg0-f47.google.com with SMTP id x13so9694349wgg.30
 for <multiple recipients>; Mon, 13 Oct 2014 15:30:52 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=yNW2IzjEXw2GOqFP1+6ldI8TdghfExZgsSW4wMYwvAg=;
 b=cMffUNiKUuuYbMj69VJwrFlGIupnk6quAy9HMovyMZ5zEdLz+oJ4istkIAguF20dV7
 ccnR/7rjOJDOtQihwjhAcXXafxyvOho+9Cf2sMoC2mh0V0/x6EGaZXx+J88PyhlfzA06
 lP/FrahPnJU9zkASKpOxZhD+J9qyvlrTqkMVhIMjFzwbmAlTVaHxXphudEqisJUDv4oF
 WAm+00I9BZKbHenvXoZPlJxq98ND0kx4Zc8t+wXiLNM6gqYh6wIh+HYUlGfM83k6eeko
 YDC/taiGrZ/NpDyMcoJSYw5iQ0WNPLlmdkZsjJZCEjdi5NXjOo7k9JJFVRMsgPEv7AdE
 2ldQ==
MIME-Version: 1.0
X-Received: by 10.181.27.161 with SMTP id jh1mr1529783wid.75.1413239452673;
 Mon, 13 Oct 2014 15:30:52 -0700 (PDT)
Received: by 10.217.67.201 with HTTP; Mon, 13 Oct 2014 15:30:52 -0700 (PDT)
In-Reply-To: <CAAAm0r01DqqNLUr0Mu15o2RLLdgnKntnFNKrw8TJut8XE7KNOw@mail.gmail.com>
References: <1410203348.1343.1.camel@bruno>
 <2077446.sYcZo9xEXb@ralph.baldwin.cx>
 <CAAAm0r1N=F6wgroVJZx3CR-bzti45=VA==dv1VaVhx1hSufUEQ@mail.gmail.com>
 <3567780.Mf6fMnzmYG@ralph.baldwin.cx>
 <CAAAm0r01DqqNLUr0Mu15o2RLLdgnKntnFNKrw8TJut8XE7KNOw@mail.gmail.com>
Date: Mon, 13 Oct 2014 15:30:52 -0700
Message-ID: <CAAAm0r0+QiZ3avvChrPY=HFPtnOBfTVoGen+PLYz6TaFxc_5Nw@mail.gmail.com>
Subject: Re: ixgbe(4) spin lock held too long
From: Jason Wolfe <nitroboost@gmail.com>
To: John Baldwin <jhb@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1
Cc: Sean Bruno <sbruno@llnw.com>, freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 13 Oct 2014 22:30:55 -0000

On Fri, Oct 10, 2014 at 11:19 PM, Jason Wolfe <nitroboost@gmail.com> wrote:

> On Fri, Oct 10, 2014 at 8:53 AM, John Baldwin <jhb@freebsd.org> wrote:
>
>> On Thursday, October 09, 2014 02:31:32 PM Jason Wolfe wrote:
>> > On Wed, Oct 8, 2014 at 12:29 PM, John Baldwin <jhb@freebsd.org> wrote:
>> > > My only other thought is if a direct timeout routine ran for a long
>> time.
>> > >
>> > > I just committed a change to current that can let you capture KTR
>> traces
>> > > of
>> > > callout routines for use with schedgraph (r272757).  Unfortunately,
>> > > enabling KTR_SCHED can be slow.  If you are up for trying it, I do
>> think
>> > > that
>> > > building a kernel with KTR and KTR_SCHED enabled
>> (KTR_COMPILE=KTR_SCHED
>> > > and
>> > > KTR_MASK=KTR_SCHED) with the kernel part of the commit I referenced
>> above
>> > > applied is the best bet to figure out why it is spinning so long.  If
>> you
>> > > can
>> > > try that (and if it doesn't slow things down too much) and get a panic
>> > > with
>> > > those options enabled, then capture the output of
>> > > 'ktrdump -e /path/to/kernel -m /var/crash/vmcore.X -ct', we can use
>> > > Src/tools/sched/schedgraph.py to look at that output to get a picture
>> of
>> > > what
>> > > was going on just prior to the crash.
>> > >
>> > > --
>> > > John Baldwin
>> >
>> > As luck would have it.. caught one of the boxes with the new KTR
>> > code/options after only an hour.  Crashed in the same way w tid 100003
>> and
>> > looking the same in callout_process
>> >
>> > spin lock 0xffffffff81262d00 (callout) held by 0xfffff800151fe000 (tid
>> > 100003) too long
>> > spin lock 0xffffffff81262d00 (callout) held by 0xfffff800151fe000 (tid
>> > 100003) too long
>> >
>> > #4 0xffffffff8070d6fa in callout_process (now=7915682202423) at
>> > /usr/src/sys/kern/kern_
>> > timeout.c:490
>> >
>> > The ktrdump oddly only seems to have the last 702, though
>> debug.ktr.entries
>> > is set to 1024.  It appears the buffer may also start after 100003 had
>> > already hung?  I've bumped debug.ktr.entries up in case we don't have
>> > enough history here.
>> >
>> > http://nitrology.com/ktrdump-spinlock.txt
>>
>> Hmm, schedgraph.py chokes on this, but it's a bit interesting.  It seems
>> that
>> in this time sample, CPUs 1, 2, 4, and 5 were constantly running ixgbe
>> interrupt handlers.  No actual thread state changes are logged (which is
>> why
>> schedgraph choked).
>>
>> Try setting the sysctl 'net.inet.tcp.per_cpu_timers' to 1 and see if that
>> makes a difference.  I'm guessing they are all contending on the default
>> callout lock which is causing your headache.
>>
>> --
>> John Baldwin
>>
>
> net.inet.tcp.per_cpu_timers=1 triggered some other fun :)  Saw this same
> panic across a handful of machines:
>
> panic: tcp_setpersist: retransmit pending
> cpuid = 3
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfffffe1f9e9ab800
> panic() at panic+0x155/frame 0xfffffe1f9e9ab880
> tcp_setpersist() at tcp_setpersist+0xa3/frame 0xfffffe1f9e9ab8b0
> tcp_timer_persist() at tcp_timer_persist+0x176/frame 0xfffffe1f9e9ab8e0
> softclock_call_cc() at softclock_call_cc+0x1ce/frame 0xfffffe1f9e9ab9e0
> softclock() at softclock+0x44/frame 0xfffffe1f9e9aba00
> intr_event_execute_handlers() at intr_event_execute_handlers+0x83/frame
> 0xfffffe1f9e9aba30
> ithread_loop() at ithread_loop+0x96/frame 0xfffffe1f9e9aba70
> fork_exit() at fork_exit+0x71/frame 0xfffffe1f9e9abab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe1f9e9abab0
> --- trap 0, rip = 0, rsp = 0xfffffe1f9e9abb70, rbp = 0 ---
>
> (kgdb) up 3
> #3  0xffffffff808467d3 in tcp_setpersist (tp=<value optimized out>) at
> /usr/src/sys/netinet/tcp_output.c:1619
> 1619                    panic("tcp_setpersist: retransmit pending");
> (kgdb) list
> 1614            int t = ((tp->t_srtt >> 2) + tp->t_rttvar) >> 1;
> 1615            int tt;
> 1616
> 1617            tp->t_flags &= ~TF_PREVVALID;
> 1618            if (tcp_timer_active(tp, TT_REXMT))
> 1619                    panic("tcp_setpersist: retransmit pending");
> 1620            /*
> 1621             * Start/restart persistance timer.
> 1622             */
> 1623            TCPT_RANGESET(tt, t * tcp_backoff[tp->t_rxtshift],
>
> I have debug.ktr.entries set to 204800 on the machines compiled with KTR
> options, maybe a bit more history can provide some extra context.
>
> Jason
>

Picked up a KTR dump with 52k entries

spin lock 0xffffffff81262d00 (callout) held by 0xfffff81074dde000 (tid
100289) too long
(kgdb) tid 100289
[Switching to thread 241 (Thread 100289)]#0  0xffffffff80a71dc8 in
cpustop_handler () at /usr/src/sys/amd64/amd64/mp_machdep.c:1432
1432            savectx(&stoppcbs[cpu]);
(kgdb) bt
#0  0xffffffff80a71dc8 in cpustop_handler () at
/usr/src/sys/amd64/amd64/mp_machdep.c:1432
#1  0xffffffff80a71d8f in ipi_nmi_handler () at
/usr/src/sys/amd64/amd64/mp_machdep.c:1417
#2  0xffffffff80a8038a in trap (frame=0xffffffff811ef130) at
/usr/src/sys/amd64/amd64/trap.c:190
#3  0xffffffff80a67ae3 in nmi_calltrap () at
/usr/src/sys/amd64/amd64/exception.S:505
#4  0xffffffff8070d6fe in callout_process (now=1461472573876317) at
/usr/src/sys/kern/kern_timeout.c:490

http://nitrology.com/ktrdump-spinlock2.txt

Jason