Date:      Thu, 23 Oct 2014 14:12:44 -0700
From:      Jason Wolfe <nitroboost@gmail.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Sean Bruno <sbruno@llnw.com>, freebsd-net@freebsd.org
Subject:   Re: ixgbe(4) spin lock held too long
Message-ID:  <CAAAm0r0ZFnvhN8tapRgiu6=cb2PKxuatiRFsf8=apdLz1zGVzQ@mail.gmail.com>
In-Reply-To: <1569387.ZCJSvuukWl@ralph.baldwin.cx>
References:  <1410203348.1343.1.camel@bruno> <201410161523.32415.jhb@freebsd.org> <CAAAm0r2Y359AtNyHrZ6J0TVLiws3ZTcfeYdfCimUZ8e1yHf5oA@mail.gmail.com> <1569387.ZCJSvuukWl@ralph.baldwin.cx>

On Sat, Oct 18, 2014 at 4:42 AM, John Baldwin <jhb@freebsd.org> wrote:
> On Friday, October 17, 2014 11:32:13 PM Jason Wolfe wrote:
>> Producing 10G of random traffic against a server with this assertion
>> added took about 2 hours to panic, so if it turns out we need anything
>> further it should be pretty quick.
>>
>> #4 list
>> 2816                     * timer and remember to restart (more output or persist).
>> 2817                     * If there is more data to be acked, restart retransmit
>> 2818                     * timer, using current (possibly backed-off) value.
>> 2819                     */
>> 2820                    if (th->th_ack == tp->snd_max) {
>> 2821                            tcp_timer_activate(tp, TT_REXMT, 0);
>> 2822                            needoutput = 1;
>> 2823                    } else if (!tcp_timer_active(tp, TT_PERSIST))
>> 2824                            tcp_timer_activate(tp, TT_REXMT, tp->t_rxtcur);
>
> Bah, this is just a bug in my assertion.  Rather than having a separate
> tcp_timer_deactivate() routine, a delta of 0 passed to tcp_timer_activate()
> means "stop the timer".  My assertions were incorrect and need to exclude the
> stop case.  Here is an updated patch (or you can just fix yours locally):
>
> Index: tcp_timer.c
> ===================================================================
> --- tcp_timer.c (revision 273219)
> +++ tcp_timer.c (working copy)
> @@ -869,10 +869,16 @@ tcp_timer_activate(struct tcpcb *tp, int timer_typ
>                 case TT_REXMT:
>                         t_callout = &tp->t_timers->tt_rexmt;
>                         f_callout = tcp_timer_rexmt;
> +                       if (callout_active(&tp->t_timers->tt_persist) &&
> +                           delta != 0)
> +                               panic("scheduling retransmit with persist active");
>                         break;
>                 case TT_PERSIST:
>                         t_callout = &tp->t_timers->tt_persist;
>                         f_callout = tcp_timer_persist;
> +                       if (callout_active(&tp->t_timers->tt_rexmt) &&
> +                           delta != 0)
> +                               panic("scheduling persist with retransmit active");
>                         break;
>                 case TT_KEEP:
>                         t_callout = &tp->t_timers->tt_keep;
>
>
> --
> John Baldwin

John,

panic: tcp_setpersist: retransmit pending

(kgdb) bt
#0  doadump (textdump=1) at pcpu.h:219
#1  0xffffffff806facb1 in kern_reboot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:452
#2  0xffffffff806fb014 in panic (fmt=<value optimized out>) at
/usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff808467d3 in tcp_setpersist (tp=<value optimized out>) at
/usr/src/sys/netinet/tcp_output.c:1619
#4  0xffffffff8084e7b6 in tcp_timer_persist (xtp=0xfffff804ec124c00)
at /usr/src/sys/netinet/tcp_timer.c:467
#5  0xffffffff8070d95e in softclock_call_cc (c=0xfffff804ec124ec0,
cc=0xffffffff81263380, direct=0)
    at /usr/src/sys/kern/kern_timeout.c:687
#6  0xffffffff8070dce4 in softclock (arg=<value optimized out>) at
/usr/src/sys/kern/kern_timeout.c:816
#7  0xffffffff806d16f3 in intr_event_execute_handlers (p=<value
optimized out>, ie=0xfffff80015214400)
    at /usr/src/sys/kern/kern_intr.c:1263
#8  0xffffffff806d2056 in ithread_loop (arg=0xfffff800151f7ee0) at
/usr/src/sys/kern/kern_intr.c:1276
#9  0xffffffff806cf481 in fork_exit (callout=0xffffffff806d1fc0
<ithread_loop>, arg=0xfffff800151f7ee0,
    frame=0xfffffe1f9e9b0ac0) at /usr/src/sys/kern/kern_fork.c:996
#10 0xffffffff80a67c0e in fork_trampoline () at
/usr/src/sys/amd64/amd64/exception.S:606

(kgdb) frame 3
#3  0xffffffff808467d3 in tcp_setpersist (tp=<value optimized out>) at
/usr/src/sys/netinet/tcp_output.c:1619
1619                 panic("tcp_setpersist: retransmit pending");
(kgdb) list
1614            int t = ((tp->t_srtt >> 2) + tp->t_rttvar) >> 1;
1615            int tt;
1616
1617            tp->t_flags &= ~TF_PREVVALID;
1618            if (tcp_timer_active(tp, TT_REXMT))
1619                 panic("tcp_setpersist: retransmit pending");
1620            /*
1621             * Start/restart persistance timer.
1622             */
1623            TCPT_RANGESET(tt, t * tcp_backoff[tp->t_rxtshift],

(kgdb) up
#4  0xffffffff8084e7b6 in tcp_timer_persist (xtp=0xfffff804ec124c00)
at /usr/src/sys/netinet/tcp_timer.c:467
467             tcp_setpersist(tp);
(kgdb) list
462                 (ticks - tp->t_rcvtime) >= TCPTV_PERSMAX) {
463                  TCPSTAT_INC(tcps_persistdrop);
464                  tp = tcp_drop(tp, ETIMEDOUT);
465                  goto out;
466             }
467             tcp_setpersist(tp);
468             tp->t_flags |= TF_FORCEDATA;
469             (void) tcp_output(tp);
470             tp->t_flags &= ~TF_FORCEDATA;

Jason
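
Note: the rule John describes in the quoted message (a delta of 0 passed to
tcp_timer_activate() means "stop the timer", so the mutual-exclusion check
between the retransmit and persist timers must skip the stop case) can be
modelled in userland roughly as below. This is only a sketch under stated
assumptions: struct timers, timer_activate() and the -1 return value are
hypothetical stand-ins invented for illustration, not the kernel's tcp_timer.c
API, and the real patch calls panic() instead of returning an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum timer_type { TT_REXMT, TT_PERSIST, TT_COUNT };

struct timers {
	bool active[TT_COUNT];	/* true while the corresponding timer is armed */
};

/*
 * Model of the corrected check: delta == 0 stops the timer and is always
 * allowed; delta != 0 arms it and is refused (-1) while the other timer is
 * still armed. Returns 0 on success.
 */
static int
timer_activate(struct timers *t, enum timer_type which, int delta)
{
	enum timer_type other;

	assert(t != NULL);
	other = (which == TT_REXMT) ? TT_PERSIST : TT_REXMT;

	if (delta == 0) {
		/* Stop case: must not trip the check. */
		t->active[which] = false;
		return (0);
	}
	if (t->active[other]) {
		/* The kernel patch would panic() here. */
		return (-1);
	}
	t->active[which] = true;
	return (0);
}
```

In this model, stopping the retransmit timer while the persist timer is armed
succeeds, which is the case the first version of the assertion wrongly
rejected; arming one timer while the other is armed still fails.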


