Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 8 Oct 2014 10:56:56 -0700
From:      Jason Wolfe <nitroboost@gmail.com>
To:        John Baldwin <jhb@freebsd.org>, freebsd-net@freebsd.org
Cc:        Sean Bruno <sbruno@llnw.com>, davide@freebsd.org, Eric van Gyzen <eric@vangyzen.net>
Subject:   Re: ixgbe(4) spin lock held too long
Message-ID:  <CAAAm0r0=7G=MgKEpXGDLqYE5%2BK9gfV3PLWJ=hLfvf2KWvPtd7w@mail.gmail.com>
In-Reply-To: <201410071428.15753.jhb@freebsd.org>
References:  <1410203348.1343.1.camel@bruno> <201410071146.26274.jhb@freebsd.org> <CAAAm0r29r5dVxpUYUCVLY=LfP5NNpjwFc4RoJQoLuA4Pwnpq%2BA@mail.gmail.com> <201410071428.15753.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Oct 7, 2014 at 11:28 AM, John Baldwin <jhb@freebsd.org> wrote:

> On Tuesday, October 07, 2014 2:06:42 pm Jason Wolfe wrote:
> > Hey John,
> >
> > Happy to do this, but the pool of boxes is about 500 large, which is the
> > reason I'm able to see a crash every day or so.  I've pulled a portion of
> > them out of service to switch them over to a a kernel with WITNESS
> options,
> > might that assist in the same manner if we catch a crash there?  If
> you're
> > thinking the KTR traces are required to diagnose this I'll get started on
> > that though.  I've also cut 10.1-RC1 with some ixgbe fixes for future
> use,
> > but noticed kern_timeout.c has been static for the past 8 months.  Thanks
> > again for your help.
>
> I don't think WITNESS will really help in this case.  ixgbe just happens to
> be trying to schedule a timer and running into a problem that the timer is
> spinning on another CPU.  While the kern_timeout code hasn't changed in a
> while, it may be an edge case provoked by another bug elsewhere.  It's also
> possible I suppose that a lot of callouts are scheduled for some reason.
>
> Hmm, maybe we can still figure this out from your dumps.  We can still get
> to
> '*cc' via an alternate route.  First, IIRC, the thread is an idle thread,
> so its name should indicate which CPU it is for:
>
> % sudo procstat -at  | grep 100003
>    11 100003 idle             idle: cpu0         0  255 run     -
>
> (Hmm, I bet that is always CPU 0).  That means we can find 'cc' by doing
> 'p cc_cpu[0]' in kgdb.  Can you do that from the last core dump along with
> 'p last' in frame 4?
>
> Grrr.  Looks like cc->cc_lastscan is already overwritten by the time you've
> crashed, so we can't work back to the original value of 'firstb'.
>
> However, there is a sysctl you can invoke to get callout stats.  Maybe run
> 'sysctl kern.callout_stats=1' on one of your existing servers and then
> check
> dmesg.  On an idle machine sitting here running stable/10 I have 86 callout
> scheduled currently:
>
> Scheduled callouts statistic snapshot:
>   Callouts:     86  Buckets:  32768*8    Bucket size: 0.003906s
>   C/Bk: med     0         avg      0.000328  max     19
>   Time: med     2.000000s avg    719.566333s max  19807.162515s
>   Prec: med     0.500000s avg     60.250613s max   1799.999875s
>   Distribution:         buckets    time    tcum    prec    pcum
>            0.000000s     2**-12       0       0       1       1
>            0.007812s     2**1         1       1       1       2
>            0.015625s     2**2         0       1       2       4
>            0.031250s     2**3         5       6       4       8
>            0.062500s     2**4         0       6      20      28
>            0.125000s     2**5         1       7       1      29
>            0.250000s     2**6         4      11      11      40
>            0.500000s     2**7         7      18      14      54
>            1.000000s     2**8        23      41      11      65
>            2.000000s     2**9         2      43       3      68
>            4.000000s     2**10       13      56       4      72
>            8.000000s     2**11        2      58       0      72
>           16.000000s     2**12       11      69       2      74
>           32.000000s     2**13        1      70       1      75
>           64.000000s     2**14        3      73       0      75
>          128.000000s     2**15        1      74       3      78
>          256.000000s     2**16        4      78       1      79
>          512.000000s     2**17        0      78       6      85
>         2048.000000s     2**19        2      80       1      86
>         8192.000000s     2**21        5      85       0      86
>        16384.000000s     2**22        1      86       0      86
>
> It would be interesting to see if you have a much higher number.
>
> --
> John Baldwin
>

I was trying to keep the chatter off the list, but John has been insanely
helpful here.  Looping the list back in as his insight into debugging this
is likely relevant to others.

John,

Correct 100003 is always CPU0, on our end at least.  Jackpot on the
cc_cpu[0] also, here are cores across 3 different 'spin lock callout held
by tid 100003 too long' machines.  Callouts on an idle machine are in line
with yours at 90, and a somewhat idle box pushing ~500Mb/s has in the
3000-4000 range.


(kgdb) tid 100003
[Switching to thread 40 (Thread 100003)]#0  0xffffffff80ac39b8 in
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1432
1432            savectx(&stoppcbs[cpu]);
(kgdb) up 4
#4  0xffffffff80733f79 in callout_process (now=3549882673091690) at
/usr/src/sys/kern/kern_timeout.c:443
443                     tmp = LIST_FIRST(sc);
(kgdb) list
438             }
439
440             /* Iterate callwheel from firstb to nowb and then up to
lastb. */
441             do {
442                     sc = &cc->cc_callwheel[firstb & callwheelmask];
443                     tmp = LIST_FIRST(sc);
444                     while (tmp != NULL) {
445                             /* Run the callout if present time within
allowed. */
446                             if (tmp->c_time <= now) {
447                                     /*
(kgdb) p cc_cpu[0]
$1 = {cc_lock = {lock_object = {lo_name = 0xffffffff80ccbec7 "callout",
lo_flags = 720896, lo_data = 0, lo_witness = 0x0},
    mtx_lock = 4}, cc_exec_entity = {{cc_next = 0x0, cc_curr = 0x0,
ce_migration_func = 0, ce_migration_arg = 0x0,
      ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0,
cc_cancel = false, cc_waiting = false}, {
      cc_next = 0xffffffff812a7af0, cc_curr = 0x0, ce_migration_func = 0,
ce_migration_arg = 0x0, ce_migration_cpu = 64,
      ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false,
cc_waiting = false}},
  cc_callout = 0xfffffe000069f000, cc_callwheel = 0xfffffe00007c1000,
cc_expireq = {tqh_first = 0x0,
    tqh_last = 0xffffffff812a0510}, cc_callfree = {slh_first =
0xfffffe00007c02c0}, cc_firstevent = 3549882673088716,
  cc_lastscan = 3549882673091690, cc_cookie = 0xfffff800151f4b80, cc_bucket
= 6515}
(kgdb) p last
$2 = 3549882929381376



(kgdb) tid 100003
[Switching to thread 40 (Thread 100003)]#0  0xffffffff80ac39b8 in
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1432
1432            savectx(&stoppcbs[cpu]);
(kgdb) up 4
#4  0xffffffff80733fab in callout_process (now=6792755852650669) at
/usr/src/sys/kern/kern_timeout.c:467
467                                     LIST_REMOVE(tmp, c_links.le);
(kgdb) list
462     #endif
463                                         1);
464                                     tmp = cc->cc_exec_next_dir;
465                             } else {
466                                     tmpn = LIST_NEXT(tmp, c_links.le);
467                                     LIST_REMOVE(tmp, c_links.le);
468                                     TAILQ_INSERT_TAIL(&cc->cc_expireq,
469                                         tmp, c_links.tqe);
470                                     tmp->c_flags |= CALLOUT_PROCESSED;
471                                     tmp = tmpn;
(kgdb) p cc_cpu[0]
$1 = {cc_lock = {lock_object = {lo_name = 0xffffffff80ccbec7 "callout",
lo_flags = 720896, lo_data = 0, lo_witness = 0x0},
    mtx_lock = 4}, cc_exec_entity = {{cc_next = 0x0, cc_curr = 0x0,
ce_migration_func = 0, ce_migration_arg = 0x0,
      ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0,
cc_cancel = true, cc_waiting = false}, {
      cc_next = 0x0, cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg
= 0x0, ce_migration_cpu = 64,
      ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false,
cc_waiting = false}},
  cc_callout = 0xfffffe000069f000, cc_callwheel = 0xfffffe00007c1000,
cc_expireq = {tqh_first = 0xfffffe0003f461b0,
    tqh_last = 0xffffffff81257138}, cc_callfree = {slh_first =
0xfffffe00007c02c0}, cc_firstevent = 6792755922031366,
  cc_lastscan = 6792755852650669, cc_cookie = 0xfffff800151f4b80, cc_bucket
= 31145}
(kgdb) p last
$2 = 6792757989343232



(kgdb) tid 100003
[Switching to thread 40 (Thread 100003)]#0  0xffffffff80ac39b8 in
cpustop_handler ()
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1432
1432            savectx(&stoppcbs[cpu]);
(kgdb) up 4
#4  0xffffffff8073406e in callout_process (now=4354614125734311) at
/usr/src/sys/kern/kern_timeout.c:487
487                             if (tmp->c_time < first)
(kgdb) list
482                             if (tmp->c_time > last) {
483                                     lastb = nowb;
484                                     goto next;
485                             }
486                             /* Update first and last time, respecting
this event. */
487                             if (tmp->c_time < first)
488                                     first = tmp->c_time;
489                             tmp_max = tmp->c_time + tmp->c_precision;
490                             if (tmp_max < last)
491                                     last = tmp_max;
(kgdb) p cc_cpu[0]
$1 = {cc_lock = {lock_object = {lo_name = 0xffffffff80ccbec7 "callout",
lo_flags = 720896, lo_data = 0, lo_witness = 0x0},
    mtx_lock = 4}, cc_exec_entity = {{cc_next = 0x0, cc_curr = 0x0,
ce_migration_func = 0, ce_migration_arg = 0x0,
      ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0,
cc_cancel = false, cc_waiting = false}, {
      cc_next = 0x0, cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg
= 0x0, ce_migration_cpu = 64,
      ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false,
cc_waiting = false}},
  cc_callout = 0xfffffe000069f000, cc_callwheel = 0xfffffe00007c1000,
cc_expireq = {tqh_first = 0x0,
    tqh_last = 0xffffffff812a0510}, cc_callfree = {slh_first =
0xfffffe00007c02c0}, cc_firstevent = 4354614125730601,
  cc_lastscan = 4354614125734311, cc_cookie = 0xfffff800151f4b80, cc_bucket
= 32667}
(kgdb) p last
$2 = 4354614660956160


Scheduled callouts statistic snapshot:
  Callouts:     90  Buckets:  32768*6    Bucket size: 0.003906s
  C/Bk: med     0         avg      0.000457  max     13
  Time: med     4.000000s avg   1231.642745s max  56836.522122s
  Prec: med     1.000000s avg    194.507344s max   5399.999627s
  Distribution:         buckets    time    tcum    prec    pcum
           0.000000s     2**-12       0       0       1       1
           0.000030s     2**-7        1       1       0       1
           0.015625s     2**2         0       1       2       3
           0.031250s     2**3         1       2      17      20
           0.062500s     2**4         1       3      14      34
           0.125000s     2**5        15      18       1      35
           0.250000s     2**6         6      24       1      36
           0.500000s     2**7        12      36       5      41
           1.000000s     2**8         3      39      15      56
           4.000000s     2**10       17      56       9      65
           8.000000s     2**11        2      58       0      65
          16.000000s     2**12        0      58      12      77
          32.000000s     2**13        2      60       1      78
          64.000000s     2**14        6      66       4      82
         128.000000s     2**15       17      83       3      85
         256.000000s     2**16        0      83       2      87
        1024.000000s     2**18        4      87       0      87
        4096.000000s     2**20        0      87       3      90
       32768.000000s     2**23        2      89       0      90
       65536.000000s     2**24        1      90       0      90


Scheduled callouts statistic snapshot:
  Callouts:   3613  Buckets:  32768*6    Bucket size: 0.003906s
  C/Bk: med     0         avg      0.018376  max     22
  Time: med    64.000000s avg    167.175352s max  86397.094792s
  Prec: med     4.000000s avg     13.837570s max   5399.999627s
  Distribution:         buckets    time    tcum    prec    pcum
           0.000000s  2**-12          0       0       1       1
           0.001953s  2**-1           0       0       2       3
           0.003906s  2**0            2       2       0       3
           0.007812s  2**1            2       4       0       3
           0.015625s  2**2            3       7      48      51
           0.031250s  2**3            5      12      79     130
           0.062500s  2**4            9      21     139     269
           0.125000s  2**5           24      45      27     296
           0.250000s  2**6           74     119     120     416
           0.500000s  2**7          117     236      30     446
           1.000000s  2**8           89     325     202     648
           2.000000s  2**9           76     401      24     672
           4.000000s  2**10         156     557    2862    3534
           8.000000s  2**11         187     744       1    3535
          16.000000s  2**12         370    1114      12    3547
          32.000000s  2**13         589    1703       1    3548
          64.000000s  2**14        1876    3579      55    3603
         128.000000s  2**15          18    3597       2    3605
         256.000000s  2**16           1    3598       2    3607
         512.000000s  2**17           1    3599       0    3607
        1024.000000s  2**18           6    3605       0    3607
        2048.000000s  2**19           2    3607       0    3607
        4096.000000s  2**20           0    3607       6    3613
       32768.000000s  2**23           1    3608       0    3613
       65536.000000s  2**24           5    3613       0    3613

Jason



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAAAm0r0=7G=MgKEpXGDLqYE5%2BK9gfV3PLWJ=hLfvf2KWvPtd7w>