Date:      Fri, 13 Nov 2015 16:34:13 -0500
From:      Randall Stewart <rrs@netflix.com>
To:        "Alexander V. Chernikov" <melifaro@freebsd.org>
Cc:        "src-committers@freebsd.org" <src-committers@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>, Adrian Chadd <adrian@freebsd.org>, "imp@freebsd.org" <imp@freebsd.org>
Subject:   Re: svn commit: r290664 - in head: share/man/man9 sys/kern sys/sys
Message-ID:  <343356B9-A02C-4DC6-A890-A2727436041C@netflix.com>
In-Reply-To: <278491447449232@web25j.yandex.ru>
References:  null <201511101449.tAAEnXIi065747@repo.freebsd.org> <1660421447413365@web19h.yandex.ru> <E71F4241-0CEB-42E1-BA52-4EF7405647E5@netflix.com> <278491447449232@web25j.yandex.ru>

My patch addresses the following:


On Nov 13, 2015, at 4:13 PM, Alexander V. Chernikov <melifaro@freebsd.org> wrote:

>
>
> 13.11.2015, 23:59, "Randall Stewart" <rrs@netflix.com>:
>> Strange
>>
>> I went looking through all calls to callout stop with cscope and saw
>> no one paying attention to the return value... (which I thought was not good).
>
> 23:49 [0] m@fhead5 grep -R callout_stop sys | egrep '(=|\<if\>)'
> sys/netgraph/ng_base.c:    rval = callout_stop(c);

This one does not need changing.


> sys/netpfil/pf/if_pfsync.c:        if (callout_stop(&pd->pd_tmo)) {
> sys/netpfil/pf/if_pfsync.c:            if (callout_stop(&pd->pd_tmo))

I changed the above two to > 0.
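
To make the reason for the > 0 conversion concrete, here is a minimal user-space sketch, not kernel code. It assumes the three return values described in the timeout.9 diff quoted further down (1, 0, -1); the enum and both helper names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the callout_stop() return values after r290664. */
enum stop_result {
	STOP_NOT_PENDING = -1,	/* not set, or already serviced */
	STOP_RUNNING = 0,	/* being serviced, could not be stopped */
	STOP_CANCELLED = 1	/* was pending and has been stopped */
};

/* Old-style caller: "if (callout_stop(c))" treats any non-zero as cancelled. */
static bool
cancelled_truth_test(int rv)
{
	return (rv != 0);
}

/* Converted caller: "if (callout_stop(c) > 0)" accepts only a real cancel. */
static bool
cancelled_positive_test(int rv)
{
	/* The model only defines -1, 0 and 1. */
	assert(rv >= STOP_NOT_PENDING && rv <= STOP_CANCELLED);
	return (rv > 0);
}
```

With a -1 return the plain truth test reports a cancellation that never happened, which is the kind of mistake that lets a caller drop a reference it does not own; the > 0 form does not.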

> sys/dev/isci/isci_timer.c:    /* callout_stop() will *not* keep the time

None of the ones in isci_timer.c check the return code or do anything different.

>
> sys/netinet6/nd6.c:        canceled = callout_stop(&ln->lle_timer);
> sys/netinet6/in6.c:    if (callout_stop(&lle->lle_timer))
> sys/net/if_llatbl.c:        if (callout_stop(&lle->lle_timer))

The above needed the same change.

> sys/kern/subr_taskqueue.c:    pending = !!callout_stop(&timeout_task->c);

Same as above, only I think the !! is strange :-)

> sys/kern/kern_exit.c:        callout_stop(&p->p_itcallout) == 0) {
Hmm, I may have missed that one; let me check.

OK, looking at that one, it does not need to be changed; in fact it is now more
correct. An already expired callout now returns -1 instead of 0, so == 0 only
matches a callout that could not be stopped, which is what this if is looking for.


> sys/kern/subr_sleepqueue.c:    else if (callout_stop(&td->td_slpcallout) == 0) {

This one, again, was causing extra work when the callout was already stopped and
it returned 0: it would do a synchronize on the other CPU. But if -1 comes back, it
says the callout is already stopped, so no synchronization is needed.
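
A small user-space sketch of that == 0 case, under stated assumptions: the three-state enum and all function names are hypothetical, the old and new mappings follow the before and after text of the timeout.9 diff quoted further down, and the lock-held cancel path of the real code is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified callout states for illustration only. */
enum callout_state { CO_IDLE, CO_PENDING, CO_RUNNING };

/* Before r290664: non-zero only when a pending callout was stopped. */
static int
old_stop_rv(enum callout_state s)
{
	return (s == CO_PENDING ? 1 : 0);
}

/* After r290664: 1 stopped, 0 currently running, -1 not set or already done. */
static int
new_stop_rv(enum callout_state s)
{
	switch (s) {
	case CO_PENDING:
		return (1);
	case CO_RUNNING:
		return (0);
	default:
		return (-1);
	}
}

/* A caller written as "callout_stop(c) == 0" synchronizes only on zero. */
static bool
wants_sync(int rv)
{
	assert(rv >= -1 && rv <= 1);
	return (rv == 0);
}
```

In this model an idle callout triggers the synchronization under the old mapping but not under the new one, while a running callout triggers it under both.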


> sys/netinet/in.c:    if (callout_stop(&lle->lle_timer))
> sys/netinet/tcp_timer.c:        if (callout_stop(t_callout) &&

I made these two > 0, though the TCP one needs to change to use the new async_drain.
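
For readers who have not looked at the async_drain part of the diff quoted below, here is a rough single-CPU user-space model of what it appears to do: when the stop returns 0, the drain function is remembered and called with the callout's argument once the running callout completes. Every model_* name and count_drain are hypothetical; the real code is in kern_timeout.c and deals with locking and migration that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; field names loosely mirror cc_curr and cc_drain. */
struct model_callout {
	void *arg;	/* argument the callout was armed with */
	int pending;	/* queued but not yet running */
};

struct model_cpu {
	struct model_callout *curr;	/* callout currently being serviced */
	void (*drain)(void *);		/* drain function to run on completion */
};

/* Returns 1 if stopped, -1 if not pending and not running, 0 if running. */
static int
model_async_drain(struct model_cpu *cc, struct model_callout *c,
    void (*drain)(void *))
{
	assert(cc != NULL && c != NULL);
	if (c->pending) {
		c->pending = 0;
		return (1);
	}
	if (cc->curr != c)
		return (-1);
	cc->drain = drain;	/* defer: the callout is running right now */
	return (0);
}

/* Called when the running callout finishes; runs any deferred drain once. */
static void
model_callout_done(struct model_cpu *cc)
{
	struct model_callout *c = cc->curr;
	void (*drain)(void *) = cc->drain;

	cc->curr = NULL;
	cc->drain = NULL;
	if (c != NULL && drain != NULL)
		drain(c->arg);
}

/* Example drain: counts invocations through its argument. */
static void
count_drain(void *arg)
{
	(*(int *)arg)++;
}
```

The point of the model is only the ordering: the caller sees 0 immediately, and the drain runs later from the completion path rather than from the caller.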

>
> (not counting callout_drain() here)

drain is different: since it is done with safe set, it should wait for
the completion of the timeout. I don't know if you could
ever get a 0 return from it.

R
>
>>
>> And yes I am running this in a lot of systems.
> Try this:
> 0:11 [0] fhead0# ifconfig vtnet0 alias 10.10.10.10/32
> 0:11 [0] fhead0# ifconfig vtnet0 -alias 10.10.10.10
> callout_stop() for lle 10.10.10.10 on vtnet0, lle_refcnt=1
> panic: bogus refcnt 0 on lle 0xfffff8001996c400
>>
>> R
>>
>>
>> On Nov 13, 2015, at 6:16 AM, Alexander V. Chernikov <melifaro@freebsd.org> wrote:
>>
>>>
>>> 10.11.2015, 17:49, "Randall Stewart" <rrs@FreeBSD.org>:
>>>>
>>>> Author: rrs
>>>> Date: Tue Nov 10 14:49:32 2015
>>>> New Revision: 290664
>>>> URL: https://svnweb.freebsd.org/changeset/base/290664
>>>>
>>>> Log:
>>>>   Add new async_drain to the callout system. This is so-far not used but
>>>>   should be used by TCP for sure in its cleanup of the IN-PCB (will be coming shortly).
>>>
>>> Randall, this commit introduced a change in callout_stop() which was not mentioned in the commit message.
>>> This change has broken lltable arp/nd handling: deleting an interface address causes an immediate panic.
>>> I also see other code/subsystems relying on the callout_stop() return value (netgraph, pfsync, isci).
>>> I was not able to find any discussion/analysis/testing for these in D4076, so this change does not look like it was properly tested prior to committing.
>>>
>>>
>>>
>>>>
>>>>   Sponsored by: Netflix Inc.
>>>>   Differential Revision: https://reviews.freebsd.org/D4076
>>>>
>>>> Modified:
>>>>   head/share/man/man9/timeout.9
>>>>   head/sys/kern/kern_timeout.c
>>>>   head/sys/sys/callout.h
>>>>
>>>> Modified: head/share/man/man9/timeout.9
>>>> ==============================================================================
>>>> --- head/share/man/man9/timeout.9 Tue Nov 10 14:14:41 2015 (r290663)
>>>> +++ head/share/man/man9/timeout.9 Tue Nov 10 14:49:32 2015 (r290664)
>>>> @@ -35,6 +35,7 @@
>>>>  .Sh NAME
>>>>  .Nm callout_active ,
>>>>  .Nm callout_deactivate ,
>>>> +.Nm callout_async_drain ,
>>>>  .Nm callout_drain ,
>>>>  .Nm callout_handle_init ,
>>>>  .Nm callout_init ,
>>>> @@ -69,6 +70,8 @@ typedef void timeout_t (void *);
>>>>  .Ft void
>>>>  .Fn callout_deactivate "struct callout *c"
>>>>  .Ft int
>>>> +.Fn callout_async_drain "struct callout *c" "timeout_t *drain"
>>>> +.Ft int
>>>>  .Fn callout_drain "struct callout *c"
>>>>  .Ft void
>>>>  .Fn callout_handle_init "struct callout_handle *handle"
>>>> @@ -236,17 +239,42 @@ The function
>>>>  cancels a callout
>>>>  .Fa c
>>>>  if it is currently pending.
>>>> -If the callout is pending, then
>>>> +If the callout is pending and successfuly stopped, then
>>>>  .Fn callout_stop
>>>> -returns a non-zero value.
>>>> -If the callout is not set,
>>>> -has already been serviced,
>>>> -or is currently being serviced,
>>>> +returns a value of one.
>>>> +If the callout is not set, or
>>>> +has already been serviced, then
>>>> +negative one is returned.
>>>> +If the callout is currently being serviced and cannot be stopped,
>>>>  then zero will be returned.
>>>>  If the callout has an associated lock,
>>>>  then that lock must be held when this function is called.
>>>>  .Pp
>>>>  The function
>>>> +.Fn callout_async_drain
>>>> +is identical to
>>>> +.Fn callout_stop
>>>> +with one difference.
>>>> +When
>>>> +.Fn callout_async_drain
>>>> +returns zero it will arrange for the function
>>>> +.Fa drain
>>>> +to be called using the same argument given to the
>>>> +.Fn callout_reset
>>>> +function.
>>>> +.Fn callout_async_drain
>>>> +If the callout has an associated lock,
>>>> +then that lock must be held when this function is called.
>>>> +Note that when stopping multiple callouts that use the same lock it is possible
>>>> +to get multiple return's of zero and multiple calls to the
>>>> +.Fa drain
>>>> +function, depending upon which CPU's the callouts are running. The
>>>> +.Fa drain
>>>> +function itself is called from the context of the completing callout
>>>> +i.e. softclock or hardclock, just like a callout itself.
>>>> +p
>>>> +.Pp
>>>> +The function
>>>>  .Fn callout_drain
>>>>  is identical to
>>>>  .Fn callout_stop
>>>>
>>>> Modified: head/sys/kern/kern_timeout.c
>>>> ==============================================================================
>>>> --- head/sys/kern/kern_timeout.c Tue Nov 10 14:14:41 2015 (r290663)
>>>> +++ head/sys/kern/kern_timeout.c Tue Nov 10 14:49:32 2015 (r290664)
>>>> @@ -136,6 +136,7 @@ u_int callwheelsize, callwheelmask;
>>>>   */
>>>>  struct cc_exec {
>>>>          struct callout *cc_curr;
>>>> +        void (*cc_drain)(void *);
>>>>  #ifdef SMP
>>>>          void (*ce_migration_func)(void *);
>>>>          void *ce_migration_arg;
>>>> @@ -170,6 +171,7 @@ struct callout_cpu {
>>>>  #define callout_migrating(c) ((c)->c_iflags & CALLOUT_DFRMIGRATION)
>>>>
>>>>  #define cc_exec_curr(cc, dir) cc->cc_exec_entity[dir].cc_curr
>>>> +#define cc_exec_drain(cc, dir) cc->cc_exec_entity[dir].cc_drain
>>>>  #define cc_exec_next(cc) cc->cc_next
>>>>  #define cc_exec_cancel(cc, dir) cc->cc_exec_entity[dir].cc_cancel
>>>>  #define cc_exec_waiting(cc, dir) cc->cc_exec_entity[dir].cc_waiting
>>>> @@ -679,6 +681,7 @@ softclock_call_cc(struct callout *c, str
>>>>
>>>>          cc_exec_curr(cc, direct) = c;
>>>>          cc_exec_cancel(cc, direct) = false;
>>>> +        cc_exec_drain(cc, direct) = NULL;
>>>>          CC_UNLOCK(cc);
>>>>          if (c_lock != NULL) {
>>>>                  class->lc_lock(c_lock, lock_status);
>>>> @@ -744,6 +747,15 @@ skip:
>>>>          CC_LOCK(cc);
>>>>          KASSERT(cc_exec_curr(cc, direct) == c, ("mishandled cc_curr"));
>>>>          cc_exec_curr(cc, direct) = NULL;
>>>> +        if (cc_exec_drain(cc, direct)) {
>>>> +                void (*drain)(void *);
>>>> +
>>>> +                drain = cc_exec_drain(cc, direct);
>>>> +                cc_exec_drain(cc, direct) = NULL;
>>>> +                CC_UNLOCK(cc);
>>>> +                drain(c_arg);
>>>> +                CC_LOCK(cc);
>>>> +        }
>>>>          if (cc_exec_waiting(cc, direct)) {
>>>>                  /*
>>>>                   * There is someone waiting for the
>>>> @@ -1145,7 +1157,7 @@ callout_schedule(struct callout *c, int
>>>>  }
>>>>
>>>>  int
>>>> -_callout_stop_safe(struct callout *c, int safe)
>>>> +_callout_stop_safe(struct callout *c, int safe, void (*drain)(void *))
>>>>  {
>>>>          struct callout_cpu *cc, *old_cc;
>>>>          struct lock_class *class;
>>>> @@ -1225,19 +1237,22 @@ again:
>>>>           * stop it by other means however.
>>>>           */
>>>>          if (!(c->c_iflags & CALLOUT_PENDING)) {
>>>> -                c->c_flags &= ~CALLOUT_ACTIVE;
>>>> -
>>>>                  /*
>>>>                   * If it wasn't on the queue and it isn't the current
>>>>                   * callout, then we can't stop it, so just bail.
>>>> +                 * It probably has already been run (if locking
>>>> +                 * is properly done). You could get here if the caller
>>>> +                 * calls stop twice in a row for example. The second
>>>> +                 * call would fall here without CALLOUT_ACTIVE set.
>>>>                   */
>>>> +                c->c_flags &= ~CALLOUT_ACTIVE;
>>>>                  if (cc_exec_curr(cc, direct) != c) {
>>>>                          CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
>>>>                              c, c->c_func, c->c_arg);
>>>>                          CC_UNLOCK(cc);
>>>>                          if (sq_locked)
>>>>                                  sleepq_release(&cc_exec_waiting(cc, direct));
>>>> -                        return (0);
>>>> +                        return (-1);
>>>>                  }
>>>>
>>>>                  if (safe) {
>>>> @@ -1298,14 +1313,16 @@ again:
>>>>                                  CC_LOCK(cc);
>>>>                          }
>>>>                  } else if (use_lock &&
>>>> -                    !cc_exec_cancel(cc, direct)) {
>>>> +                    !cc_exec_cancel(cc, direct) && (drain == NULL)) {
>>>>
>>>>                          /*
>>>>                           * The current callout is waiting for its
>>>>                           * lock which we hold. Cancel the callout
>>>>                           * and return. After our caller drops the
>>>>                           * lock, the callout will be skipped in
>>>> -                         * softclock().
>>>> +                         * softclock(). This *only* works with a
>>>> +                         * callout_stop() *not* callout_drain() or
>>>> +                         * callout_async_drain().
>>>>                           */
>>>>                          cc_exec_cancel(cc, direct) = true;
>>>>                          CTR3(KTR_CALLOUT, "cancelled %p func %p arg %p",
>>>> @@ -1351,11 +1368,17 @@ again:
>>>>  #endif
>>>>                          CTR3(KTR_CALLOUT, "postponing stop %p func %p arg %p",
>>>>                              c, c->c_func, c->c_arg);
>>>> +                        if (drain) {
>>>> +                                cc_exec_drain(cc, direct) = drain;
>>>> +                        }
>>>>                          CC_UNLOCK(cc);
>>>>                          return (0);
>>>>                  }
>>>>                  CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
>>>>                      c, c->c_func, c->c_arg);
>>>> +                if (drain) {
>>>> +                        cc_exec_drain(cc, direct) = drain;
>>>> +                }
>>>>                  CC_UNLOCK(cc);
>>>>                  KASSERT(!sq_locked, ("sleepqueue chain still locked"));
>>>>                  return (0);
>>>>
>>>> Modified: head/sys/sys/callout.h
>>>> ==============================================================================
>>>> --- head/sys/sys/callout.h Tue Nov 10 14:14:41 2015 (r290663)
>>>> +++ head/sys/sys/callout.h Tue Nov 10 14:49:32 2015 (r290664)
>>>> @@ -81,7 +81,7 @@ struct callout_handle {
>>>>   */
>>>>  #define callout_active(c) ((c)->c_flags & CALLOUT_ACTIVE)
>>>>  #define callout_deactivate(c) ((c)->c_flags &= ~CALLOUT_ACTIVE)
>>>> -#define callout_drain(c) _callout_stop_safe(c, 1)
>>>> +#define callout_drain(c) _callout_stop_safe(c, 1, NULL)
>>>>  void callout_init(struct callout *, int);
>>>>  void _callout_init_lock(struct callout *, struct lock_object *, int);
>>>>  #define callout_init_mtx(c, mtx, flags) \
>>>> @@ -119,10 +119,11 @@ int callout_schedule(struct callout *, i
>>>>  int callout_schedule_on(struct callout *, int, int);
>>>>  #define callout_schedule_curcpu(c, on_tick) \
>>>>      callout_schedule_on((c), (on_tick), PCPU_GET(cpuid))
>>>> -#define callout_stop(c) _callout_stop_safe(c, 0)
>>>> -int _callout_stop_safe(struct callout *, int);
>>>> +#define callout_stop(c) _callout_stop_safe(c, 0, NULL)
>>>> +int _callout_stop_safe(struct callout *, int, void (*)(void *));
>>>>  void callout_process(sbintime_t now);
>>>> -
>>>> +#define callout_async_drain(c, d) \
>>>> +    _callout_stop_safe(c, 0, d)
>>>>  #endif
>>>>
>>>>  #endif /* _SYS_CALLOUT_H_ */
>>
>> --------
>> Randall Stewart
>> rrs@netflix.com
>> 803-317-4952
>>
>>
>>

--------
Randall Stewart
rrs@netflix.com
803-317-4952







