From owner-freebsd-arch@FreeBSD.ORG  Thu Jul 10 06:19:57 2014
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id B4CDE119;
 Thu, 10 Jul 2014 06:19:57 +0000 (UTC)
Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18])
 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
 (Client CN "funkthat.com", Issuer "funkthat.com" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 6E99425F9;
 Thu, 10 Jul 2014 06:19:57 +0000 (UTC)
Received: from h2.funkthat.com (localhost [127.0.0.1])
 by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id s6A6Jt5o022804
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Wed, 9 Jul 2014 23:19:56 -0700 (PDT)
 (envelope-from jmg@h2.funkthat.com)
Received: (from jmg@localhost)
 by h2.funkthat.com (8.14.3/8.14.3/Submit) id s6A6JtXR022803;
 Wed, 9 Jul 2014 23:19:55 -0700 (PDT) (envelope-from jmg)
Date: Wed, 9 Jul 2014 23:19:55 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: John Baldwin <jhb@freebsd.org>
Subject: Re: callout(9) really this complicated?
Message-ID: <20140710061955.GT45513@funkthat.com>
Mail-Followup-To: John Baldwin <jhb@freebsd.org>, freebsd-arch@freebsd.org,
 Hans Petter Selasky <hps@selasky.org>, arch@freebsd.org
References: <20140704041521.GW45513@funkthat.com>
 <53B64694.9030100@selasky.org> <20140704155752.GB45513@funkthat.com>
 <201407091211.47642.jhb@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201407091211.47642.jhb@freebsd.org>
User-Agent: Mutt/1.4.2.3i
X-Operating-System: FreeBSD 7.2-RELEASE i386
X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88  9322 9CB1 8F74 6D3F A396
X-Files: The truth is out there
X-URL: http://resnet.uoregon.edu/~gurney_j/
X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html
X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE
X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger?
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2
 (h2.funkthat.com [127.0.0.1]); Wed, 09 Jul 2014 23:19:56 -0700 (PDT)
Cc: Hans Petter Selasky <hps@selasky.org>, arch@freebsd.org,
 freebsd-arch@freebsd.org
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch/>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
 <mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 10 Jul 2014 06:19:57 -0000

John Baldwin wrote this message on Wed, Jul 09, 2014 at 12:11 -0400:
> On Friday, July 04, 2014 11:57:52 am John-Mark Gurney wrote:
> > Hans Petter Selasky wrote this message on Fri, Jul 04, 2014 at 08:15 +0200:
> > > On 07/04/14 06:15, John-Mark Gurney wrote:
> > > >So, I was going to look at using callout(9) for some time delayed
> > > >actions...  But upon reading the docs, a) the docs are inconsistent,
> > > >and b) the docs only talk about requirements in other section...
> > > >
> > > >Is there a better interface?  If so, can we mark callout(9) deprecated?
> > > >If not, I have some questions...
> > > >
> > > >If you want callout_drain to work properly, you have to add extra code
> > > >to both your callout, and around the usage of it...
> > > >
> > > >callout_drain does not drain the callout:
> > > >      However, the callout subsystem does guarantee that the callout will 
> > > >      be
> > > >      fully stopped before callout_drain() returns.
> > > >
> > > >Yet other parse of the docs say that you can depend upon the callout
> > > >being fully stopped..  I've sent email to Ian (iedowse) about why he
> > > >added this statement...
> > > >
> > > >Second, the amount of work you have to do to make sure you drain
> > > >seems pretty crazy as documented in Avoiding Race Conditions...
> > > >
> > > >It seems like if I have created a callout w/ callout_init_mtx,
> > > >that I shouldn't have to do extra work to make it work correctly...
> > > >
> > > >When calling _callout_stop_safe as callout_drain, there is no assert
> > > >that the lock is held, though it is documented as requiring it by:
> > > >      The function callout_drain() is identical to callout_stop() except 
> > > >      that
> > > >      it will wait for the callout to be completed if it is already in
> > > >      progress.
> > > >
> > > >Maybe we need to add an additional statement here?  and assert that it
> > > >isn't locked?
> > > >
> > > >Also, I have tried to follow the code, but it's complicated, so I'm
> > > >hoping that I can get some help here.
> > 
> > 
> > > Probably the documentation needs an update. The implementation is fine.
> > 
> > Probably...  But w/ bad docs, it makes you wonder about the
> > implementation...
> 
> If you use callout_init_mtx(), then you can use callout_reset() and 
> callout_stop() while holding the mutex and everything will work as expected.  
> You typically only need to use callout_drain() during a device detach
> routine to "destroy" the callout().

So, you're say that we should just remove this text:
     However, the callout subsystem does guarantee that the callout will be
     fully stopped before callout_drain() returns.

> (A typical "detach" routine looks something like this:
> 
>  - detach external connections (ifnet, cdev, etc.)
>  - stop the hardware itself (foo_stop in various NIC drivers)
>    - this step typically does a callout_stop() with the relevant lock held
>  - drain any pending async events
>    - bus_teardown_intr() will block until the interrupt handler is
>      idle
>    - callout_drain() will block if the callout is pending because softclock
>      had already dequeued it and it was waiting for the driver lock when
>      you called callout_stop() earlier
>    - taskqueue_drain() (note that tasks do not have implicit locking ala
>      callout_init_mtx(), so you need to set some sort of flag that your
>      task function checks and breaks out of any loops for in the
>      "stop the hardware" step)
>   - free resources and destroy locks, etc.
> 
> > > drain is always called unlocked.
> > 
> > Then why the whole long section about avoiding races?   Point #3 is
> > the main one that I'm talking about...  It seems like it'd be easier
> > for callout to properly maintain the active flag (maybe set a flag to
> > tell callout to do so), or not even maintain it since it really isn't
> > used for anything important if you can munge it all you want...  There
> > aren't any statements about bad things happening if you call _deactivate
> > before the callout is called...
> 
> The language is unclear, but you only need to use one of the 3 options to
> work around the races.  Note that if you use callout_init_mtx() you fall
> into case #1 and don't need to worry about the others.  If you want to

It isn't clear that point #1 only applies to callout_init_mtx, and that
the other points don't apply...

And I would say that the text:
     The callout subsystem provides a number of mechanisms to address these
     synchronization concerns:

Is wrong, it only provides a mechanism for ONE case, the _init_mtx case,
in all other cases you have to provide the mechanism to avoid the race..

> use callout_init(.., CALLOUT_MPSAFE) and manage all the locking in your
> callout handler directly, then you need to handle the assorted races.
> However, you can generally get by with something far simpler than it suggests.
> You can just do what I described above for taskqueue_drain(), i.e. have
> your timer function do:
> 
> foo(void *arg)
> {
>    struct foo_softc *sc;
> 
>    sc = arg;
>    FOO_LOCK(sc);
>    if (sc->is_dead) {
>        FOO_UNLOCK(sc);
>        return;
>    }
>    ....
>    callout_reset(...);
>    FOO_UNLOCK(sc);
> }
> 
> In this case, simply setting 'is_dead' in the "stop the hardware" step and
> using callout_drain() will work fine.

Ok, unless someone commits a patch soon, I'll create one..

Thanks for the clarification...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."