Date:      Sun, 23 Feb 2020 12:48:26 -0700
From:      Ian Lepore <ian@freebsd.org>
To:        Arpan Palit <babupalit@gmail.com>
Cc:        freebsd-hackers@freebsd.org, freebsd-questions@freebsd.org
Subject:   Re: msleep_spin is failed to waken up even after wakeup routine is invoked for the same.
Message-ID:  <08fd4dcf7231237abf749834d6bce006892eac26.camel@freebsd.org>
In-Reply-To: <CAF3txfh0G%2BVkCHptXcDvu_KUCzP6yEWinSw8LXa4KLB1e-mT3A@mail.gmail.com>
References:  <CAF3txfi6Kz1y0a8jusGKrYFyMAER60xpkngQS0m-%2BY9G1RX5cw@mail.gmail.com> <890a299c074fc83a02911583531d686257924be8.camel@freebsd.org> <CAF3txfh0G%2BVkCHptXcDvu_KUCzP6yEWinSw8LXa4KLB1e-mT3A@mail.gmail.com>

On Sun, 2020-02-23 at 12:30 +0530, Arpan Palit wrote:
> Thanks for replying Ian.
> 
> The code is structured similarly to the fix you described above.
> 
> It's happening after running the test for 5 to 10 iterations; out of 2k
> commands posted to the hardware, only 1 or 2 fail to wake up properly.
> The mutex lock is held in both the wait_for_completion code and the
> interrupt handler, and the doneflag is being set correctly from
> interrupt context.
> 
> I am running 16 threads of 2k commands simultaneously.
> 
> I have verified that command completion is correct on the hardware side;
> the failure is in waking up the sleeping thread on the software side.
> 
> So I suspect an issue with the scheduler, which failed to wake up the
> sleeping thread in this case.
> 
> Below is the current code structure:
> 
> /* Wait for Completion */
> mtx_lock(cmd_comp_lock);
> completion = false;
> ret = execute_cmd(cmd);
> if (ret == 0) {
>         while (!completion && err == 0)
>                 err = msleep(cmd, cmd_comp_lock, PCATCH, "cmd_xfer", hz);
> }
> mtx_unlock(cmd_comp_lock);
> 
> /* Interrupt handler */
> mtx_lock_spin(cmd_lock);
> // Disable interrupt
> // error checks
> completion = true;
> wakeup_one(cmd);
> mtx_unlock_spin(cmd_lock);
> // Enable interrupt
> 

That shows two different mutexes being used, which violates the safe
pattern for doing conditional wakeups.  The same mutex that protects
the setting of the 'completion' variable in the interrupt handler must
be used to protect the clearing and testing of that variable in the
mainline code, to avoid missed wakeups.  That means using msleep_spin()
with cmd_lock in the mainline code.

-- Ian

> 
> On Wed, Feb 19, 2020 at 8:12 PM Ian Lepore <ian@freebsd.org> wrote:
> 
> > On Wed, 2020-02-19 at 14:13 +0530, Arpan Palit wrote:
> > > Hi,
> > > 
> > > I am facing an issue where a wakeup routine call is unable to wake
> > > up a msleep_spin routine call with a timeout value of 10 seconds.
> > > 
> > > The real scenario is as follows: post a hardware command and sleep
> > > using the msleep_spin routine until the interrupt comes; after
> > > getting the interrupt, wake up the sleeping process using the
> > > wakeup_one/wakeup routine call. As there are more than 2048
> > > commands and 16 parallel threads running, I observed that randomly
> > > *one or two of the posted commands* are *timing out*, even though
> > > the *interrupt has come and the wakeup routine is invoked* after
> > > getting the interrupt for the same command.
> > > 
> > > Note:
> > > * The issue is not seen when the number of commands is less than
> > >   2048 with a timeout of 10 seconds.
> > > * The issue can also be seen with fewer commands when the timeout
> > >   value is 1 second.
> > > 
> > > Can anyone please provide me an optimized way to schedule the
> > > process or a better way to do the scheduling?
> > > 
> > > Thanks,
> > > Arpan Palit
> > > 
> > 
> > Is there any chance this is the classic "missed wakeup" scenario,
> > where the wakeup happens before the thread enters msleep_spin()?
> > That can happen with code structured like
> > 
> >   enqueue_request(req);
> >   err = msleep_spin(req, etc);
> >   /* Handle done req or timeout */
> > 
> > and a fix is to structure the code using the same idiom required
> > for pthread_cond_wait() in userland, something like:
> > 
> >   req->doneflag = false;
> >   enqueue_request(req);
> >   while (!req->doneflag && err == 0)
> >       err = msleep_spin(req, etc);
> >   /* Handle done req or timeout */
> > 
> > and of course on the interrupt handler side you need something like
> > 
> >   /* lock mutex, do hardware stuff */
> >   req->doneflag = true;
> >   wakeup(req);
> >   /* unlock mutex */
> > 
> > -- Ian
> > 
> > 
> > 
> 



