Date:      Sun, 23 Feb 2020 12:30:15 +0530
From:      Arpan Palit <babupalit@gmail.com>
To:        Ian Lepore <ian@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, freebsd-questions@freebsd.org
Subject:   Re: msleep_spin is failed to waken up even after wakeup routine is invoked for the same.
Message-ID:  <CAF3txfh0G%2BVkCHptXcDvu_KUCzP6yEWinSw8LXa4KLB1e-mT3A@mail.gmail.com>
In-Reply-To: <890a299c074fc83a02911583531d686257924be8.camel@freebsd.org>
References:  <CAF3txfi6Kz1y0a8jusGKrYFyMAER60xpkngQS0m-%2BY9G1RX5cw@mail.gmail.com> <890a299c074fc83a02911583531d686257924be8.camel@freebsd.org>

Thanks for replying, Ian.

The code already follows the fixed structure you described.

The failure shows up after running the test for 5 to 10 iterations: out of
2k commands posted to the hardware, only 1 or 2 fail to wake up properly.
A mutex lock is taken in both the wait_for_completion code and the
interrupt handler, and the doneflag is set correctly from interrupt
context.

I am running 16 threads with 2k commands simultaneously.

I have verified that command completion is correct on the hardware side;
the failure is in waking up the sleeping thread on the software side.

So I suspect an issue in the scheduler, which fails to wake up the sleeping
thread in this case.

Below is the current code structure:

/* Wait for completion */
mtx_lock(cmd_comp_lock);
completion = false;
ret = execute_cmd(cmd);
if (ret == 0) {
        while (!completion && err == 0)
                err = msleep(cmd, cmd_comp_lock, PCATCH, "cmd_xfer", hz);
}
mtx_unlock(cmd_comp_lock);

/* Interrupt handler */
mtx_lock_spin(cmd_lock);
// Disable interrupt
// Error checks
completion = true;
wakeup_one(cmd);
mtx_unlock_spin(cmd_lock);
// Enable interrupt


On Wed, Feb 19, 2020 at 8:12 PM Ian Lepore <ian@freebsd.org> wrote:

> On Wed, 2020-02-19 at 14:13 +0530, Arpan Palit wrote:
> > Hi,
> >
> > I am facing an issue where a wakeup routine call fails to wake up a
> > msleep_spin routine call that has a timeout value of 10 seconds.
> >
> > The real scenario is as follows: post a hardware command and sleep using
> > the msleep_spin routine until the interrupt comes. After getting the
> > interrupt, wake up the sleeping process using a wakeup_one/wakeup routine
> > call. With more than 2048 commands and 16 parallel threads running, I
> > observed that randomly *one or two of the posted commands* are *timing out*
> > even though the *interrupt has come and the wakeup routine was invoked*
> > after getting the interrupt for the same command.
> >
> > Note:
> > * The issue is not seen when the number of commands is less than 2048
> > with a timeout of 10 seconds.
> > * The issue can be seen with fewer commands as well when the timeout
> > value is 1 second.
> >
> > Can anyone please suggest an optimized way to schedule the process, or a
> > better way to do the scheduling?
> >
> > Thanks,
> > Arpan Palit
> >
>
> Is there any chance this is the classic "missed wakeup" scenario, where
> the wakeup happens before the thread enters msleep_spin()?  That can
> happen with code structured like
>
>   enqueue_request(req);
>   err = msleep_spin(req, etc);
>   /* Handle done req or timeout */
>
> and a fix is to structure the code using the same idiom required for
> pthread_cond_wait() in userland, something like:
>
>   req->doneflag = false;
>   enqueue_request(req);
>   while (!req->doneflag && err == 0)
>       err = msleep_spin(req, etc);
>   /* Handle done req or timeout */
>
> and of course on the interrupt handler side you need something like
>
>   /* lock mutex, do hardware stuff */
>   req->doneflag = true;
>   wakeup(req);
>   /* unlock mutex */
>
> -- Ian
>
>
>


