Date: Wed, 11 Aug 2021 15:42:20 -0700 From: John Baldwin <jhb@FreeBSD.org> To: Luiz Otavio O Souza <loos.br@gmail.com> Cc: Scott Long <scottl@freebsd.org>, src-committers <src-committers@freebsd.org>, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Subject: Re: git: 35547df5c786 - main - Call wakeup() with the lock held to avoid missed wakeup races. Message-ID: <9587664b-5c82-cbe2-fb13-6de6abb9843d@FreeBSD.org> In-Reply-To: <CAJ8CS7pQ=HWB9%2By0H6sCECD4MUR1omnwu%2Bg_xBoYnt1OW2NtLg@mail.gmail.com> References: <202108102241.17AMfr63025604@gitrepo.freebsd.org> <9b1ec22c-1545-52ea-7bf9-a96140737711@FreeBSD.org> <CAJ8CS7pT9A=9OTvwTkxUfHxxB9MjWUF7=cy=yA1ANzNO9tptww@mail.gmail.com> <b68cf1db-bbc2-bfea-317a-11aa4f6daa78@FreeBSD.org> <CAJ8CS7pQ=HWB9%2By0H6sCECD4MUR1omnwu%2Bg_xBoYnt1OW2NtLg@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On 8/11/21 3:31 PM, Luiz Otavio O Souza wrote: > On Wed, Aug 11, 2021 at 7:03 PM John Baldwin <jhb@freebsd.org> wrote: >> >> On 8/11/21 2:38 PM, Luiz Otavio O Souza wrote: >>> On Wed, Aug 11, 2021 at 3:49 PM John Baldwin <jhb@freebsd.org> wrote: >>>> >>>> On 8/10/21 3:41 PM, Scott Long wrote: >>>>> The branch main has been updated by scottl: >>>>> >>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=35547df5c78653b2da030f920323c0357056099f >>>>> >>>>> commit 35547df5c78653b2da030f920323c0357056099f >>>>> Author: Scott Long <scottl@FreeBSD.org> >>>>> AuthorDate: 2021-08-10 22:36:38 +0000 >>>>> Commit: Scott Long <scottl@FreeBSD.org> >>>>> CommitDate: 2021-08-10 22:36:38 +0000 >>>>> >>>>> Call wakeup() with the lock held to avoid missed wakeup races. >>>>> >>>>> Submitted by: luiz >>>>> Sponsored by: Rubicon Communications, LLC ("Netgate") >>>>> --- >>>>> sys/dev/sdhci/sdhci.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/sys/dev/sdhci/sdhci.c b/sys/dev/sdhci/sdhci.c >>>>> index d075c2e05000..573e6949b57e 100644 >>>>> --- a/sys/dev/sdhci/sdhci.c >>>>> +++ b/sys/dev/sdhci/sdhci.c >>>>> @@ -2078,8 +2078,8 @@ sdhci_generic_release_host(device_t brdev __unused, device_t reqdev) >>>>> /* Deactivate led. */ >>>>> WR1(slot, SDHCI_HOST_CONTROL, slot->hostctrl &= ~SDHCI_CTRL_LED); >>>>> slot->bus_busy--; >>>>> - SDHCI_UNLOCK(slot); >>>>> wakeup(slot); >>>>> + SDHCI_UNLOCK(slot); >>>>> return (0); >>>>> } >>>> >>>> Hmm, how does this avoid a race? The sleep is checking bus_busy under >>>> the lock and should never see a stale value and go back to sleep after >>>> the wakeup has occurred: >>>> >>>> SDHCI_LOCK(slot); >>>> while (slot->bus_busy) >>>> msleep(slot, &slot->mtx, 0, "sdhciah", 0); >>>> slot->bus_busy++; >>>> /* Activate led. */ >>>> WR1(slot, SDHCI_HOST_CONTROL, slot->hostctrl |= SDHCI_CTRL_LED); >>>> SDHCI_UNLOCK(slot); >>>> >>>> Dropping the lock before wakeup() is a tiny optimization that avoids >>>> having the second thread wakeup and immediately block on the lock before >>>> it has been released by the first thread. >>>> >>> >>> 'race' is probably wrong here. this change will prevent a second >>> thread from taking the bus before you call wakeup() - poking all other >>> threads unnecessarily. >> >> This change does not prevent that. The other thread and the thread that >> are awakened will race with each other to acquire the lock. wakeup() >> doesn't do any sort of explicit lock handoff to the thread being awakened >> and it's just as likely for a thread not yet asleep to acquire the lock as >> for the thread being awakened to acquire the lock. If you have observed >> thundering herd problems with this wakeup() then you might want to change >> it to wakeup_one(). > > correct, but to be more specific, on the first thread, after you free > the bus and release the lock, a new thread can run, successfully > acquire the lock and grab the bus. at this point the first thread > resumes and call wakeup(). when that happens, the new thread always > wins, the threads being awakened won't have a chance. Perhaps on a uniprocessor system this might be true, but otherwise your new thread is likely spinning adaptively on the lock on another CPU and grabs it as soon as it is released before the thread awakened by wakeup() is even scheduled on a CPU and given a chance to run. That is, I suspect the new thread always wins even with this change on any multiprocessor system, but it now has to wait a bit longer before it wins. Have you observed some specific behavior with traces that this change seeks to address? -- John Baldwin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?9587664b-5c82-cbe2-fb13-6de6abb9843d>