Date: Thu, 7 Jan 2021 10:02:56 +0100 From: Hans Petter Selasky <hps@selasky.org> To: Ryan Stone <rysto32@gmail.com>, Alexander Motin <mav@freebsd.org> Cc: src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>, svn-src-head <svn-src-head@freebsd.org> Subject: Re: svn commit: r322272 - head/sys/compat/linuxkpi/common/src Message-ID: <f785d26a-e907-7473-0777-7b65ae99b7c2@selasky.org> In-Reply-To: <CAFMmRNxNZEEWQ%2BnaVtjq%2Bk6OXzszCT6sDDrMW3GLWYB5VFx6uw@mail.gmail.com> References: <201708081936.v78JaY4m074051@repo.freebsd.org> <CAFMmRNxNZEEWQ%2BnaVtjq%2Bk6OXzszCT6sDDrMW3GLWYB5VFx6uw@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On 1/6/21 9:44 PM, Ryan Stone wrote: > On Tue, Aug 8, 2017 at 3:36 PM Alexander Motin <mav@freebsd.org> wrote: >> >> Author: mav >> Date: Tue Aug 8 19:36:34 2017 >> New Revision: 322272 >> URL: https://svnweb.freebsd.org/changeset/base/322272 >> >> Log: >> Fix few issues of LinuxKPI workqueue. >> >> LinuxKPI workqueue wrappers reported "successful" cancellation for works >> already completed in normal way. This change brings reported status and >> real cancellation fact into sync. This required for drm-next operation. >> >> Reviewed by: hselasky (earlier version) >> Sponsored by: iXsystems, Inc. >> Differential Revision: https://reviews.freebsd.org/D11904 >> > >> @@ -266,13 +267,14 @@ linux_delayed_work_timer_fn(void *arg) >> [WORK_ST_IDLE] = WORK_ST_IDLE, /* NOP */ >> [WORK_ST_TIMER] = WORK_ST_TASK, /* start queueing task */ >> [WORK_ST_TASK] = WORK_ST_TASK, /* NOP */ >> - [WORK_ST_EXEC] = WORK_ST_TASK, /* queue task another time */ >> - [WORK_ST_CANCEL] = WORK_ST_IDLE, /* complete cancel */ >> + [WORK_ST_EXEC] = WORK_ST_EXEC, /* NOP */ >> + [WORK_ST_CANCEL] = WORK_ST_TASK, /* failed to cancel */ >> }; >> struct delayed_work *dwork = arg; >> >> switch (linux_update_state(&dwork->work.state, states)) { >> case WORK_ST_TIMER: >> + case WORK_ST_CANCEL: >> linux_delayed_work_enqueue(dwork); >> break; >> default: > > I believe that this hunk introduced a regression into workqueue. > Consider the following scenario: > > 1. A delayed_work struct in the WORK_ST_TIMER state. > 2. Thread A calls mod_delayed_work() > 3. Thread B (a callout thread) simultaneously calls > linux_delayed_work_timer_fn() > > The following sequence of events is possible: > > A: Call linux_cancel_delayed_work() > A: Change state from TIMER TO CANCEL > B: Change state from CANCEL to TASK > B: taskqueue_enqueue() the task > A: taskqueue_cancel() the task > A: Call linux_queue_delayed_work_on(). This is a no-op because the > state is WORK_ST_TASK. > > As a result, the delayed_work struct will never be invoked. This is > causing address resolution in ib_addr.c to stop permanently, as it > never tries to reschedule a task that it thinks is already scheduled. > > Do you have a recommendation? Should we unconditionally > taskqueue_enqueue() when in the WORK_ST_TASK state and > linux_queue_delayed_work_on() is called? That is harmless for a > pending task but will break the deadlock if the race is lost. > Hi Ryan, Do you have a patch to fix this? --HPS
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?f785d26a-e907-7473-0777-7b65ae99b7c2>