Date: Wed, 27 Jan 2016 20:05:57 +0100
From: Wojciech Macek <wma@semihalf.com>
To: Olivier Houchard <cognet@ci0.org>
Cc: John Baldwin <jhb@freebsd.org>, hackers@freebsd.org, freebsd-arm@freebsd.org, arm64-dev <arm64-dev@semihalf.com>
Subject: Re: SCHED_ULE race condition, fix proposal
Message-ID: <CANsEV8djDi9eCtDby-gJt88yLYtdrYMEY5-gjg3axEUEchwybA@mail.gmail.com>
In-Reply-To: <20160127181426.GA48838@ci0.org>
References: <CANsEV8e2QbW1Y83eC-d3GczWu4Lu91jDK14Xa1FkL=Y2s+RBMQ@mail.gmail.com> <2587742.rOiGAYXjN1@ralph.baldwin.cx> <20160127181426.GA48838@ci0.org>
Holy... Good catch :) Indeed, blocked_lock is not a pointer. I'll check it
tomorrow at work, but it looks promising.

And the only reason it didn't cause any data abort is that the mutex
structure contains a pointer to a string (the name) at offset 0.
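For reference, the layout in question is roughly the following. This is a
from-memory sketch, not a copy of sys/sys/_lock.h and sys/sys/_mutex.h, so
treat the exact fields as an approximation; the only point is that the first
word of a struct mtx is the lock's name pointer, so dereferencing
blocked_lock reads a valid string pointer instead of faulting - and, as I
read it, the spin loop then compares td_lock against that name pointer,
never matches, and never actually waits.

    /*
     * Rough sketch of the relevant layouts (approximation of the real
     * FreeBSD headers, not copied from them).
     */
    #include <stdint.h>

    struct lock_object {
            const char      *lo_name;       /* lock name -- first member, offset 0 */
            unsigned int     lo_flags;
            unsigned int     lo_data;
            void            *lo_witness;    /* struct witness * in the real header */
    };

    struct mtx {
            struct lock_object lock_object; /* embedded, so also at offset 0 */
            volatile uintptr_t mtx_lock;
    };

    struct mtx blocked_lock;                /* an object, not a pointer to one */

    static void
    show_the_difference(void)
    {
            /* What "ldr x0, =blocked_lock; ldr x2, [x0]" computes: */
            uintptr_t wrong_sentinel = *(uintptr_t *)&blocked_lock; /* lo_name */
            /* What the spin loop should compare td_lock against: */
            uintptr_t right_sentinel = (uintptr_t)&blocked_lock;

            (void)wrong_sentinel;
            (void)right_sentinel;
    }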
Thanks,
Wojtek

2016-01-27 19:14 GMT+01:00 Olivier Houchard <cognet@ci0.org>:
> Hi,
>
> I may be reading the arm64 code wrong, but shouldn't:
>         ldr     x0, =_C_LABEL(blocked_lock)
>         ldr     x2, [x0]
>
> be just:
>         ldr     x2, =_C_LABEL(blocked_lock)
>
> Regards,
>
> Olivier
>
> On Wed, Jan 27, 2016 at 09:51:12AM -0800, John Baldwin wrote:
> > On Wednesday, January 27, 2016 06:18:16 PM Wojciech Macek wrote:
> > > Hello,
> > >
> > > I've encountered a very nasty race condition while debugging armv8 HWPMC.
> > > It seems that the ULE scheduler can execute the same thread on two
> > > different CPUs at the same time...
> > >
> > > Here is the scenario.
> > > The PMC driver must execute some of its code on CPU0. To ensure that, a
> > > process migration is triggered as follows:
> > >
> > >     thread_lock(curthread);
> > >     sched_bind(curthread, cpu);
> > >     thread_unlock(curthread);
> > >
> > >     KASSERT(curthread->td_oncpu == cpu,
> > >         ("[pmc,%d] CPU not bound [cpu=%d, curr=%d]", __LINE__,
> > >         cpu, curthread->td_oncpu));
> > >
> > > That causes a context switch and (finally) execution of the
> > > sched_switch() function. The code correctly detects the migration and
> > > calls sched_switch_migrate(). That function is supposed to add the
> > > current thread to the runqueue of another CPU (the "tdn" variable).
> > > So it does:
> > >
> > >     tdq_lock_pair(tdn, tdq);
> > >     tdq_add(tdn, td, flags);
> > >     tdq_notify(tdn, td);
> > >     TDQ_UNLOCK(tdn);
> > >     spinlock_exit();
> > >
> > > But that sometimes causes a crash, because the other CPU starts to
> > > process mi_switch as soon as the IPI arrives (via tdq_notify) and the
> > > runqueue lock is released. The problem is that the thread does not
> > > contain a valid register set, because its context has not yet been
> > > stored - that happens later, in the machine-dependent cpu_switch
> > > function. In other words, the sched_switch run on the CPU we want the
> > > thread to migrate onto restores the thread context before it was
> > > actually stored on the other core - that sets regs/pc/lt to junk data
> > > and crashes.
> > >
> > > I'd like to discuss a possible solution for this. I think it would be
> > > reasonable to extend cpu_switch to be capable of releasing a lock as
> > > the last thing it does after storing everything into the PCB. We could
> > > then remove the "TDQ_UNLOCK(tdn);" from sched_switch_migrate and be
> > > sure that, in the migration case, nobody is allowed to touch the
> > > target runqueue until the migrating process finishes storing its
> > > context.
> > >
> > > But first I'd like to discuss some possible alternatives and maybe
> > > find another solution, because any change in this area will impact
> > > all supported architectures.
> >
> > This belongs on hackers, not developers@.
> >
> > cpu_switch() already does what you describe, though in a slightly
> > different way. The thread_lock() of a thread being switched out is set
> > to blocked_lock. cpu_switch() on the new CPU will always spin until
> > cpu_switch updates the thread_lock of the old thread to point to the
> > proper runq lock after saving its state in the pcb. arm64 does this here:
> >
> >         /*
> >          * Release the old thread. This doesn't need to be a store-release
> >          * as the above dsb instruction will provide release semantics.
> >          */
> >         str     x2, [x0, #TD_LOCK]
> > #if defined(SCHED_ULE) && defined(SMP)
> >         /* Read the value in blocked_lock */
> >         ldr     x0, =_C_LABEL(blocked_lock)
> >         ldr     x2, [x0]
> > 1:
> >         ldar    x3, [x1, #TD_LOCK]
> >         cmp     x3, x2
> >         b.eq    1b
> > #endif
> >
> > Note the thread_lock_block() call just above the block you noted from
> > sched_switch_migrate() to see where td_lock is set to &blocked_lock.
> >
> > If the comment about 'dsb' above is wrong, that might explain why you
> > see stale state in the PCB after seeing the new value of td_lock.
> >
> > --
> > John Baldwin
>
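To restate the handoff John describes in C: below is a minimal sketch with
simplified types. td_lock, blocked_lock and TD_LOCK come from the thread
above; the struct definitions and the two helpers (release_old_thread,
wait_for_new_thread) are illustrative names of mine, not the actual
sched_ule.c or cpu_switch code.

    #include <stdatomic.h>

    struct mtx { int mtx_unused; };         /* stand-in for the real struct mtx */
    static struct mtx blocked_lock;         /* sentinel: "context not saved yet" */

    struct thread {
            struct mtx *_Atomic td_lock;    /* which lock protects this thread now */
            /* ... the saved context (PCB) lives elsewhere ... */
    };

    static void
    release_old_thread(struct thread *oldtd, struct mtx *runq_lock)
    {
            /*
             * Outgoing CPU, tail of cpu_switch: only after the old thread's
             * context is in its PCB is td_lock moved from &blocked_lock to
             * the target run queue's lock ("str x2, [x0, #TD_LOCK]" above).
             */
            atomic_store_explicit(&oldtd->td_lock, runq_lock,
                memory_order_release);
    }

    static void
    wait_for_new_thread(struct thread *td)
    {
            /*
             * Incoming CPU, head of cpu_switch: do not touch the new thread
             * until td_lock stops pointing at blocked_lock (the ldar/cmp/b.eq
             * loop).  Note the comparison is against the *address* of
             * blocked_lock, which is exactly Olivier's point about the
             * extra load.
             */
            while (atomic_load_explicit(&td->td_lock, memory_order_acquire) ==
                &blocked_lock)
                    ;                       /* cpu_spinwait() in the real code */
    }

Read this way, once the spin loop compares against the address of
blocked_lock (Olivier's one-line fix), the existing handoff should already
provide the guarantee the proposed cpu_switch change was meant to add.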