Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 27 Jan 2016 18:18:16 +0100
From:      Wojciech Macek <wma@semihalf.com>
To:        developers@freebsd.org, freebsd-arm@freebsd.org
Cc:        Olivier Houchard <cognet@ci0.org>, arm64-dev <arm64-dev@semihalf.com>
Subject:   SCHED_ULE race condition, fix proposal
Message-ID:  <CANsEV8e2QbW1Y83eC-d3GczWu4Lu91jDK14Xa1FkL=Y2s%2BRBMQ@mail.gmail.com>

next in thread | raw e-mail | index | archive | help
Hello,

I've encountered a very nasty race condition during debugging armv8 HWPMC.
It seems that ULE scheduler can execute the same thread on two different
CPUs at the same time...

Here is the scenario.
The PMC driver must execute some of the code on the CPU0. To ensure that, a
process migration is triggered as following:


thread_lock(curthread);
sched_bind(curthread, cpu);
thread_unlock(curthread);

KASSERT(curthread->td_oncpu == cpu,
 ("[pmc,%d] CPU not bound [cpu=%d, curr=%d]", __LINE__,
 cpu, curthread->td_oncpu));


That causes the context switch and (finally) execution of sched_switch()
function. The code correctly detects migration and calls
sched_switch_migrate. That function is supposed to add current thread to
the runqueue of another CPU ("tdn" variable). So it does:

tdq_lock_pair(tdn, tdq);
tdq_add(tdn, td, flags);
tdq_notify(tdn, td);
TDQ_UNLOCK(tdn);
spinlock_exit();


But that sometimes is causing a crash, because the other CPU is staring to
process mi_switch as soon as the IPI arrives (via tdq_notify) and the
runqueue lock is released. The problem is, that the thread does not contain
valid register set, because its context was not yet stored - that happens
later in machine dependent cpu_switch function. In another words, the
sched_switch run on the CPU we want the thread to migrate onto restores
thread context before it was actually stored on another core - that causes
setting regs/pc/lt to some junk data and crash.


I'd like to discuss a possible solution for this. I think it would be
reasonable to extend cpu_switch to be capable of releasing a lock as the
last thing it does after storing everything into the PCB. We could then
remove the "TDQ_UNLOCK(tdn);" from the sched_switch_migrate and be sure
that in the situation of migration nobody is allowed to touch the target
runqueue until the migrating process finishes storing its context.

But first I'd like to discuss some possible alternatives and maybe find
another solution, because any change in this area will impact all supported
architectures.


Regards,
Wojtek
wma@freebsd.org



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANsEV8e2QbW1Y83eC-d3GczWu4Lu91jDK14Xa1FkL=Y2s%2BRBMQ>