From owner-freebsd-hackers@freebsd.org Wed Jan 27 19:06:18 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3AB22A6ED68 for ; Wed, 27 Jan 2016 19:06:18 +0000 (UTC) (envelope-from wma@semihalf.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 1ADCE1112 for ; Wed, 27 Jan 2016 19:06:18 +0000 (UTC) (envelope-from wma@semihalf.com) Received: by mailman.ysv.freebsd.org (Postfix) id 17DC9A6ED65; Wed, 27 Jan 2016 19:06:18 +0000 (UTC) Delivered-To: hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id F1AB8A6ED63 for ; Wed, 27 Jan 2016 19:06:17 +0000 (UTC) (envelope-from wma@semihalf.com) Received: from mail-ig0-x236.google.com (mail-ig0-x236.google.com [IPv6:2607:f8b0:4001:c05::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C53731110 for ; Wed, 27 Jan 2016 19:06:17 +0000 (UTC) (envelope-from wma@semihalf.com) Received: by mail-ig0-x236.google.com with SMTP id mw1so18648940igb.1 for ; Wed, 27 Jan 2016 11:06:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=semihalf-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=mZb7Cg9biXswox7H7/qDgpQh9w6eLXJpd6+BsxavL8U=; b=zQjQgneo+qDINVbqBikQCPvD776CLR1LONJyfsHQbeb+qWnO0FOqEPMMADXBGKZer2 DxrkMzgeXnZ3veCzlmAAmeCBunUsux/VWMt/AdgDJ30NwQuXqxnUWsf4pP4wLZVLOwIq WF9TzXakGhCWo1W/K2jxjWQ8AdcsBdAiL4tqYOUJCaus7OuMXNKpIfUPd+y+QZ0LJxvV x6WbF0RUwLaFOmZquOVyCIlCJz01HbNDDCM3euxCaV9acdL4SaTo1TmzAwViSbA83wOp CnS7m/A11D6g2tZAU9sAKBSH9lLvsh/UFmfQfnzCEYAh5gV+jboDl0Ncp/Ue1QIHAGeT 66Tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=mZb7Cg9biXswox7H7/qDgpQh9w6eLXJpd6+BsxavL8U=; b=gnts/Q2uOkC6d0A/1PEUflh3jXivFVz3ThAVkcPSfk4L4Y5Jzgogc3c4wPi4flxLJz 0TP5vwFTcxWPAK5PlfcEYmH4ocELon0lR1nnKBqv+3fCu9eNefZOrxXZ1jACKi/i2BK4 aPfroLA4OysmjrCYPEV49R207TuySHph9pmi8AWy5m1yQzlRKCm5gG/wOwDSwfjchmYl dzaWDnigZa65rNymxOO6KmKHhB/ltHA+s5DOSQuAaK2mD6uhjOCFfQK7t60Su0O0Ri8m AlF1fhGh143pghmUhxN2ohxDYFC4AZKk6OQHTTTZGrgtV1JJgcY46VDxZ9k5PxN+D6TU 4P8A== X-Gm-Message-State: AG10YOS51xbWRB75zSPCE1CLbW2SO+/0hg7c3Zmi4jG+FYiBN42ZlecBVJcgAG/PKyWvuuZq4NBU29rKipcd6Q== X-Received: by 10.50.136.136 with SMTP id qa8mr30316937igb.39.1453921577007; Wed, 27 Jan 2016 11:06:17 -0800 (PST) MIME-Version: 1.0 Received: by 10.107.154.19 with HTTP; Wed, 27 Jan 2016 11:05:57 -0800 (PST) In-Reply-To: <20160127181426.GA48838@ci0.org> References: <2587742.rOiGAYXjN1@ralph.baldwin.cx> <20160127181426.GA48838@ci0.org> From: Wojciech Macek Date: Wed, 27 Jan 2016 20:05:57 +0100 Message-ID: Subject: Re: SCHED_ULE race condition, fix proposal To: Olivier Houchard Cc: John Baldwin , hackers@freebsd.org, freebsd-arm@freebsd.org, arm64-dev X-Mailman-Approved-At: Wed, 27 Jan 2016 19:31:38 +0000 Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Jan 2016 19:06:18 -0000 Holly... Good catch :) Indeed, blocked_lock is not a pointer. I'll check it tommorow at work, but it looks promissing. And the only reason it didn't cause any data abort is that the mutex structure cointains a pointer to a string (name) at the offset 0. Thanks, Wojtek 2016-01-27 19:14 GMT+01:00 Olivier Houchard : > Hi, > > I may be reading the arm64 code wrong, but shouldn't : > ldr x0, =_C_LABEL(blocked_lock) > ldr x2, [x0] > > be just: > ldr x2, =_C_LABEL(blocked_lock) > > Regards, > > Olivier > > On Wed, Jan 27, 2016 at 09:51:12AM -0800, John Baldwin wrote: > > On Wednesday, January 27, 2016 06:18:16 PM Wojciech Macek wrote: > > > Hello, > > > > > > I've encountered a very nasty race condition during debugging armv8 > HWPMC. > > > It seems that ULE scheduler can execute the same thread on two > different > > > CPUs at the same time... > > > > > > Here is the scenario. > > > The PMC driver must execute some of the code on the CPU0. To ensure > that, a > > > process migration is triggered as following: > > > > > > > > > thread_lock(curthread); > > > sched_bind(curthread, cpu); > > > thread_unlock(curthread); > > > > > > KASSERT(curthread->td_oncpu == cpu, > > > ("[pmc,%d] CPU not bound [cpu=%d, curr=%d]", __LINE__, > > > cpu, curthread->td_oncpu)); > > > > > > > > > That causes the context switch and (finally) execution of > sched_switch() > > > function. The code correctly detects migration and calls > > > sched_switch_migrate. That function is supposed to add current thread > to > > > the runqueue of another CPU ("tdn" variable). So it does: > > > > > > tdq_lock_pair(tdn, tdq); > > > tdq_add(tdn, td, flags); > > > tdq_notify(tdn, td); > > > TDQ_UNLOCK(tdn); > > > spinlock_exit(); > > > > > > > > > But that sometimes is causing a crash, because the other CPU is > staring to > > > process mi_switch as soon as the IPI arrives (via tdq_notify) and the > > > runqueue lock is released. The problem is, that the thread does not > contain > > > valid register set, because its context was not yet stored - that > happens > > > later in machine dependent cpu_switch function. In another words, the > > > sched_switch run on the CPU we want the thread to migrate onto restores > > > thread context before it was actually stored on another core - that > causes > > > setting regs/pc/lt to some junk data and crash. > > > > > > > > > I'd like to discuss a possible solution for this. I think it would be > > > reasonable to extend cpu_switch to be capable of releasing a lock as > the > > > last thing it does after storing everything into the PCB. We could then > > > remove the "TDQ_UNLOCK(tdn);" from the sched_switch_migrate and be sure > > > that in the situation of migration nobody is allowed to touch the > target > > > runqueue until the migrating process finishes storing its context. > > > > > > But first I'd like to discuss some possible alternatives and maybe find > > > another solution, because any change in this area will impact all > supported > > > architectures. > > > > This belongs on hackers, not developers@. > > > > cpu_switch() already does what you describe though in a slightly > different > > way. The thread_lock() of a thread being switched out is set to > blocked_lock. > > cpu_switch() on the new CPU will always spin until cpu_switch updates > > thread_lock of the old thread to point to the proper runq lock after > saving > > its state in the pcb. arm64 does this here: > > > > /* > > * Release the old thread. This doesn't need to be a > store-release > > * as the above dsb instruction will provide release semantics. > > */ > > str x2, [x0, #TD_LOCK] > > #if defined(SCHED_ULE) && defined(SMP) > > /* Read the value in blocked_lock */ > > ldr x0, =_C_LABEL(blocked_lock) > > ldr x2, [x0] > > 1: > > ldar x3, [x1, #TD_LOCK] > > cmp x3, x2 > > b.eq 1b > > #endif > > > > Note the thread_lock_block() call just above the block you noted from > > sched_switch_migrate() to see where td_lock is set to &blocked_lock. > > > > If the comment about 'dsb' above is wrong that might explain why you see > > stale state in the PCB after seeing the new value of td_lock. > > > > -- > > John Baldwin >