Date: Mon, 4 Jun 2007 22:32:46 -0700 (PDT)
From: Jeff Roberson <jroberson@chesapeake.net>
To: arch@freebsd.org
Cc: kmacy@freebsd.org, benno@freebsd.org, marius@freebsd.org, marcl@freebsd.org, jake@freebsd.org, tmm@freebsd.org, cognet@freebsd.org, grehan@freebsd.org
Subject: New cpu_switch() and cpu_throw().

Every architecture needs to support new features in cpu_switch() and cpu_throw() before it can support per-cpu schedlock. I'll describe them below. I'm soliciting help or advice in implementing these on platforms other than x86 and amd64, especially on ia64, where things are implemented in C! I checked in the new version of cpu_switch() for amd64 today after threadlock went in.

Basically, we have to release a thread's lock when it's switched out and acquire a lock when it's switched in. The release must happen after we're totally done with the stack and vmspace of the thread being switched out; on amd64 this means after we clear the active bits for TLB shootdown. The release actually makes use of a new 'mtx' argument to cpu_switch() and sets the td_lock pointer to this argument rather than unlocking a real lock. td_lock will previously have been set to the blocked lock, which is always blocked. Threads spinning in thread_lock() will notice the td_lock pointer change and acquire the new lock. So the release is simple: just a non-atomic store of a pointer passed as an argument. On amd64:

	movq	%rdx, TD_LOCK(%rdi)	/* Release the old thread */

The acquire part is slightly more complicated and involves a little loop. We don't actually have to spin trying to lock the thread; we just spin until its td_lock is no longer set to the blocked lock. The switching thread already owns the per-cpu scheduler lock for the current cpu. If we're switching into a thread whose td_lock is set to blocked_lock, another cpu is about to set it to our current cpu's lock via the mtx argument mentioned above. On amd64 we have:

	/* Wait for the new thread to become unblocked */
	movq	$blocked_lock, %rdx
1:
	movq	TD_LOCK(%rsi), %rcx
	cmpq	%rcx, %rdx
	je	1b

So these two are actually quite simple. You can see the full patch for cpu_switch.S as the first file modified in:

http://people.freebsd.org/~jeff/threadlock.diff
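For the ports that are written in C (as on ia64), the same handoff would look roughly like the sketch below. This is illustrative only, not a patch: the function name is made up for the example, it borrows the real names used above (td_lock, blocked_lock, the new 'mtx' argument), it leaves out all of the actual context-switch work, and it glosses over per-arch memory-ordering details.

	#include <sys/param.h>
	#include <sys/proc.h>
	#include <sys/mutex.h>
	#include <machine/cpu.h>

	/*
	 * Sketch of the td_lock handoff in cpu_switch().  'old' is the
	 * thread being switched out, 'new' the thread being switched in,
	 * and 'mtx' is the lock the caller wants 'old' released to.
	 */
	void
	cpu_switch_sketch(struct thread *old, struct thread *new, struct mtx *mtx)
	{
		/* ... save old's context; switch stacks and vmspace ... */

		/*
		 * Release the old thread.  A plain store is enough; anyone
		 * spinning in thread_lock() watches for td_lock to change
		 * away from &blocked_lock.  This must happen only after we
		 * are completely done with old's stack and vmspace.
		 */
		old->td_lock = mtx;

		/*
		 * Acquire the new thread.  We already hold our cpu's
		 * scheduler lock, so we only wait until the cpu releasing
		 * 'new' points its td_lock at a real lock.  (td_lock must
		 * be volatile-qualified for the loop to re-read it.)
		 */
		while (new->td_lock == &blocked_lock)
			cpu_spinwait();

		/* ... restore new's context and return on its stack ... */
	}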
For cpu_throw() we have to actually complete a real unlock of a spinlock. What happens here, although it isn't in CVS yet, is that thread_exit() will set the exiting thread's lock pointer to the per-process spinlock. Holding this spinlock keeps wait() from reclaiming the process's resources while the thread is still executing in cpu_throw(), so cpu_throw() itself must drop it as the last thing it does with the old thread. This code on amd64 is (from memory rather than a patch):

	movq	$MTX_UNOWNED, %rdx
	movq	TD_LOCK(%rsi), %rsi
	xchgq	%rdx, MTX_LOCK(%rsi)

(A rough C rendering of this unlock is sketched at the end of this message.)

I'm hoping to have at least the cpu_throw() part done for every architecture for 7.0. That will let me simplify thread_exit() instead of carrying a lot of per-scheduler and per-architecture workarounds. Without the cpu_switch() parts, sched_4bsd will still work on an architecture.

I have a per-cpu spinlock version of ULE which may replace ULE or exist alongside it as sched_smp. It will only work on architectures that implement the new cpu_throw() and cpu_switch().

Consider this an official call for help with the architectures you maintain. Please also let me know if you maintain an arch that you don't mind having temporarily broken until you implement this.

Thanks,
Jeff
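The cpu_throw() unlock referenced above, rendered as a rough C sketch for the C-based ports. This is from the description rather than real code: the function name is hypothetical, it assumes thread_exit() has already pointed the exiting thread's td_lock at the per-process spinlock, and it stands in an atomic release store where the amd64 code uses xchgq.

	#include <sys/param.h>
	#include <sys/proc.h>
	#include <sys/mutex.h>
	#include <machine/atomic.h>

	/*
	 * Sketch of the final unlock in cpu_throw().  'td' is the exiting
	 * thread; thread_exit() has already set td->td_lock to the
	 * per-process spinlock.
	 */
	void
	cpu_throw_unlock_sketch(struct thread *td)
	{
		struct mtx *m = td->td_lock;

		/*
		 * Drop the spinlock so wait() may reclaim the process.
		 * After this store nothing may touch td or its stack, so
		 * in a real cpu_throw() this has to be the very last
		 * thing done on the old context.
		 */
		atomic_store_rel_ptr(&m->mtx_lock, (uintptr_t)MTX_UNOWNED);
	}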