From owner-svn-src-all@FreeBSD.ORG Thu Jan 28 10:16:57 2010 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 399B5106566B; Thu, 28 Jan 2010 10:16:57 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-iw0-f178.google.com (mail-iw0-f178.google.com [209.85.223.178]) by mx1.freebsd.org (Postfix) with ESMTP id D1CEE8FC15; Thu, 28 Jan 2010 10:16:56 +0000 (UTC) Received: by iwn8 with SMTP id 8so598586iwn.13 for ; Thu, 28 Jan 2010 02:16:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=SXj8qvcvGsl3huRdFZRbHPX6jtNcNbewVdAXYix7DU4=; b=WKixgvDbN91TyNkHdl4yn4cxvug4RBr3VQMrU0L5qSm/VmMPLvYmdJXxE4kbXzERKw 4s0BhTZNHjHWzVl19Ugs76TyZNUMjloFYmvxTHFflXAMck8ytUq6ChmwX593fEo9BwZX 84oGSgtPfb84NLIR2+bzk3KBk4C8bF5vEJZZ0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=WswmBWQDMkgrAMSBaPbTMdGW7cTMX47X2Cf5MX+Ku8JUFMkxenKg1ttSwSSmQyxBhT Ew+hizURgB3AcTRPU6P2CvQBnREGwe/+5zA5WsqDhTdqB65vlptuStGrdQU/ufLDDqp9 zvbNt+Lwb40d+DVe0HljeN0E0YXBBtwKazW+U= MIME-Version: 1.0 Sender: asmrookie@gmail.com Received: by 10.231.154.197 with SMTP id p5mr194257ibw.28.1264673815928; Thu, 28 Jan 2010 02:16:55 -0800 (PST) In-Reply-To: <20100127215904.GF40779@alchemy.franken.de> References: <201001231554.o0NFsMbx049837@svn.freebsd.org> <3bbf2fe11001252310r408a6be4j9bc42618394b3e3d@mail.gmail.com> <20100127215904.GF40779@alchemy.franken.de> Date: Thu, 28 Jan 2010 11:16:55 +0100 X-Google-Sender-Auth: 09c0f78f7782114d Message-ID: <3bbf2fe11001280216p705ed94ev61abc4be654f8cc1@mail.gmail.com> From: Attilio Rao To: Marius Strobl Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, Rob Farmer , src-committers@freebsd.org Subject: Re: svn commit: r202889 - head/sys/kern X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 28 Jan 2010 10:16:57 -0000 2010/1/27 Marius Strobl : > On Tue, Jan 26, 2010 at 08:10:25AM +0100, Attilio Rao wrote: >> 2010/1/26 Rob Farmer : >> > On Sat, Jan 23, 2010 at 7:54 AM, Attilio Rao wro= te: >> >> Author: attilio >> >> Date: Sat Jan 23 15:54:21 2010 >> >> New Revision: 202889 >> >> URL: http://svn.freebsd.org/changeset/base/202889 >> >> >> >> Log: >> >> =C2=A0- Fix a race in sched_switch() of sched_4bsd. >> >> =C2=A0 =C2=A0In the case of the thread being on a sleepqueue or a tur= nstile, the >> >> =C2=A0 =C2=A0sched_lock was acquired (without the aid of the td_lock = interface) and >> >> =C2=A0 =C2=A0the td_lock was dropped. This was going to break locking= rules on other >> >> =C2=A0 =C2=A0threads willing to access to the thread (via the td_lock= interface) and >> >> =C2=A0 =C2=A0modify his flags (allowed as long as the container lock = was different >> >> =C2=A0 =C2=A0by the one used in sched_switch). >> >> =C2=A0 =C2=A0In order to prevent this situation, while sched_lock is = acquired there >> >> =C2=A0 =C2=A0the td_lock gets blocked. [0] >> >> =C2=A0- Merge the ULE's internal function thread_block_switch() into = the global >> >> =C2=A0 =C2=A0thread_lock_block() and make the former semantic as the = default for >> >> =C2=A0 =C2=A0thread_lock_block(). This means that thread_lock_block()= will not >> >> =C2=A0 =C2=A0disable interrupts when called (and consequently thread_= unlock_block() >> >> =C2=A0 =C2=A0will not re-enabled them when called). This should be do= ne manually >> >> =C2=A0 =C2=A0when necessary. >> >> =C2=A0 =C2=A0Note, however, that ULE's thread_unblock_switch() is not= reaped >> >> =C2=A0 =C2=A0because it does reflect a difference in semantic due in = ULE (the >> >> =C2=A0 =C2=A0td_lock may not be necessarilly still blocked_lock when = calling this). >> >> =C2=A0 =C2=A0While asymmetric, it does describe a remarkable differen= ce in semantic >> >> =C2=A0 =C2=A0that is good to keep in mind. >> >> >> >> =C2=A0[0] Reported by: =C2=A0 =C2=A0 =C2=A0Kohji Okuno >> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 >> >> =C2=A0Tested by: =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Giovanni Tr= ematerra >> >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 >> >> =C2=A0MFC: =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A02 weeks >> >> >> >> Modified: >> >> =C2=A0head/sys/kern/kern_mutex.c >> >> =C2=A0head/sys/kern/sched_4bsd.c >> >> =C2=A0head/sys/kern/sched_ule.c >> > >> > Hi, >> > >> > This commit seems to be causing me a kernel panic on sparc64 - details >> > are in PR 143215. Could you take a look before MFCing this? >> >> I think that the bug may be in cpu_switch() where the mutex parameter >> for sched_4bsd is not handled correctly. >> Does sparc64 support ULE? I don't think it does and I think that it >> simply ignores the third argument of cpu_switch() which is vital now >> for for sched_4bsd too (what needs to happen is to take the passed >> mutex and to set the TD_LOCK of old thread to be the third argument). >> Unluckilly, I can't do that in sparc64 asm right now, but it should >> not be too difficult to cope with it. >> > > The following patch adds handling of the mutex parameter to the > sparc64 cpu_switch(): > http://people.freebsd.org/~marius/sparc64_cpu_switch_mtx.diff > This patch works fine with r202888. With r202889 it allows the > machine to boot again, however putting some load on the machine > causes it to issue a reset without a chance to debug. I've also > tried with some variations like duplicating the old cpu_switch() > for cpu_throw() so the altered cpu_switch() doesn't need to > distinguish between the to cases and only assigning old->td_lock > right before return but nothing made a difference. Given that > this leaves little room for a bug in the cpu_switch() changes I > suspect r202889 also breaks additional assumptions. For example > the sparc64 pmap code used sched_lock, does that need to change > to be td_lock now maybe? Is there anything else that comes to > your mind in this regard? Sorry for being lame with sparc64 assembly (so that I can't make much more productive help here), but the required patch, sched_4bsd only, should simply save the extra-argument of cpu_switch() (and cpu_throw() is not involved, so I'm not sure what is changing there) and move in TD_LOCK(%oldthreadreg) when it is safe to do (just after the oldthread switched out completely). It doesn't even require a memory barrier. This patch seems a bit too big and I wonder what else it does (or I'm entirely wrong and that's just what I asked here), maybe adding the ULE support as well? Said that, all the code, including MD parts should always use td_lock() and not doing explicit acquisitions/drops of sched_lock, if they want to support ULE (but probabilly even if they do not want), unless there is a big compelling reason (that I expect to be justified in comments at least). I'm not sure how to debug a possible reset or I'm not aware of further broken asserts, at least for tier-1 architectures. Thanks, Attilio --=20 Peace can only be achieved by understanding - A. Einstein