From: Warner Losh
Date: Mon, 3 Jun 2013 22:42:39 -0600
Subject: Re: Kernelspace C11 atomics for MIPS
To: Patrick Kelsey
Cc: Juli Mallett, Ed Schouten, Adrian Chadd, "freebsd-mips@FreeBSD.org", FreeBSD-arch
Message-Id: <05C98B6B-1531-4E09-80D7-4F3B1A88FF01@bsdimp.com>
List-Id: Discussion related to FreeBSD architecture

On Jun 3, 2013, at 10:15 PM, Patrick Kelsey wrote:

> On Tue, Jun 4, 2013 at 12:08 AM, Patrick Kelsey wrote:
>> On Mon, Jun 3, 2013 at 11:57 PM, Adrian Chadd wrote:
>>> On 3 June 2013 20:55, Juli Mallett wrote:
>>>
>>>> To drain the pipeline on certain deficient (and mostly older) CPUs by way of
>>>> guesswork and a little vague magic. Most CPUs we support, I would guess, do
>>>> not need this, and it continues to exist solely for hysterical reasons.
>>>
>>> How can I turn it off for my compiles?
>>>
>>>> I've certainly gotten rid of them and some other cargo cult synchronization
>>>> on Octeon for testing and had it survive under considerable load, and
>>>> occasionally with some slight speedups (for some more commonly-used or
>>>> slower things than Just a Bunch Of NOPs.)
>>>
>>> Right. Well, since it's happening on every inlined lock, it's a bit silly.
>>>
>>>> The trouble is that proving they aren't necessary requires being rigorous
>>>> and careful in understanding documentation and errata, and FUD about their
>>>> possible necessity is somewhat intimidating. It's not an easy kind of
>>>> corruption/unreliability/etc., to prove the lack of empirically.
>>>
>>> I've checked the disassembly from gcc-4.mumble on linux; it doesn't
>>> include NOPs like this as far as I can tell.
>>>
>>
>> The sync + 8 nops is coming from the definition of mips_sync() in
>> sys/mips/include/atomic.h.

They came from the old mips2 branch, which may have been from the Juniper code merge, or maybe not.

>> I agree with Juli that it appears to be a manual pipeline-flush
>> holdover from earlier days - I'm guessing there's 8 nops because the
>> R4000/4400 had both the sync instruction and an 8-stage pipeline. I'm
>> further guessing this was an attempt at providing stronger ordering
>> semantics than the sync instruction itself for the following
>> mb()/wmb()/rmb() definitions that use it, as the sync instruction
>> definition doesn't restrict execution of the before/after loads/stores
>> with respect to the sync instruction itself.
>
> Forgot to emphasize that this particular bit of old-school
> nop-counting is either pointless or a latent hazard - 8 does not cover
> the deepest MIPS pipeline around, then there's superscalar issue to
> consider - so I think it's either unnecessary or insufficient. So
> far, that's all criticism and no solution :/

Yes, there are new nops for these situations starting in the mips32r2 and mips64r2 ISAs.

I think that this originated in the mips-jnpr merge, but can't find the old branches anywhere...

Warner
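
For anyone following along, here is a rough sketch of the shape being discussed: a sync-based barrier with optional nop padding, and mb()/wmb()/rmb() built on top of it. This is not the code from sys/mips/include/atomic.h. Every name here (sketch_sync(), SKETCH_SYNC_NOPS, sketch_demo() and friends) is made up for illustration, and the non-MIPS fallback to a C11 fence is an assumption added only so the snippet builds elsewhere.

```c
#include <stdatomic.h>

/*
 * Hypothetical knob, not a FreeBSD option: how many nops to pad after
 * sync. 0 relies on sync alone; 8 would mimic the padding under
 * discussion in this thread.
 */
#ifndef SKETCH_SYNC_NOPS
#define SKETCH_SYNC_NOPS 0
#endif

#define SKETCH_STR2(x) #x
#define SKETCH_STR(x) SKETCH_STR2(x)

#if SKETCH_SYNC_NOPS > 0
/* Assembler-level repeat so the nop count tracks the macro. */
#define SKETCH_PAD ".rept " SKETCH_STR(SKETCH_SYNC_NOPS) "\n\tnop\n\t.endr\n\t"
#else
#define SKETCH_PAD ""
#endif

static inline void
sketch_sync(void)
{
#if defined(__mips__)
	/* noreorder keeps the assembler from touching the padding. */
	__asm__ __volatile__(
	    ".set push\n\t"
	    ".set noreorder\n\t"
	    "sync\n\t"
	    SKETCH_PAD
	    ".set pop"
	    : : : "memory");
#else
	/* Assumed stand-in on other targets: a full C11 fence. */
	atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* All three barriers funnel into the same primitive in this sketch. */
static inline void sketch_mb(void)  { sketch_sync(); }
static inline void sketch_wmb(void) { sketch_sync(); }
static inline void sketch_rmb(void) { sketch_sync(); }

static int sketch_data;
static int sketch_ready;

/* Single-threaded walk through a publish/consume pattern. */
static inline int
sketch_demo(void)
{
	sketch_data = 42;
	sketch_wmb();		/* order the data store before the flag store */
	sketch_ready = 1;
	sketch_rmb();		/* order the flag load before the data load */
	return (sketch_ready ? sketch_data : -1);
}
```

Building with -DSKETCH_SYNC_NOPS=8 versus the default of 0 and comparing the disassembly of an inlined lock path is one way to see how much the padding costs per call site. It does not settle whether the padding is needed on a given core.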