From owner-freebsd-arch@FreeBSD.ORG Sun Mar 1 20:40:22 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C44EC72F; Sun, 1 Mar 2015 20:40:22 +0000 (UTC) Received: from mailout.easymail.ca (mailout.easymail.ca [64.68.201.169]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6F1ED132; Sun, 1 Mar 2015 20:40:21 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mailout.easymail.ca (Postfix) with ESMTP id A074DE30D; Sun, 1 Mar 2015 15:40:20 -0500 (EST) X-Virus-Scanned: Debian amavisd-new at mailout.easymail.ca X-Spam-Flag: NO X-Spam-Score: -3.851 X-Spam-Level: X-Spam-Status: No, score=-3.851 required=5 tests=[ALL_TRUSTED=-1.8, AWL=-0.144, BAYES_00=-2.599, DNS_FROM_AHBL_RHSBL=0.692] Received: from mailout.easymail.ca ([127.0.0.1]) by localhost (easymail-mailout.easydns.vpn [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id hPPjDZ216Zch; Sun, 1 Mar 2015 15:40:20 -0500 (EST) Received: from bsddt1241.lv01.astrodoggroup.com (unknown [40.141.24.126]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mailout.easymail.ca (Postfix) with ESMTPSA id 335D5E2FF; Sun, 1 Mar 2015 15:40:20 -0500 (EST) Message-ID: <54F37897.8040009@astrodoggroup.com> Date: Sun, 01 Mar 2015 12:37:43 -0800 From: Harrison Grundy User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org, freebsd-arch@freebsd.org Subject: Call for testers: Fix incorrect error codes in connect() Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related 
to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 01 Mar 2015 20:40:22 -0000 connect(), in both the man page and under POSIX is documented to only return EINVAL for invalid lengths of the namelen parameter, which is a fatal error. As it stands now, it will also return EINVAL when called on TIMEWAIT or DROPPED sockets. The patch at https://reviews.freebsd.org/D1982 changes connect to return EADDRINUSE on time-wait and ECONNREFUSED on dropped. (Different values may make more sense here, POSIX doesn't seem to specify.) If anyone has time to run the patch attached to D1982, it'd be greatly appreciated as I'm trying to find out just how much (if any) software currently depends on the old broken behavior. Any input as to what connect should return in these cases is also appreciated. --- Harrison From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 13:31:22 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A7C5F491 for ; Mon, 2 Mar 2015 13:31:22 +0000 (UTC) Received: from work.netasq.com (gwlille.netasq.com [91.212.116.1]) by mx1.freebsd.org (Postfix) with ESMTP id 6A4A680A for ; Mon, 2 Mar 2015 13:31:21 +0000 (UTC) Received: from work.netasq.com (localhost.localdomain [127.0.0.1]) by work.netasq.com (Postfix) with ESMTP id D3D47270087E; Mon, 2 Mar 2015 14:31:13 +0100 (CET) Received: from localhost (localhost.localdomain [127.0.0.1]) by work.netasq.com (Postfix) with ESMTP id A07752700964; Mon, 2 Mar 2015 14:31:13 +0100 (CET) Received: from work.netasq.com ([127.0.0.1]) by localhost (work.netasq.com [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id 96OPZqbf4xhJ; Mon, 2 Mar 2015 14:31:13 +0100 (CET) Received: from work.netasq.com (localhost.localdomain [127.0.0.1]) by work.netasq.com 
(Postfix) with ESMTP id 7135E270087E; Mon, 2 Mar 2015 14:31:13 +0100 (CET) Date: Mon, 2 Mar 2015 14:31:13 +0100 (CET) From: Emeric POUPON To: John-Mark Gurney Message-ID: <1824482166.23183751.1425303073196.JavaMail.zimbra@stormshield.eu> In-Reply-To: <20150224012026.GY46794@funkthat.com> References: <20150224012026.GY46794@funkthat.com> Subject: Re: locks and kernel randomness... MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Thread-Topic: locks and kernel randomness... Thread-Index: gLAKh9vz8YZYTxIQs0bz3dppjgAUkQ== Cc: arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 13:31:22 -0000

Hello,

About arc4random, we have noticed significant contention in that function on multi CPU systems when ciphering a lot of packets in the IPsec stack.
This is indeed due to the mutex that is being used in the arc4rand function.

Actually randomness is required by the IV used in the forged output packets.
However, making a separate random generator per CPU might be more complicated than expected.
The RFC 6027 (http://www.ietf.org/rfc/rfc6027.txt) reminds that the IV must not be repeated :
---
3.7.1. Outbound SAs Using Counter Modes

For SAs involving counter mode ciphers such as Counter Mode (CTR)
([RFC3686]) or Galois/Counter Mode (GCM) ([RFC4106]) there is yet
another complication. The initial vector for such modes MUST NOT be
repeated, and senders use methods such as counters or linear feedback
shift registers (LFSRs) to ensure this [...]
---

What do you think?

Emeric Poupon

----- Original message -----
From: "John-Mark Gurney"
To: arch@FreeBSD.org
Sent: Tuesday, 24 February 2015 02:20:26
Subject: locks and kernel randomness...

I'm working on simplifying kernel randomness interfaces.
I would like to get rid of all weak random generators, and this means replacing read_random and random(9) w/ effectively arc4rand(9) (to be replaced by ChaCha or Keccak in the future).

The issue is that random(9) is called from any number of contexts, such as the scheduler. This makes locking a bit more interesting. Currently, both arc4rand(9) and yarrow/fortuna use a default mtx lock to protect their state. This obviously isn't compatible w/ the scheduler, and possibly other calling contexts.

I have a patch[1] that unifies the random interface. It converts a few of the locks from mtx default to mtx spin to deal w/ this.

If/when this is accepted, my next plan is to convert away from arc4rand, to either ChaCha or Keccak. I already have another patch that converts arc4rand and friends over to ChaCha. This patch does use PCPU data and sched_pin to help eliminate locks, but this does need more study. We could either do a restartable loop (but there might be too much state to safely do) or a critical section (though running chacha a bunch of times could have impact).

[1] https://reviews.freebsd.org/D1956

--
John-Mark Gurney				Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
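Editorial sketch: the RFC 6027 excerpt quoted in this message says that counter-mode IVs must never repeat and that senders typically use a counter to guarantee it. The userland C fragment below shows one way such a counter could feed the 16-byte counter block layout defined in RFC 3686 (nonce, IV, block counter). The names sa_ctr_state and sa_next_counter_block are hypothetical and are not taken from the FreeBSD IPsec code; this is an illustration under those assumptions, not the stack's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-SA state: a fixed nonce plus a forward-only counter. */
struct sa_ctr_state {
	uint8_t  nonce[4];	/* per-SA value, fixed for the SA's lifetime */
	uint64_t next_iv;	/* next unused per-packet IV */
};

/*
 * Fill a 16-byte AES-CTR counter block laid out as in RFC 3686:
 * nonce (4 bytes) | IV (8 bytes) | block counter (4 bytes, starts at 1).
 * Returns 0 on success, or -1 when the IV space is exhausted, in which
 * case the SA would have to be rekeyed rather than wrapping around.
 */
static int
sa_next_counter_block(struct sa_ctr_state *sa, uint8_t block[16])
{
	uint64_t iv;
	int i;

	assert(sa != NULL && block != NULL);
	if (sa->next_iv == UINT64_MAX)
		return (-1);
	iv = sa->next_iv++;

	memcpy(block, sa->nonce, 4);
	for (i = 0; i < 8; i++)		/* IV stored big-endian */
		block[4 + i] = (uint8_t)(iv >> (56 - 8 * i));
	block[12] = 0;
	block[13] = 0;
	block[14] = 0;
	block[15] = 1;			/* block counter starts at one */
	return (0);
}
```

The sketch ignores concurrency on purpose. With several CPUs sending on the same SA, the increment of next_iv would need to be atomic, or each CPU would need its own disjoint range of values, which is the same scaling question this thread raises for the random generator.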
_______________________________________________ freebsd-arch@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-arch To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 15:03:22 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7A60893C for ; Mon, 2 Mar 2015 15:03:22 +0000 (UTC) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 667FD311 for ; Mon, 2 Mar 2015 15:03:22 +0000 (UTC) Received: from AlfredMacbookAir.local (hudsonhotel209.h.subnet.rcn.com [207.237.151.136]) by elvis.mu.org (Postfix) with ESMTPSA id BBE71341F90D; Mon, 2 Mar 2015 07:03:19 -0800 (PST) Message-ID: <54F47C98.2080505@freebsd.org> Date: Mon, 02 Mar 2015 10:07:04 -0500 From: Alfred Perlstein Organization: FreeBSD User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Emeric POUPON , John-Mark Gurney Subject: Re: locks and kernel randomness... 
References: <20150224012026.GY46794@funkthat.com> <1824482166.23183751.1425303073196.JavaMail.zimbra@stormshield.eu> In-Reply-To: <1824482166.23183751.1425303073196.JavaMail.zimbra@stormshield.eu> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 15:03:22 -0000 On 3/2/15 8:31 AM, Emeric POUPON wrote: > Hello, > > About arc4random, we have noticed significant contention in that function on multi CPU systems when ciphering a lot of packets in the IPsec stack. > This is indeed due to the mutex that is being used in the arc4rand function. > > Actually randomness is required by the IV used in the forged output packets. > However, making a separate random generator per CPU might be more complicated than expected. > The RFC 6027 (http://www.ietf.org/rfc/rfc6027.txt) reminds that the IV must not be repeated : > --- > 3.7.1. Outbound SAs Using Counter Modes > > For SAs involving counter mode ciphers such as Counter Mode (CTR) > ([RFC3686]) or Galois/Counter Mode (GCM) ([RFC4106]) there is yet > another complication. The initial vector for such modes MUST NOT be > repeated, and senders use methods such as counters or linear feedback > shift registers (LFSRs) to ensure this [...] > --- > > What do you think? If you can not have multiple random sources then what do you think of having a thread that pre-fetches batches of random values and queues it to each cpu? If you have the queue be pretty large then you shouldn't bottleneck on it. Sort of like UMA for random data. Sorry if this is a daft idea, not sure about this code path in general, but this struck me as a potential workaround. 
-Alfred From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 17:04:57 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 183827D2 for ; Mon, 2 Mar 2015 17:04:57 +0000 (UTC) Received: from gold.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "gold.funkthat.com", Issuer "gold.funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id E7AC82F5 for ; Mon, 2 Mar 2015 17:04:56 +0000 (UTC) Received: from gold.funkthat.com (localhost [127.0.0.1]) by gold.funkthat.com (8.14.5/8.14.5) with ESMTP id t22H4n5g049843 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 2 Mar 2015 09:04:49 -0800 (PST) (envelope-from jmg@gold.funkthat.com) Received: (from jmg@localhost) by gold.funkthat.com (8.14.5/8.14.5/Submit) id t22H4m5f049842; Mon, 2 Mar 2015 09:04:48 -0800 (PST) (envelope-from jmg) Date: Mon, 2 Mar 2015 09:04:48 -0800 From: John-Mark Gurney To: Emeric POUPON Subject: Re: locks and kernel randomness... Message-ID: <20150302170448.GN32329@funkthat.com> References: <20150224012026.GY46794@funkthat.com> <1824482166.23183751.1425303073196.JavaMail.zimbra@stormshield.eu> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1824482166.23183751.1425303073196.JavaMail.zimbra@stormshield.eu> X-Operating-System: FreeBSD 9.1-PRERELEASE amd64 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? 
User-Agent: Mutt/1.5.21 (2010-09-15) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (gold.funkthat.com [127.0.0.1]); Mon, 02 Mar 2015 09:04:49 -0800 (PST) Cc: arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 17:04:57 -0000 Emeric POUPON wrote this message on Mon, Mar 02, 2015 at 14:31 +0100: > About arc4random, we have noticed significant contention in that function on multi CPU systems when ciphering a lot of packets in the IPsec stack. > This is indeed due to the mutex that is being used in the arc4rand function. Not just that there is a mutex protecting arc4random, but in my tests, rc4 is slower than chacha, and I believe that this is due to the requirement that every byte generated requires multiple reads and multiple writes.. > Actually randomness is required by the IV used in the forged output packets. > However, making a separate random generator per CPU might be more complicated than expected. How so? > The RFC 6027 (http://www.ietf.org/rfc/rfc6027.txt) reminds that the IV must not be repeated : > --- > 3.7.1. Outbound SAs Using Counter Modes > > For SAs involving counter mode ciphers such as Counter Mode (CTR) > ([RFC3686]) or Galois/Counter Mode (GCM) ([RFC4106]) there is yet > another complication. The initial vector for such modes MUST NOT be > repeated, and senders use methods such as counters or linear feedback > shift registers (LFSRs) to ensure this [...] > --- > > What do you think? I don't see how PCPU rng's could cause a problem... We aren't going to seed the PCPU rng's w/ the same seeding material unless the parent rng (yarrow or fortuna) manages to return the same seed twice in a row, and if that happens, well, you just won the lottery a few million times in the same day.. 
:)

> ----- Original message -----
> From: "John-Mark Gurney"
> To: arch@FreeBSD.org
> Sent: Tuesday, 24 February 2015 02:20:26
> Subject: locks and kernel randomness...
>
> I'm working on simplifying kernel randomness interfaces. I would like
> to get rid of all weak random generators, and this means replacing
> read_random and random(9) w/ effectively arc4rand(9) (to be replaced
> by ChaCha or Keccak in the future).
>
> The issue is that random(9) is called from any number of contexts, such
> as the scheduler. This makes locking a bit more interesting. Currently,
> both arc4rand(9) and yarrow/fortuna use a default mtx lock to protect
> their state. This obviously isn't compatible w/ the scheduler, and
> possibly other calling contexts.
>
> I have a patch[1] that unifies the random interface. It converts a few
> of the locks from mtx default to mtx spin to deal w/ this.
>
> If/when this is accepted, my next plan is to convert away from arc4rand,
> to either ChaCha or Keccak. I already have another patch that converts
> arc4rand and friends over to ChaCha. This patch does use PCPU data
> and sched_pin to help eliminate locks, but this does need more study.
> We could either do a restartable loop (but there might be too much state
> to safely do) or a critical section (though running chacha a bunch of
> times could have impact).
>
> [1] https://reviews.freebsd.org/D1956

--
John-Mark Gurney				Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
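Editorial sketch: the reply above argues that per-CPU generators are safe as long as each one is seeded with different material from the parent generator. The userland C model below illustrates only that structure. Everything in it is an assumption made for illustration: NCPU, parent_state, pcpu_state and the function names are hypothetical, and the mixing step is splitmix64, a non-cryptographic stand-in for whatever the kernel would really use (the thread mentions ChaCha seeded from yarrow or fortuna).

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 4	/* hypothetical CPU count for the sketch */

/* splitmix64 step: a stand-in mixer only, NOT a cryptographic generator. */
static uint64_t
mix_next(uint64_t *state)
{
	uint64_t z;

	z = (*state += 0x9e3779b97f4a7c15ULL);
	z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
	z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
	return (z ^ (z >> 31));
}

static uint64_t parent_state;		/* stands in for yarrow/fortuna */
static uint64_t pcpu_state[NCPU];	/* one private state per CPU */

/* Seed every per-CPU generator with its own draw from the parent. */
static void
pcpu_rng_init(uint64_t parent_seed)
{
	int cpu;

	parent_state = parent_seed;
	for (cpu = 0; cpu < NCPU; cpu++)
		pcpu_state[cpu] = mix_next(&parent_state);
}

/*
 * Draw from the generator owned by 'cpu'. Only that slot is touched, so
 * no shared lock is needed, provided the caller cannot migrate or be
 * preempted by another user of the same slot during the call.
 */
static uint64_t
pcpu_rng_next(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	return (mix_next(&pcpu_state[cpu]));
}
```

The proviso in the last comment is the part the original message says needs more study: in the kernel it would have to be enforced with something like sched_pin, a critical section, or a restartable loop, none of which this userland model attempts to reproduce.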
From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 17:10:27 2015 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AE82D97D; Mon, 2 Mar 2015 17:10:27 +0000 (UTC) Received: from gold.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "gold.funkthat.com", Issuer "gold.funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 16E8734D; Mon, 2 Mar 2015 17:10:27 +0000 (UTC) Received: from gold.funkthat.com (localhost [127.0.0.1]) by gold.funkthat.com (8.14.5/8.14.5) with ESMTP id t22HAQSW049882 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 2 Mar 2015 09:10:26 -0800 (PST) (envelope-from jmg@gold.funkthat.com) Received: (from jmg@localhost) by gold.funkthat.com (8.14.5/8.14.5/Submit) id t22HAQ3k049881; Mon, 2 Mar 2015 09:10:26 -0800 (PST) (envelope-from jmg) Date: Mon, 2 Mar 2015 09:10:26 -0800 From: John-Mark Gurney To: Alfred Perlstein Subject: Re: locks and kernel randomness... Message-ID: <20150302171025.GO32329@funkthat.com> References: <20150224012026.GY46794@funkthat.com> <1824482166.23183751.1425303073196.JavaMail.zimbra@stormshield.eu> <54F47C98.2080505@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <54F47C98.2080505@freebsd.org> X-Operating-System: FreeBSD 9.1-PRERELEASE amd64 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? 
User-Agent: Mutt/1.5.21 (2010-09-15) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (gold.funkthat.com [127.0.0.1]); Mon, 02 Mar 2015 09:10:26 -0800 (PST) Cc: Emeric POUPON , arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 17:10:27 -0000 Alfred Perlstein wrote this message on Mon, Mar 02, 2015 at 10:07 -0500: > On 3/2/15 8:31 AM, Emeric POUPON wrote: > > About arc4random, we have noticed significant contention in that function on multi CPU systems when ciphering a lot of packets in the IPsec stack. > > This is indeed due to the mutex that is being used in the arc4rand function. > > > > Actually randomness is required by the IV used in the forged output packets. > > However, making a separate random generator per CPU might be more complicated than expected. > > The RFC 6027 (http://www.ietf.org/rfc/rfc6027.txt) reminds that the IV must not be repeated : > > --- > > 3.7.1. Outbound SAs Using Counter Modes > > > > For SAs involving counter mode ciphers such as Counter Mode (CTR) > > ([RFC3686]) or Galois/Counter Mode (GCM) ([RFC4106]) there is yet > > another complication. The initial vector for such modes MUST NOT be > > repeated, and senders use methods such as counters or linear feedback > > shift registers (LFSRs) to ensure this [...] > > --- > > > > What do you think? > > If you can not have multiple random sources then what do you think of > having a thread that pre-fetches batches of random values and queues it > to each cpu? If you have the queue be pretty large then you shouldn't > bottleneck on it. > > Sort of like UMA for random data. > > Sorry if this is a daft idea, not sure about this code path in general, > but this struck me as a potential workaround. I'd say that's needlessly complex... 
You'd still need a lock, or play w/ the scheduler (sched_bind) to serialize access to the PCPU random pool... -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 17:19:47 2015 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A0778BD5; Mon, 2 Mar 2015 17:19:47 +0000 (UTC) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 0A9CF63B; Mon, 2 Mar 2015 17:19:47 +0000 (UTC) Received: from [100.94.233.168] (36.sub-70-208-89.myvzw.com [70.208.89.36]) by elvis.mu.org (Postfix) with ESMTPSA id BD89A341F910; Mon, 2 Mar 2015 09:19:45 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (1.0) Subject: Re: locks and kernel randomness... From: Alfred Perlstein X-Mailer: iPhone Mail (12B440) In-Reply-To: <20150302171025.GO32329@funkthat.com> Date: Mon, 2 Mar 2015 12:19:43 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: <1D168865-DF1B-4A08-BB42-FB26B4D88D6E@mu.org> References: <20150224012026.GY46794@funkthat.com> <1824482166.23183751.1425303073196.JavaMail.zimbra@stormshield.eu> <54F47C98.2080505@freebsd.org> <20150302171025.GO32329@funkthat.com> To: John-Mark Gurney Cc: Emeric POUPON , "arch@freebsd.org" , Alfred Perlstein X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 17:19:47 -0000 > On Mar 2, 2015, at 12:10 PM, John-Mark Gurney wrote: > > Alfred Perlstein wrote this message on Mon, Mar 02, 2015 at 10:07 -0500: >>> On 3/2/15 8:31 AM, Emeric POUPON wrote: >>> About arc4random, we have noticed significant
contention in that function on multi CPU systems when ciphering a lot of packets in the IPsec stack.
>>> This is indeed due to the mutex that is being used in the arc4rand function.
>>>
>>> Actually randomness is required by the IV used in the forged output packets.
>>> However, making a separate random generator per CPU might be more complicated than expected.
>>> The RFC 6027 (http://www.ietf.org/rfc/rfc6027.txt) reminds that the IV must not be repeated :
>>> ---
>>> 3.7.1. Outbound SAs Using Counter Modes
>>>
>>> For SAs involving counter mode ciphers such as Counter Mode (CTR)
>>> ([RFC3686]) or Galois/Counter Mode (GCM) ([RFC4106]) there is yet
>>> another complication. The initial vector for such modes MUST NOT be
>>> repeated, and senders use methods such as counters or linear feedback
>>> shift registers (LFSRs) to ensure this [...]
>>> ---
>>>
>>> What do you think?
>>
>> If you can not have multiple random sources then what do you think of
>> having a thread that pre-fetches batches of random values and queues it
>> to each cpu? If you have the queue be pretty large then you shouldn't
>> bottleneck on it.
>>
>> Sort of like UMA for random data.
>>
>> Sorry if this is a daft idea, not sure about this code path in general,
>> but this struck me as a potential workaround.
>
> I'd say that's needlessly complex... You'd still need a lock, or play
> w/ the scheduler (sched_bind) to serialize access to the PCPU random
> pool...
>

John, that is how you break down a lock. Do you have a lock free or per CPU solution?

Using a strategy like this is very typical when trying to scale across CPU.

Do you have alternate idea?

> --
> John-Mark Gurney Voice: +1 415 225 5579
>
> "All that I will do, has been done, All that I have, has not."
From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 17:41:17 2015 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3F7EC9D4; Mon, 2 Mar 2015 17:41:17 +0000 (UTC) Received: from gold.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "gold.funkthat.com", Issuer "gold.funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id B8F8B930; Mon, 2 Mar 2015 17:41:16 +0000 (UTC) Received: from gold.funkthat.com (localhost [127.0.0.1]) by gold.funkthat.com (8.14.5/8.14.5) with ESMTP id t22HfGCB050191 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 2 Mar 2015 09:41:16 -0800 (PST) (envelope-from jmg@gold.funkthat.com) Received: (from jmg@localhost) by gold.funkthat.com (8.14.5/8.14.5/Submit) id t22HfFPZ050190; Mon, 2 Mar 2015 09:41:15 -0800 (PST) (envelope-from jmg) Date: Mon, 2 Mar 2015 09:41:15 -0800 From: John-Mark Gurney To: Alfred Perlstein Subject: Re: locks and kernel randomness...
Message-ID: <20150302174115.GR32329@funkthat.com> References: <20150224012026.GY46794@funkthat.com> <1824482166.23183751.1425303073196.JavaMail.zimbra@stormshield.eu> <54F47C98.2080505@freebsd.org> <20150302171025.GO32329@funkthat.com> <1D168865-DF1B-4A08-BB42-FB26B4D88D6E@mu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1D168865-DF1B-4A08-BB42-FB26B4D88D6E@mu.org> X-Operating-System: FreeBSD 9.1-PRERELEASE amd64 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? User-Agent: Mutt/1.5.21 (2010-09-15) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (gold.funkthat.com [127.0.0.1]); Mon, 02 Mar 2015 09:41:16 -0800 (PST) Cc: Emeric POUPON , "arch@freebsd.org" , Alfred Perlstein X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 17:41:17 -0000 Alfred Perlstein wrote this message on Mon, Mar 02, 2015 at 12:19 -0500: > > On Mar 2, 2015, at 12:10 PM, John-Mark Gurney wrote: > > > > Alfred Perlstein wrote this message on Mon, Mar 02, 2015 at 10:07 -0500: > >>> On 3/2/15 8:31 AM, Emeric POUPON wrote: > >>> About arc4random, we have noticed significant contention in that function on multi CPU systems when ciphering a lot of packets in the IPsec stack. > >>> This is indeed due to the mutex that is being used in the arc4rand function. > >>> > >>> Actually randomness is required by the IV used in the forged output packets. > >>> However, making a separate random generator per CPU might be more complicated than expected. 
> >>> The RFC 6027 (http://www.ietf.org/rfc/rfc6027.txt) reminds that the IV must not be repeated :
> >>> ---
> >>> 3.7.1. Outbound SAs Using Counter Modes
> >>>
> >>> For SAs involving counter mode ciphers such as Counter Mode (CTR)
> >>> ([RFC3686]) or Galois/Counter Mode (GCM) ([RFC4106]) there is yet
> >>> another complication. The initial vector for such modes MUST NOT be
> >>> repeated, and senders use methods such as counters or linear feedback
> >>> shift registers (LFSRs) to ensure this [...]
> >>> ---
> >>>
> >>> What do you think?
> >>
> >> If you can not have multiple random sources then what do you think of
> >> having a thread that pre-fetches batches of random values and queues it
> >> to each cpu? If you have the queue be pretty large then you shouldn't
> >> bottleneck on it.
> >>
> >> Sort of like UMA for random data.
> >>
> >> Sorry if this is a daft idea, not sure about this code path in general,
> >> but this struck me as a potential workaround.
> >
> > I'd say that's needlessly complex... You'd still need a lock, or play
> > w/ the scheduler (sched_bind) to serialize access to the PCPU random
> > pool...
>
> John, that is how you break down a lock. Do you have a lock free or per CPU solution?
>
> Using a strategy like this is very typical when trying to scale across CPU.
>
> Do you have alternate idea?

Did people not at ALL read the original email?

I said in the original email:
I already have another patch that converts arc4rand and friends over to ChaCha. This patch does use PCPU data and sched_pin to help eliminate locks, but this does need more study.

Please, read the original email, and then ask questions.. If you jump into the middle of the thread, it's your responsibility to reread the thread...

Does my original email answer your questions? If not, feel free to ask additional questions...

Thanks.

--
John-Mark Gurney				Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 20:42:43 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C7DED864; Mon, 2 Mar 2015 20:42:43 +0000 (UTC) Received: from mail-ie0-x22f.google.com (mail-ie0-x22f.google.com [IPv6:2607:f8b0:4001:c03::22f]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8F96C279; Mon, 2 Mar 2015 20:42:43 +0000 (UTC) Received: by iecrp18 with SMTP id rp18so51433623iec.1; Mon, 02 Mar 2015 12:42:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:content-type; bh=yI6xxNUgyElBECOfAJQZ4NxK5e7ECW+++wODeQwcnBg=; b=PqXbZWka+owY4MXLBkDTake90Y2xhJlkzNS05Jj9V1VsrMblnJ62Z+RUc/ZN/D2sEK wA1CfQ+B9nEhU4QfOWYxb3QgIpR+t/57ZY4/NHenevrExF6tr7JfkDNWXgxcJc32NMH4 Gj+rLFIoxdAUAqwgvgHCS6QPDtlg6ZzMEErMGRQHHfsP/Vf6J5Ex0xaLp9itXUuZjsST FPcLycTgXxumgajJr2yceZU6/YRAu8OIFfK1w792v8wQ8qXpMf2mxoFz3NsQuuaqN7pR KoJqXXIja8wXGrCfuz3qCMXlW6kXYjxfwNKr1Eyj/Gd3jN9ZeVg3yk/CwrBoKb7LXSNp Kk0w== MIME-Version: 1.0 X-Received: by 10.42.130.74 with SMTP id u10mr32332176ics.61.1425328962962; Mon, 02 Mar 2015 12:42:42 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.36.17.66 with HTTP; Mon, 2 Mar 2015 12:42:42 -0800 (PST) Date: Mon, 2 Mar 2015 12:42:42 -0800 X-Google-Sender-Auth: MV4jVvJ7JmxzeQyAQ3XtMFfPIXk Message-ID: Subject: Doing zero-copy stuff in drivers, or "is vm_fault_quick_hold_pages() enough" ? 
From: Adrian Chadd To: "freebsd-arch@freebsd.org" , freebsd-current Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 20:42:43 -0000 Hi, gonzo@ committed a fix (r278615) to the videocore driver for the raspberry pi. The fix involved doing an explicit wire of pages that were about to be passed down to the hardware to send to the videobuffer hardware. It turns out that doing vm_fault_quick_hold_pages() wasn't enough - the pages weren't being wired, and they may disappear before the hardware gets to them. I looked at vmapbuf() and how it uses vm_fault_quick_hold_pages(), but I can't find anything that wires the pages down before it hands the addresses to the hardware. So, am I missing something about how/where that's done? Thanks, -adrian From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 21:31:49 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 025AE900 for ; Mon, 2 Mar 2015 21:31:49 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CEA5AA83 for ; Mon, 2 Mar 2015 21:31:48 +0000 (UTC) Received: from ralph.baldwin.cx (pool-173-54-116-245.nwrknj.fios.verizon.net [173.54.116.245]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id BE591B9B0; Mon, 2 Mar 2015 16:31:47 -0500 (EST) From: John Baldwin To: freebsd-arch@freebsd.org Subject: Re: Minor ULE changes and optimizations Date: Mon, 02 Mar 2015 13:53:13 -0500 Message-ID: 
<5490895.NN1ciTh6gZ@ralph.baldwin.cx> User-Agent: KMail/4.14.2 (FreeBSD/10.1-STABLE; KDE/4.14.2; amd64; ; ) In-Reply-To: <54F1E25F.5040905@astrodoggroup.com> References: <54EF2C54.7030207@astrodoggroup.com> <1547642.s3cC06khRt@ralph.baldwin.cx> <54F1E25F.5040905@astrodoggroup.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 02 Mar 2015 16:31:47 -0500 (EST) Cc: Harrison Grundy X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 21:31:49 -0000 On Saturday, February 28, 2015 07:44:31 AM Harrison Grundy wrote: > On 02/28/15 04:24, John Baldwin wrote: > > On Friday, February 27, 2015 07:50:55 AM Harrison Grundy wrote: > >> On 02/27/15 06:14, John Baldwin wrote: > >>> On Thursday, February 26, 2015 06:23:16 AM Harrison Grundy > >>> > >>> wrote: > >>>> https://reviews.freebsd.org/D1969 This allows a > >>>> non-migratable thread to pin itself to a CPU if it is already > >>>> running on that CPU. > >>>> > >>>> I've been running these patches for the past week or so > >>>> without issue. Any additional testing or comments would be > >>>> greatly appreciated. > >>> > >>> Can you explain the reason / use case for this? This seems to > >>> be allowing an API violation. sched_pin() was designed to be > >>> a lower-level API than sched_bind(), so you wouldn't call > >>> sched_bind() if you were already pinned. In addition, > >>> sched_pin() is sometimes used by code that assumes it won't > >>> migrate until sched_unpin() (e.g. temporary mappings inside an > >>> sfbuf). 
If you allow sched_bind() to move a thread that is
> >>> pinned you will allow someone to unintentionally break those
> >>> sort of things instead of getting an assertion failure panic.
> >>
> >> For a pinned thread, the underlying idea is that if you're
> >> already on the CPU you pinned to, calling sched_bind with that
> >> CPU specified allows you to set TSF_BOUND without calling
> >> sched_unpin first.
> >>
> >> If a pinned thread were to call sched_bind for a CPU it isn't
> >> pinned to, it would still hit the assert and fail.
> >>
> >> For any unpinned thread, if you're already running on the correct
> >> CPU, you can skip the THREAD_CAN_MIGRATE check and the call to
> >> mi_switch.
> >
> > Ah, ok, so you aren't allowing migration in theory. However, I'm
> > still curious as to why you want/need this. This makes the API
> > usage a bit more complex to reason about (sched_bind() can
> > sometimes be called while pinned but not always after this change),
> > so I think that extra complexity needs a reason to exist.
>
> Primarily, it allows those threads already on a CPU to skip the call
> to mi_switch and get out of sched_bind a bit faster.

sched_bind() already does this. Internally it skips the call to mi_switch()
if the thread is already on the correct CPU:

void
sched_bind(struct thread *td, int cpu)
{
        ...
        ts->ts_flags |= TSF_BOUND;
        sched_pin();
        if (PCPU_GET(cpuid) == cpu)
                return;
        ...
}

Calling sched_pin() before sched_bind() isn't going to really change that.
Once you do thread_lock(td) your thread is effectively pinned until you do a
thread_unlock() since the spin lock blocks preemption (and thus migration as
well), so in a sequence of:

        thread_lock(td);
        sched_bind(td, cpu);

The thread is effectively pinned once thread_lock() returns and will not need
to use mi_switch() if it is already on the correct CPU.
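The fast path described above can be mimicked in user space. What follows is only a toy sketch, not the scheduler's code: struct toy_thread, the toy_ functions, TOY_TSF_BOUND and the switches counter are all hypothetical names made up for illustration, and the elided parts of the real sched_bind() are simply left out. It shows the flag being set, the pin being taken, and the context switch being skipped when the thread already runs on the requested CPU.

```c
#include <assert.h>

#define TOY_TSF_BOUND 0x01      /* hypothetical stand-in for TSF_BOUND */

/* Toy model of a thread; NOT the kernel's struct thread. */
struct toy_thread {
        int cur_cpu;            /* CPU the thread is currently running on */
        int pin_count;          /* pins nest, so this is a counter */
        int flags;              /* holds TOY_TSF_BOUND while bound */
        int switches;           /* simulated mi_switch() calls */
};

void
toy_sched_pin(struct toy_thread *td)
{
        td->pin_count++;
}

void
toy_sched_unpin(struct toy_thread *td)
{
        assert(td->pin_count > 0);
        td->pin_count--;
}

/* Mirrors the order of operations in the excerpt quoted above. */
void
toy_sched_bind(struct toy_thread *td, int cpu)
{
        td->flags |= TOY_TSF_BOUND;
        toy_sched_pin(td);
        if (td->cur_cpu == cpu)
                return;         /* already there: no switch needed */
        td->cur_cpu = cpu;      /* pretend we migrated */
        td->switches++;         /* pretend mi_switch() ran */
}

void
toy_sched_unbind(struct toy_thread *td)
{
        assert(td->flags & TOY_TSF_BOUND);
        td->flags &= ~TOY_TSF_BOUND;
        toy_sched_unpin(td);
}
```

In this model, a thread that starts on CPU 2 and calls toy_sched_bind(&td, 2) ends up bound and pinned with switches still at 0; only a bind to a different CPU increments the counter.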
> Additionally, it allows a driver to call sched_pin, then bind to that > same cpu later without having to write something like > "critical_enter(); sched_unpin(); sched_bind(foo, bar); > critical_exit();", since otherwise it could be migrated/preempted > between unpin and bind. But why would a driver want to do that? This code: sched_pin(td); /* do something */ thread_lock(td); sched_unpin(td); sched_bind(td, PCPU_GET(cpuid)); thread_unlock(td); /* do something else */ thread_lock(td); sched_unbind(td); thread_unlock(td); Is equivalent to: sched_pin(td); /* do something */ /* do something else */ sched_unpin(td); But the latter form is lighter weight and easier to read / understand. Letting you sched_bind() to the current CPU while you are pinned doesn't enable any new functionality than you can already achieve by just using sched_pin() and sched_unpin(). -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 22:06:27 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 622D2ED8 for ; Mon, 2 Mar 2015 22:06:27 +0000 (UTC) Received: from mailout.easymail.ca (mailout.easymail.ca [64.68.201.169]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id F1D42EA5 for ; Mon, 2 Mar 2015 22:06:26 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mailout.easymail.ca (Postfix) with ESMTP id 96CDEE3F1 for ; Mon, 2 Mar 2015 17:06:24 -0500 (EST) X-Virus-Scanned: Debian amavisd-new at mailout.easymail.ca X-Spam-Flag: NO X-Spam-Score: -3.849 X-Spam-Level: X-Spam-Status: No, score=-3.849 required=5 tests=[ALL_TRUSTED=-1.8, AWL=-0.142, BAYES_00=-2.599, DNS_FROM_AHBL_RHSBL=0.692] Received: from mailout.easymail.ca ([127.0.0.1]) by localhost 
(easymail-mailout.easydns.vpn [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1EQlRRS4806j for ; Mon, 2 Mar 2015 17:06:24 -0500 (EST) Received: from bsddt1241.lv01.astrodoggroup.com (unknown [40.141.24.126]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mailout.easymail.ca (Postfix) with ESMTPSA id 331D7E348 for ; Mon, 2 Mar 2015 17:06:24 -0500 (EST) Message-ID: <54F4DE0D.7070606@astrodoggroup.com> Date: Mon, 02 Mar 2015 14:02:53 -0800 From: Harrison Grundy User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: freebsd-arch@freebsd.org Subject: Re: Minor ULE changes and optimizations References: <54EF2C54.7030207@astrodoggroup.com> <1547642.s3cC06khRt@ralph.baldwin.cx> <54F1E25F.5040905@astrodoggroup.com> <5490895.NN1ciTh6gZ@ralph.baldwin.cx> In-Reply-To: <5490895.NN1ciTh6gZ@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 22:06:27 -0000 On 03/02/15 10:53, John Baldwin wrote: > On Saturday, February 28, 2015 07:44:31 AM Harrison Grundy wrote: >> On 02/28/15 04:24, John Baldwin wrote: >>> On Friday, February 27, 2015 07:50:55 AM Harrison Grundy wrote: >>>> On 02/27/15 06:14, John Baldwin wrote: >>>>> On Thursday, February 26, 2015 06:23:16 AM Harrison Grundy >>>>> >>>>> wrote: >>>>>> https://reviews.freebsd.org/D1969 This allows a >>>>>> non-migratable thread to pin itself to a CPU if it is already >>>>>> running on that CPU. >>>>>> >>>>>> I've been running these patches for the past week or so >>>>>> without issue. Any additional testing or comments would be >>>>>> greatly appreciated. >>>>> >>>>> Can you explain the reason / use case for this? 
This seems to
>>>>> be allowing an API violation. sched_pin() was designed to be
>>>>> a lower-level API than sched_bind(), so you wouldn't call
>>>>> sched_bind() if you were already pinned. In addition,
>>>>> sched_pin() is sometimes used by code that assumes it won't
>>>>> migrate until sched_unpin() (e.g. temporary mappings inside an
>>>>> sfbuf). If you allow sched_bind() to move a thread that is
>>>>> pinned you will allow someone to unintentionally break those
>>>>> sort of things instead of getting an assertion failure panic.
>>>>
>>>> For a pinned thread, the underlying idea is that if you're
>>>> already on the CPU you pinned to, calling sched_bind with that
>>>> CPU specified allows you to set TSF_BOUND without calling
>>>> sched_unpin first.
>>>>
>>>> If a pinned thread were to call sched_bind for a CPU it isn't
>>>> pinned to, it would still hit the assert and fail.
>>>>
>>>> For any unpinned thread, if you're already running on the correct
>>>> CPU, you can skip the THREAD_CAN_MIGRATE check and the call to
>>>> mi_switch.
>>>
>>> Ah, ok, so you aren't allowing migration in theory. However, I'm
>>> still curious as to why you want/need this. This makes the API
>>> usage a bit more complex to reason about (sched_bind() can
>>> sometimes be called while pinned but not always after this change),
>>> so I think that extra complexity needs a reason to exist.
>>
>> Primarily, it allows those threads already on a CPU to skip the call
>> to mi_switch and get out of sched_bind a bit faster.
>
> sched_bind() already does this. Internally it skips the call to mi_switch()
> if the thread is already on the correct CPU:
>
> void
> sched_bind(struct thread *td, int cpu)
> {
>         ...
>         ts->ts_flags |= TSF_BOUND;
>         sched_pin();
>         if (PCPU_GET(cpuid) == cpu)
>                 return;
>         ...
> }
>
> Calling sched_pin() before sched_bind() isn't going to really change that.
> Once you do thread_lock(td) your thread is effectively pinned until you do a
> thread_unlock() since the spin lock blocks preemption (and thus migration as
> well), so in a sequence of:
>
>         thread_lock(td);
>         sched_bind(td, cpu);
>
> The thread is effectively pinned once thread_lock() returns and will not need
> to use mi_switch() if it is already on the correct CPU.
>
>> Additionally, it allows a driver to call sched_pin, then bind to that
>> same cpu later without having to write something like
>> "critical_enter(); sched_unpin(); sched_bind(foo, bar);
>> critical_exit();", since otherwise it could be migrated/preempted
>> between unpin and bind.
>
> But why would a driver want to do that? This code:
>
>         sched_pin(td);
>
>         /* do something */
>
>         thread_lock(td);
>         sched_unpin(td);
>         sched_bind(td, PCPU_GET(cpuid));
>         thread_unlock(td);
>
>         /* do something else */
>
>         thread_lock(td);
>         sched_unbind(td);
>         thread_unlock(td);
>
> Is equivalent to:
>
>         sched_pin(td);
>
>         /* do something */
>
>         /* do something else */
>
>         sched_unpin(td);
>
> But the latter form is lighter weight and easier to read / understand.
>
> Letting you sched_bind() to the current CPU while you are pinned doesn't
> enable any new functionality than you can already achieve by just using
> sched_pin() and sched_unpin().
>

The difference between the two is that TSF_BOUND is set for "do
something else" in the former case.

As I understand the difference, sched_pin is designed for temporarily
assigning to a CPU, while sched_bind is intended for longer-term affinity.

The patch would allow you to set the bound flag without unpinning,
basically. It seems easier to do this here, than add a "set bound flag"
function that allows drivers to "promote" themselves from pinned to
bound, though that would also be an option to get to the same place.
--- Harrison From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 23:06:00 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 61522CFB; Mon, 2 Mar 2015 23:06:00 +0000 (UTC) Received: from mail-yh0-x230.google.com (mail-yh0-x230.google.com [IPv6:2607:f8b0:4002:c01::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1C0436DA; Mon, 2 Mar 2015 23:06:00 +0000 (UTC) Received: by yhoa41 with SMTP id a41so16372674yho.9; Mon, 02 Mar 2015 15:05:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=I2W/TD3ky1JofeJdLHe2L3621FCeCd8zVSnl7JqeqQ0=; b=A7VF9zwIdYK2zI+9Pszrfr0F/3fxfVNuOsl5u+7byaDagcFyiFe3rJjvFJXBWn7Pbt vQZ+DheDyf5gX7Fu+jvU6AvTk4a6XUuX2yPg2sszuOKZs4tekRsutZ/JXVQEW/z8+p5y tO4sch7uy2CBtlZZSsxVgpOw0PYsdwTZ9YymnGyWp6RouKyq1OJJPBvhoweF1Xw/7tl2 S/2D6hW+tMNT9awJgUGGt2+wDAfPvF75OiP/8VZcmtgHtVCrSOEyClzd4yiSKNW/mFS3 4UmNn9zf3IenJtcttxTGjZY+4Cg3jxBVnNurVNAn0EW9pGnufZroVjn632H8efeduI1o V8Yw== MIME-Version: 1.0 X-Received: by 10.170.185.71 with SMTP id b68mr13529604yke.25.1425337559164; Mon, 02 Mar 2015 15:05:59 -0800 (PST) Sender: kmacybsd@gmail.com Received: by 10.170.76.66 with HTTP; Mon, 2 Mar 2015 15:05:59 -0800 (PST) In-Reply-To: References: Date: Mon, 2 Mar 2015 15:05:59 -0800 X-Google-Sender-Auth: 1k4rEY3CGKDdAhTV-jSklRrpPyY Message-ID: Subject: Re: Doing zero-copy stuff in drivers, or "is vm_fault_quick_hold_pages() enough" ? From: "K. 
Macy" To: Adrian Chadd Content-Type: text/plain; charset=UTF-8 Cc: freebsd-current , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 23:06:00 -0000 On Mon, Mar 2, 2015 at 12:42 PM, Adrian Chadd wrote: > Hi, > > gonzo@ committed a fix (r278615) to the videocore driver for the > raspberry pi. The fix involved doing an explicit wire of pages that > were about to be passed down to the hardware to send to the > videobuffer hardware. > > It turns out that doing vm_fault_quick_hold_pages() wasn't enough - > the pages weren't being wired, and they may disappear before the > hardware gets to them. Then your code is buggy or you've hit a bug in the VM. Holding a page is a short-term wiring. Right above vm_page_hold(): /* * Keep page from being freed by the page daemon * much of the same effect as wiring, except much lower * overhead and should be used only for *very* temporary * holding ("wiring"). */ > I looked at vmapbuf() and how it uses vm_fault_quick_hold_pages(), but > I can't find anything that wires the pages down before it hands the > addresses to the hardware. > > So, am I missing something about how/where that's done? Yes. 
> Thanks, > > > -adrian > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Mon Mar 2 23:49:57 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E9BF8E6E; Mon, 2 Mar 2015 23:49:57 +0000 (UTC) Received: from mail-ie0-x233.google.com (mail-ie0-x233.google.com [IPv6:2607:f8b0:4001:c03::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id AE18BBA3; Mon, 2 Mar 2015 23:49:57 +0000 (UTC) Received: by ierx19 with SMTP id x19so52709628ier.3; Mon, 02 Mar 2015 15:49:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=vHsCoYnt8fm4lByVHNuUSFBJeE7qzH2FEL3+Z6dRCNU=; b=JXRanK2TPV4Yf0OkI+hm+7+zkWvXeINfOUhk+9hQNIQDgnsEoZAbTGnce0FpzFF6O8 YUdrSwVPuqyUe7v0ri0sg1v7f9Pe8j17w1eTRasKv95REoZgQqXQt6v22hhvy7oFhiaq tnYqFmXOwsKtaqbHXfWI3+eMjpB/LXtRF3HOWlNor8HLvHplzn0lJzwn7QerriECXLd+ EyA0s/FZLi5NLzl48uPFAmFLvaoKdTsmq7KVqFyF7e7++egHxfTjljzsa14oZcLc/f8S MAeNQr0wq6pnNoku0fXQQ10ThXAHBfNE4QeIJqaUWClirwyF3VUGZ9XQUIy65MhADquq imDw== MIME-Version: 1.0 X-Received: by 10.107.155.13 with SMTP id d13mr39898709ioe.29.1425340196978; Mon, 02 Mar 2015 15:49:56 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.36.17.66 with HTTP; Mon, 2 Mar 2015 15:49:56 -0800 (PST) In-Reply-To: References: Date: Mon, 2 Mar 2015 15:49:56 -0800 X-Google-Sender-Auth: GszTLDc9d24aTj_AC-IR8XmeAWg Message-ID: Subject: Re: 
Doing zero-copy stuff in drivers, or "is vm_fault_quick_hold_pages() enough" ? From: Adrian Chadd To: "K. Macy" Content-Type: text/plain; charset=UTF-8 Cc: freebsd-current , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Mar 2015 23:49:58 -0000 On 2 March 2015 at 15:05, K. Macy wrote: > On Mon, Mar 2, 2015 at 12:42 PM, Adrian Chadd wrote: >> Hi, >> >> gonzo@ committed a fix (r278615) to the videocore driver for the >> raspberry pi. The fix involved doing an explicit wire of pages that >> were about to be passed down to the hardware to send to the >> videobuffer hardware. >> >> It turns out that doing vm_fault_quick_hold_pages() wasn't enough - >> the pages weren't being wired, and they may disappear before the >> hardware gets to them. > > > Then your code is buggy or you've hit a bug in the VM. Holding a page > is a short-term wiring. Well, you can look at what's going on in the vchiq code. gonzo made it an explicit wire in order to avoid issues. > Right above vm_page_hold(): > /* > * Keep page from being freed by the page daemon > * much of the same effect as wiring, except much lower > * overhead and should be used only for *very* temporary > * holding ("wiring"). > */ What's the definition of "very temporary holding" ? What's the behavioural difference? 
-adrian From owner-freebsd-arch@FreeBSD.ORG Tue Mar 3 00:12:28 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B460C55A; Tue, 3 Mar 2015 00:12:28 +0000 (UTC) Received: from mail-yh0-x22a.google.com (mail-yh0-x22a.google.com [IPv6:2607:f8b0:4002:c01::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6DAE9E2D; Tue, 3 Mar 2015 00:12:28 +0000 (UTC) Received: by yhzz6 with SMTP id z6so16540643yhz.5; Mon, 02 Mar 2015 16:12:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=ZBOXDfpNH5/5TPuolhIszIGBiZQ9Scrk+w26AthIBXI=; b=hzHH1Ax+VVK3VnnsIM0Px/H3t+Y7EoNT7eSS0ONS8Rfj3UA9HryauuikdHt2jbia+f NbxA/9cSaBYxJs+xao9865xQUSZqADpF6iU78sZep+2AokKcSNDWYl/m6WzYViF+2oPb nViIuXYSazbDhFVBgTZRyoXDQ+JesbXkhIM882jFE4uMwBhcrIMUbVnUdm6C0WJ47Va6 PnnN1mLp4yEr3VupJusW3+seB+qjpRUrBQTJoy9sGFd34qBc+hUZpfqZ3a2z7R59eab0 F5WINDu/7N9deIHen71+EoAgAmIabUszFY/SKMzmSP4P0vkC05nlXDux5tI1iAjQQiMr AkEg== MIME-Version: 1.0 X-Received: by 10.236.25.68 with SMTP id y44mr28269936yhy.4.1425341547450; Mon, 02 Mar 2015 16:12:27 -0800 (PST) Sender: kmacybsd@gmail.com Received: by 10.170.76.66 with HTTP; Mon, 2 Mar 2015 16:12:27 -0800 (PST) In-Reply-To: References: Date: Mon, 2 Mar 2015 16:12:27 -0800 X-Google-Sender-Auth: MSiKnxdPMREYCl4AuzX1lBOFQh0 Message-ID: Subject: Re: Doing zero-copy stuff in drivers, or "is vm_fault_quick_hold_pages() enough" ? From: "K. 
Macy" To: Adrian Chadd Content-Type: text/plain; charset=UTF-8 Cc: freebsd-current , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Mar 2015 00:12:28 -0000 >> Right above vm_page_hold(): >> /* >> * Keep page from being freed by the page daemon >> * much of the same effect as wiring, except much lower >> * overhead and should be used only for *very* temporary >> * holding ("wiring"). >> */ > > What's the definition of "very temporary holding" ? What's the > behavioural difference? Long enough to complete a DMA operation versus the lifetime of an executing program. From owner-freebsd-arch@FreeBSD.ORG Tue Mar 3 00:17:04 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6421E7C8; Tue, 3 Mar 2015 00:17:04 +0000 (UTC) Received: from mail-ig0-x22c.google.com (mail-ig0-x22c.google.com [IPv6:2607:f8b0:4001:c05::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 265D1E61; Tue, 3 Mar 2015 00:17:04 +0000 (UTC) Received: by igal13 with SMTP id l13so20554386iga.1; Mon, 02 Mar 2015 16:17:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=O6ma+twoQqaH6CIF07PMVGVcafyydbyRFgPdmcxQ/ik=; b=WJNjyD8JM5yVSjifWXw04itVBGbXCOTSf20q+TcyfBwvk/qOXJTQJgYB7Qzwp90NWu +BFCy5qtOJR8+BEg5VjwptJemlrWr7NDYd6MMdS1HI63Hca+FSM0GsjyMv/Sh6ivebiO 
TLNVmlZLDrFGjPN44Uj3KRwz+XMalyGGfPdoiGyrid8znimeRJaukdH1HL5EaktdWsGF 7ZAg6bi+Oc0V3nekBHGoQW4ujmPIsFnyrC6iPArJpgy8kOE/b+feVi6w4NzxXPxj9F/E N6nU9ofW+ot476wR+B2n63D/yOvD3DJZYvfIjx/daE32GackFO14pXfS4tsMxvZXI45j SwRA== MIME-Version: 1.0 X-Received: by 10.50.43.201 with SMTP id y9mr25934985igl.6.1425341823612; Mon, 02 Mar 2015 16:17:03 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.36.17.66 with HTTP; Mon, 2 Mar 2015 16:17:03 -0800 (PST) In-Reply-To: References: Date: Mon, 2 Mar 2015 16:17:03 -0800 X-Google-Sender-Auth: LKAbD33Rc4Sd6D7PP2BMOdeFCGQ Message-ID: Subject: Re: Doing zero-copy stuff in drivers, or "is vm_fault_quick_hold_pages() enough" ? From: Adrian Chadd To: "K. Macy" Content-Type: text/plain; charset=UTF-8 Cc: freebsd-current , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Mar 2015 00:17:04 -0000 On 2 March 2015 at 16:12, K. Macy wrote: >>> Right above vm_page_hold(): >>> /* >>> * Keep page from being freed by the page daemon >>> * much of the same effect as wiring, except much lower >>> * overhead and should be used only for *very* temporary >>> * holding ("wiring"). >>> */ >> >> What's the definition of "very temporary holding" ? What's the >> behavioural difference? > > Long enough to complete a DMA operation versus the lifetime of an > executing program. Ok, but is there a specific time length that this should be? A DMA operation to a slow device could be up to hundreds of milliseconds; or seconds if things are really backed up. Using wire instead of hold definitely made things work without having the page disappear from underneath it. Oleksander knows more about the details of that. 
-adrian From owner-freebsd-arch@FreeBSD.ORG Tue Mar 3 00:25:42 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C510DD80; Tue, 3 Mar 2015 00:25:42 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 30E58F5D; Tue, 3 Mar 2015 00:25:41 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t230PaeG079640 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 3 Mar 2015 02:25:36 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t230PaeG079640 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t230PaZK079639; Tue, 3 Mar 2015 02:25:36 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 3 Mar 2015 02:25:36 +0200 From: Konstantin Belousov To: Adrian Chadd Subject: Re: Doing zero-copy stuff in drivers, or "is vm_fault_quick_hold_pages() enough" ? Message-ID: <20150303002536.GE2379@kib.kiev.ua> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-current , "K. 
Macy" , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Mar 2015 00:25:42 -0000

On Mon, Mar 02, 2015 at 04:17:03PM -0800, Adrian Chadd wrote:
> On 2 March 2015 at 16:12, K. Macy wrote:
> >>> Right above vm_page_hold():
> >>> /*
> >>> * Keep page from being freed by the page daemon
> >>> * much of the same effect as wiring, except much lower
> >>> * overhead and should be used only for *very* temporary
> >>> * holding ("wiring").
> >>> */
> >>
> >> What's the definition of "very temporary holding" ? What's the
> >> behavioural difference?
> >
> > Long enough to complete a DMA operation versus the lifetime of an
> > executing program.
>
> Ok, but is there a specific time length that this should be?
The difference between hold and wire is effectively that held pages are
still kept on the page queues, providing potentially unnecessary work
for pagedaemon to find them and skip. Wired pages are removed from the
queues.

This means that holding a page is much cheaper, at the cost of leaving
slightly more work to the system. Also, holding a page only requires the
page lock, while wiring contends on the page queue lock, in addition to
the page lock.

>
> A DMA operation to a slow device could be up to hundreds of
> milliseconds; or seconds if things are really backed up.
>
> Using wire instead of hold definitely made things work without having
> the page disappear from underneath it. Oleksander knows more about the
> details of that.

A page cannot 'disappear'. The only thing which could happen with the
memory page is reuse, when the page is removed from the previous object
and re-purposed for some other object, losing the old content.

Your terminology suggests that something unrelated happened.
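
The hold/wire distinction described in this message can be pictured with a small user-space model. This is a sketch under loud assumptions: struct toy_page, the toy_ functions and toy_pagedaemon_scan() are hypothetical names invented here, the model ignores locking entirely, and it is not how the VM is implemented. It only captures the two behaviours stated above: a held page stays on the queue and the scanner has to visit and skip it, while a wired page is taken off the queue and is never visited.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VM page; NOT the kernel's struct vm_page. */
struct toy_page {
        int hold_count;         /* short-term holds */
        int wire_count;         /* wirings */
        bool on_queue;          /* still visible to the toy page daemon? */
};

/* Holding only bumps a counter; the page stays on its queue. */
void
toy_page_hold(struct toy_page *p)
{
        p->hold_count++;
}

void
toy_page_unhold(struct toy_page *p)
{
        assert(p->hold_count > 0);
        p->hold_count--;
}

/* The first wiring takes the page off the queue. */
void
toy_page_wire(struct toy_page *p)
{
        if (p->wire_count++ == 0)
                p->on_queue = false;
}

/* Dropping the last wiring puts the page back on the queue. */
void
toy_page_unwire(struct toy_page *p)
{
        assert(p->wire_count > 0);
        if (--p->wire_count == 0)
                p->on_queue = true;
}

/*
 * Walk the queued pages.  Returns how many could be reused and reports
 * how many pages were visited and how many were skipped because they
 * were held.
 */
size_t
toy_pagedaemon_scan(const struct toy_page *pages, size_t n,
    size_t *visited, size_t *skipped)
{
        size_t reclaimable = 0;

        *visited = 0;
        *skipped = 0;
        for (size_t i = 0; i < n; i++) {
                if (!pages[i].on_queue)
                        continue;       /* wired: never seen by the scan */
                (*visited)++;
                if (pages[i].hold_count > 0)
                        (*skipped)++;   /* held: found, then skipped */
                else
                        reclaimable++;
        }
        return (reclaimable);
}
```

With three queued pages, one held and one wired, a scan in this model visits two pages, skips one and finds one reclaimable; the wired page costs the scanner nothing, which is the extra work the message attributes to holding.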
From owner-freebsd-arch@FreeBSD.ORG Tue Mar 3 00:37:14 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F2894103; Tue, 3 Mar 2015 00:37:13 +0000 (UTC) Received: from mail-ig0-x22d.google.com (mail-ig0-x22d.google.com [IPv6:2607:f8b0:4001:c05::22d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B2C63E1; Tue, 3 Mar 2015 00:37:13 +0000 (UTC) Received: by igbhn18 with SMTP id hn18so22318172igb.2; Mon, 02 Mar 2015 16:37:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=/ywBmXikn/5JrH1PePyqOs1uXBgi5ufbpnJ1XlzX2A4=; b=G2JdhU4ZYJ94WCukjJldctNbC4WunxmZvy3gKRSoN4YTrEbYdWvT+eccXHhICYZAYH swdtjcEF1AIzaWz7vqcjivMmYK1uE9g7KMp1585Iaw+QswDOgTGy95FppsYJm4KWdqxz 38Os+5EtX/X5FuFzgtzPc3F4RIsNBj2nXLjJ0XdO+fWCI8VVF7yVX6o4C8F6i7guuNP+ ebQp9qZacjGu5ReH0g86bATcCRisJpbtHad4ifsVpRy0VOkxH7bvzzrL0PMfdFUEoLx2 FArys+znk6lhkbcCIP03jTAXut8jSb0g/WwWXEtAzvwQMO6lNHahhK7knIZvQ2r1eSky vq4w== MIME-Version: 1.0 X-Received: by 10.42.188.133 with SMTP id da5mr34081038icb.37.1425343033199; Mon, 02 Mar 2015 16:37:13 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.36.17.66 with HTTP; Mon, 2 Mar 2015 16:37:13 -0800 (PST) In-Reply-To: <20150303002536.GE2379@kib.kiev.ua> References: <20150303002536.GE2379@kib.kiev.ua> Date: Mon, 2 Mar 2015 16:37:13 -0800 X-Google-Sender-Auth: 3Um-zdb7MZ5WGA0o8Yr7A8iCP74 Message-ID: Subject: Re: Doing zero-copy stuff in drivers, or "is vm_fault_quick_hold_pages() enough" ? 
From: Adrian Chadd To: Konstantin Belousov Content-Type: text/plain; charset=UTF-8 Cc: freebsd-current , "K. Macy" , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Mar 2015 00:37:14 -0000

On 2 March 2015 at 16:25, Konstantin Belousov wrote:

>> Ok, but is there a specific time length that this should be?
> The difference between hold and wire is effectively that held pages are
> still kept on the page queues, providing potentially unnecessary work
> for pagedaemon to find them and skip. Wired pages are removed from the
> queues.
>
> This means that holding a page is much cheaper, at the cost of leaving
> slightly more work to the system. Also, holding a page only requires the
> page lock, while wiring contends on the page queue lock, in addition to
> the page lock.

Thanks for the description - that makes things a lot clearer!

>>
>> A DMA operation to a slow device could be up to hundreds of
>> milliseconds; or seconds if things are really backed up.
>>
>> Using wire instead of hold definitely made things work without having
>> the page disappear from underneath it. Oleksander knows more about the
>> details of that.
>
> A page cannot 'disappear'. The only thing which could happen with the
> memory page is reuse, when the page is removed from the previous object
> and re-purposed for some other object, losing the old content.
>
> Your terminology suggests that something unrelated happened.
Yup, and that's what I'm worried about :( -adrian From owner-freebsd-arch@FreeBSD.ORG Wed Mar 4 16:23:27 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 085182C3 for ; Wed, 4 Mar 2015 16:23:27 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D4631BBC for ; Wed, 4 Mar 2015 16:23:26 +0000 (UTC) Received: from ralph.baldwin.cx (pool-173-54-116-245.nwrknj.fios.verizon.net [173.54.116.245]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id A952AB926; Wed, 4 Mar 2015 11:23:25 -0500 (EST) From: John Baldwin To: freebsd-arch@freebsd.org Subject: Re: Minor ULE changes and optimizations Date: Wed, 04 Mar 2015 10:29:27 -0500 Message-ID: <1843154.4TcuH8bhtB@ralph.baldwin.cx> User-Agent: KMail/4.14.2 (FreeBSD/10.1-STABLE; KDE/4.14.2; amd64; ; ) In-Reply-To: <54F4DE0D.7070606@astrodoggroup.com> References: <54EF2C54.7030207@astrodoggroup.com> <5490895.NN1ciTh6gZ@ralph.baldwin.cx> <54F4DE0D.7070606@astrodoggroup.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 04 Mar 2015 11:23:25 -0500 (EST) Cc: Harrison Grundy X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Mar 2015 16:23:27 -0000 On Monday, March 02, 2015 02:02:53 PM Harrison Grundy wrote: > > But why would a driver want to do that? 
This code: > > sched_pin(td); > > > > /* do something */ > > > > thread_lock(td); > > sched_unpin(td); > > sched_bind(td, PCPU_GET(cpuid)); > > thread_unlock(td); > > > > /* do something else */ > > > > thread_lock(td); > > sched_unbind(td); > > thread_unlock(td); > > > > Is equivalent to: > > sched_pin(td); > > > > /* do something */ > > > > /* do something else */ > > > > sched_unpin(td); > > > > But the latter form is lighter weight and easier to read / understand. > > > > Letting you sched_bind() to the current CPU while you are pinned doesn't > > enable any new functionality than you can already achieve by just using > > sched_pin() and sched_unpin(). > > The difference between the two is that TSF_BOUND is set for "do > something else" in the former case. > > As I understand the difference, sched_pin is designed for temporarily > assigning to a CPU, while sched_bind is intended for longer-term affinity. sched_bind() calls sched_pin() internally, so that isn't the difference. The flag only exists to know which type of pinning is in force. The differences are that sched_pin() assumes the current CPU rather than a specific CPU and that it can nest, whereas sched_bind() is used to move a thread to a specific CPU and it cannot nest. It is true that one cannot bind a pinned thread, but that is because the semantics conflict, not because one is longer term than the other. You can't return from a system call using either sched_bind or sched_pin for example. For a longer term binding you need to set the thread's cpuset instead. sched_pin() and sched_bind() are both "short term" in that regard. > The patch would allow you to set the bound flag without unpinning, > basically. It seems easier to do this here, than add a "set bound flag" > function that allows drivers to "promote" themselves from pinned to > bound, though that would also be an option to get to the same place. I don't see a use case for why a driver would want to do this. 
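To make the distinction above concrete, here is a toy userland model of the semantics as described in this message: pinning nests, binding does not nest, binding pins internally, and a thread that is already pinned cannot be bound. This is only a sketch under those stated assumptions. It is not the kernel implementation, and `struct model_thread` and every `model_*` name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; NOT the kernel's struct thread. All names are hypothetical. */
struct model_thread {
	int  cpu;     /* CPU the thread currently runs on */
	int  pinned;  /* nesting count, in the spirit of sched_pin() */
	bool bound;   /* stands in for the TSF_BOUND flag */
};

/* Pin to whatever CPU the thread is on right now; calls may nest. */
static void model_pin(struct model_thread *td)
{
	td->pinned++;
}

static void model_unpin(struct model_thread *td)
{
	assert(td->pinned > 0);
	td->pinned--;
}

/* A thread may change CPU only when nothing holds it in place. */
static bool model_can_migrate(const struct model_thread *td)
{
	return td->pinned == 0;
}

/*
 * Move the thread to a specific CPU. Binding does not nest: rebinding
 * replaces the previous binding. Fails (returns false, no state change)
 * if the thread holds a pin of its own, since the semantics conflict.
 */
static bool model_bind(struct model_thread *td, int cpu)
{
	int own_pins = td->pinned - (td->bound ? 1 : 0);

	if (own_pins > 0)
		return false;
	if (td->bound) {
		/* Drop the old binding and its internal pin first. */
		td->bound = false;
		model_unpin(td);
	}
	td->cpu = cpu;
	td->bound = true;
	model_pin(td);	/* binding pins internally */
	return true;
}

static void model_unbind(struct model_thread *td)
{
	if (!td->bound)
		return;
	td->bound = false;
	model_unpin(td);
}
```

In this model, two nested `model_pin()` calls need two `model_unpin()` calls, whereas two `model_bind()` calls in a row are undone by a single `model_unbind()`.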
Do you have a specific real world use case in mind? -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 01:46:03 2015 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A91F03B6; Fri, 6 Mar 2015 01:46:03 +0000 (UTC) Received: from mail-ie0-x236.google.com (mail-ie0-x236.google.com [IPv6:2607:f8b0:4001:c03::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6E37C935; Fri, 6 Mar 2015 01:46:03 +0000 (UTC) Received: by iecrd18 with SMTP id rd18so81909821iec.5; Thu, 05 Mar 2015 17:46:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=eA5Ps1rPFN1g0mpel7k3k5CZOY6Io0ffDFqU17XqJKs=; b=z8GHFoKt1Pi9GY9u7cQutwWEMeo9u5giprjIBYuwS2/KUdy6ZooKfoBdn45ohYmYSG PpmzP0q9l67NhJuFQKPaqvWEZSIJrgfIXaKpPxtGAl/afc0wlNsE7jQK0OOpdzWUTFw/ GZ/FC++dOOTvRvqBvWcFfZv+DQZ4nExk9ZzfvggLCVRHn2YvkF/fAIuZan7wYaWSwFAQ ZoaLseGIyvIfR70paDJEVZfcUiofcy0NCe4Qdb2qhxLfA0u86SUIkYAcPRHDsGMwHhgZ 9qh7cYhSXA3S2BrJwHcNQ62KvCKTdPKyheSmVNemgKgA3duq1EojXWtlIOfp8XHLCsUB XewA== MIME-Version: 1.0 X-Received: by 10.50.93.70 with SMTP id cs6mr50547601igb.6.1425606361842; Thu, 05 Mar 2015 17:46:01 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.36.17.66 with HTTP; Thu, 5 Mar 2015 17:46:01 -0800 (PST) In-Reply-To: <6632720.8QN4idWR9d@ralph.baldwin.cx> References: <1848011.eGOHhpCEMm@ralph.baldwin.cx> <2650364.MV3AvSBuVe@ralph.baldwin.cx> <6632720.8QN4idWR9d@ralph.baldwin.cx> Date: Thu, 5 Mar 2015 17:46:01 -0800 X-Google-Sender-Auth: -K4HUQwvbOrNjJTSQ9yqeURT0gU Message-ID: Subject: Re: RFC: bus_get_cpus(9) From: 
Adrian Chadd To: John Baldwin Content-Type: text/plain; charset=UTF-8 Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 01:46:03 -0000 [snip] So I'm happy with this. I'll do up a conversion to ixgbe once it lands in the tree, and then I'll (eventually) extend RSS to support this. Want to wrap it up in a review so we can get it into -HEAD? Thanks, -adrian From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 20:48:00 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8F368B96; Fri, 6 Mar 2015 20:48:00 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 69106F6E; Fri, 6 Mar 2015 20:48:00 +0000 (UTC) Received: from ralph.baldwin.cx (pool-173-54-116-245.nwrknj.fios.verizon.net [173.54.116.245]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 3289BB915; Fri, 6 Mar 2015 15:47:56 -0500 (EST) From: John Baldwin To: freebsd-arch@freebsd.org Subject: RFC: Simplfying hyperthreading distinctions Date: Fri, 06 Mar 2015 15:44:06 -0500 Message-ID: <1640664.8z9mx3EOQs@ralph.baldwin.cx> User-Agent: KMail/4.14.2 (FreeBSD/10.1-STABLE; KDE/4.14.2; amd64; ; ) MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 06 Mar 2015 15:47:56 -0500 (EST) Cc: 'Andriy Gapon' X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 
2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 20:48:00 -0000

Currently we go out of our way a bit to distinguish Pentium4-era hyperthreading from more recent ("modern") hyperthreading. I suspect that this distinction probably results in confusion more than anything else. Intel's documentation does not make near as broad a distinction as far as I can tell. Both types of SMT are called hyperthreading in the SDM for example. However, we have the astonishing behavior that 'machdep.hyperthreading_allowed' only affects "old" hyperthreads, but not "new" ones. We also try to be overly cute in our dmesg output by using HTT for "old" hyperthreading, and SMT for "new" hyperthreading. I propose the following changes to simplify things a bit:

1) Call both "old" and "new" hyperthreading HTT in dmesg.

2) Change machdep.hyperthreading_allowed to apply to both new and old HTT. However, doing this means a POLA violation in that we would now disable modern HTT by default. Balanced against re-enabling "old" HTT by default on an increasingly-shrinking pool of old hardware, I think the better approach here would be to also change the default to allow HTT.

3) Possibly add a different knob (or change the behavior of machdep.hyperthreading_allowed) to still bring up hyperthreads, but leave them out of the default cpuset (set 1). This would allow those threads to be re-enabled dynamically at runtime by adjusting the mask on set 1. The original htt settings back when 'hyperthreading_allowed' was introduced actually permitted this by adjusting 'machdep.hlt_cpus' at runtime.

What do people think?
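As a rough illustration of option 3 (not code from the tree), the idea of bringing up every logical CPU but leaving hyperthread siblings out of the default set can be modelled with a plain bitmask. The sketch assumes that sibling threads of one core have consecutive logical CPU IDs and that there are at most 64 CPUs; real topology discovery does not work this way, and `default_set_mask()` and `enable_cpu()` are hypothetical names.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build the mask for the default set.
 * Assumes threads_per_core >= 1 and that the first logical CPU of
 * each core is the "primary" thread (an assumption for this sketch).
 */
static uint64_t default_set_mask(int ncpus, int threads_per_core, int allow_ht)
{
	uint64_t mask = 0;

	for (int cpu = 0; cpu < ncpus && cpu < 64; cpu++) {
		int is_sibling = (cpu % threads_per_core) != 0;

		/* Every CPU is "up"; siblings are merely left out of the mask. */
		if (allow_ht || !is_sibling)
			mask |= UINT64_C(1) << cpu;
	}
	return mask;
}

/* Re-enable one CPU at runtime by adding it back to the mask. */
static uint64_t enable_cpu(uint64_t mask, int cpu)
{
	return mask | (UINT64_C(1) << cpu);
}
```

With 8 logical CPUs and 2 threads per core, the model leaves CPUs 0, 2, 4 and 6 in the default set when hyperthreads are excluded, and a later `enable_cpu()` call adds a sibling back without a reboot.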
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 20:50:39 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B8DC8D6C for ; Fri, 6 Mar 2015 20:50:39 +0000 (UTC) Received: from mailout.easymail.ca (mailout.easymail.ca [64.68.201.169]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4BAB1A9 for ; Fri, 6 Mar 2015 20:50:38 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mailout.easymail.ca (Postfix) with ESMTP id 9097BE3E5 for ; Fri, 6 Mar 2015 15:50:36 -0500 (EST) X-Virus-Scanned: Debian amavisd-new at mailout.easymail.ca X-Spam-Flag: NO X-Spam-Score: -3.845 X-Spam-Level: X-Spam-Status: No, score=-3.845 required=5 tests=[ALL_TRUSTED=-1.8, AWL=-0.138, BAYES_00=-2.599, DNS_FROM_AHBL_RHSBL=0.692] Received: from mailout.easymail.ca ([127.0.0.1]) by localhost (easymail-mailout.easydns.vpn [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XCOpjyaia+KJ for ; Fri, 6 Mar 2015 15:50:36 -0500 (EST) Received: from bsddt1241.lv01.astrodoggroup.com (unknown [40.141.24.126]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mailout.easymail.ca (Postfix) with ESMTPSA id 2BB3AE3DF for ; Fri, 6 Mar 2015 15:50:36 -0500 (EST) Message-ID: <54FA1180.3080605@astrodoggroup.com> Date: Fri, 06 Mar 2015 12:43:44 -0800 From: Harrison Grundy User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: freebsd-arch@freebsd.org Subject: Re: RFC: Simplfying hyperthreading distinctions References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> In-Reply-To: <1640664.8z9mx3EOQs@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-BeenThere: 
freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 20:50:39 -0000 On 03/06/15 12:44, John Baldwin wrote: > Currently we go out of our way a bit to distinguish Pentium4-era > hyperthreading from more recent ("modern") hyperthreading. I > suspect that this distinction probably results in confusion more > than anything else. Intel's documentation does not make near as > broad a distinction as far as I can tell. Both types of SMT are > called hyperthreading in the SDM for example. However, we have the > astonishing behavior that 'machdep.hyperthreading_allowed' only > affects "old" hyperthreads, but not "new" ones. We also try to be > overly cute in our dmesg output by using HTT for "old" > hyperthreading, and SMT for "new" hyperthreading. I propose the > following changes to simplify things a bit: > > 1) Call both "old" and "new" hyperthreading HTT in dmesg. > > 2) Change machdep.hyperthreading_allowed to apply to both new and > old HTT. However, doing this means a POLA violation in that we > would now disable modern HTT by default. Balanced against > re-enabling "old" HTT by default on an increasingly-shrinking pool > of old hardware, I think the better approach here would be to also > change the default to allow HTT. > > 3) Possibly add a different knob (or change the behavior of > machdep.hyperthreading_allowed) to still bring up hyperthreads, but > leave them out of the default cpuset (set 1). This would allow > those threads to be re-enabled dynamically at runtime by adjusting > the mask on set 1. The original htt settings back when > 'hyperthreading_allowed' was introduced actually permitted this via > by adjusting 'machdep.hlt_cpus' at runtime. > > What do people think? 
> I'm not sure of how interrupt handling works as it relates to HTT, but wouldn't using cpuset potentially leave them active for interrupt handling? Other than that question, this all makes sense to me. --- Harrison From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 20:55:29 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5856D313; Fri, 6 Mar 2015 20:55:29 +0000 (UTC) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 10FBEED; Fri, 6 Mar 2015 20:55:29 +0000 (UTC) Received: from slw by zxy.spb.ru with local (Exim 4.84 (FreeBSD)) (envelope-from ) id 1YTzHE-00097g-7u; Fri, 06 Mar 2015 23:55:20 +0300 Date: Fri, 6 Mar 2015 23:55:20 +0300 From: Slawa Olhovchenkov To: John Baldwin Subject: Re: RFC: Simplfying hyperthreading distinctions Message-ID: <20150306205520.GA95179@zxy.spb.ru> References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1640664.8z9mx3EOQs@ralph.baldwin.cx> User-Agent: Mutt/1.5.23 (2014-03-12) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false Cc: 'Andriy Gapon' , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 20:55:29 -0000 On Fri, Mar 06, 2015 at 03:44:06PM -0500, John Baldwin wrote: > Currently we go out of our way a bit to distinguish Pentium4-era > hyperthreading from more recent 
("modern") hyperthreading. I suspect that > this distinction probably results in confusion more than anything else. > Intel's documentation does not make near as broad a distinction as far as I > can tell. Both types of SMT are called hyperthreading in the SDM for example. > However, we have the astonishing behavior that > 'machdep.hyperthreading_allowed' only affects "old" hyperthreads, but not > "new" ones. We also try to be overly cute in our dmesg output by using HTT > for "old" hyperthreading, and SMT for "new" hyperthreading. I propose the > following changes to simplify things a bit: > > 1) Call both "old" and "new" hyperthreading HTT in dmesg. > > 2) Change machdep.hyperthreading_allowed to apply to both new and old HTT. > However, doing this means a POLA violation in that we would now disable > modern HTT by default. Balanced against re-enabling "old" HTT by default > on an increasingly-shrinking pool of old hardware, I think the better > approach here would be to also change the default to allow HTT. > 3) Possibly add a different knob (or change the behavior of > machdep.hyperthreading_allowed) to still bring up hyperthreads, but leave > them out of the default cpuset (set 1). This would allow those threads > to be re-enabled dynamically at runtime by adjusting the mask on set 1. > The original htt settings back when 'hyperthreading_allowed' was > introduced actually permitted this via by adjusting 'machdep.hlt_cpus' at > runtime. > > What do people think?

Have you experimented with 3)? And compared it with HTT/SMT disabled in the BIOS? My experience (for my workload) with SMT is very bad -- unpredictable performance in paired threads does not allow building a high (and predictable) performance system.
From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 20:59:23 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 862AF50B; Fri, 6 Mar 2015 20:59:23 +0000 (UTC) Received: from st11p02mm-asmtp001.mac.com (st11p02mm-asmtp001.mac.com [17.172.220.236]) (using TLSv1.2 with cipher DHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5B3E3125; Fri, 6 Mar 2015 20:59:23 +0000 (UTC) Received: from [10.60.0.53] (209-23-203-214-Illinois.hfc.comcastbusiness.net [209.23.203.214]) by st11p02mm-asmtp001.mac.com (Oracle Communications Messaging Server 7.0.5.35.0 64bit (built Dec 4 2014)) with ESMTPSA id <0NKT00EL66AGWU40@st11p02mm-asmtp001.mac.com>; Fri, 06 Mar 2015 20:59:06 +0000 (GMT) X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.13.68,1.0.33,0.0.0000 definitions=2015-03-06_06:2015-03-06,2015-03-06,1970-01-01 signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 suspectscore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=7.0.1-1412110000 definitions=main-1503060232 Content-type: text/plain; charset=us-ascii MIME-version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Subject: Re: RFC: Simplfying hyperthreading distinctions From: Rui Paulo In-reply-to: <1640664.8z9mx3EOQs@ralph.baldwin.cx> Date: Fri, 06 Mar 2015 12:59:04 -0800 Content-transfer-encoding: quoted-printable Message-id: <145BFBD8-7092-42F3-BCF3-5CBB78D15893@me.com> References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> To: John Baldwin X-Mailer: Apple Mail (2.2070.6) Cc: Andriy Gapon , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 20:59:23 -0000

> On 6 Mar 2015, at 12:44, John Baldwin wrote:
>
> Currently we go out of our way a bit to distinguish Pentium4-era
> hyperthreading from more recent ("modern") hyperthreading. I suspect that
> this distinction probably results in confusion more than anything else.
> Intel's documentation does not make near as broad a distinction as far as I
> can tell. Both types of SMT are called hyperthreading in the SDM for example.
> However, we have the astonishing behavior that
> 'machdep.hyperthreading_allowed' only affects "old" hyperthreads, but not
> "new" ones. We also try to be overly cute in our dmesg output by using HTT
> for "old" hyperthreading, and SMT for "new" hyperthreading.

Yes, this is annoying.

> I propose the
> following changes to simplify things a bit:
>
> 1) Call both "old" and "new" hyperthreading HTT in dmesg.

Yes.

> 2) Change machdep.hyperthreading_allowed to apply to both new and old HTT.
> However, doing this means a POLA violation in that we would now disable
> modern HTT by default. Balanced against re-enabling "old" HTT by default
> on an increasingly-shrinking pool of old hardware, I think the better
> approach here would be to also change the default to allow HTT.

I think that's ok given 3).

> 3) Possibly add a different knob (or change the behavior of
> machdep.hyperthreading_allowed) to still bring up hyperthreads, but leave
> them out of the default cpuset (set 1). This would allow those threads
> to be re-enabled dynamically at runtime by adjusting the mask on set 1.
> The original htt settings back when 'hyperthreading_allowed' was
> introduced actually permitted this via by adjusting 'machdep.hlt_cpus' at
> runtime.

Sounds good.
Thanks, -- Rui Paulo From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 21:37:05 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C2AF3ED1; Fri, 6 Mar 2015 21:37:05 +0000 (UTC) Received: from mail-ie0-x22f.google.com (mail-ie0-x22f.google.com [IPv6:2607:f8b0:4001:c03::22f]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8858A77C; Fri, 6 Mar 2015 21:37:05 +0000 (UTC) Received: by iecvy18 with SMTP id vy18so24966258iec.1; Fri, 06 Mar 2015 13:37:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=42GlYp3kKhD1sL5Dpo6bs0MAuSHvyjOIt3zFER71bqI=; b=I8ln/4eRbp3pNZBNwAOfN/SmjEh7j3pPmcDuThyS+Z4iSqqcebGmYf5UI87Ec5eq0h yR3p1eiwINi+u8UacPrwdaM3YebJXizwdPjMK0cx+F0SdEHdiyhAyhBDPaq8feqcAAx3 fIkny4qWrb5lMRFOYvI+k8AY4t/BlV1iUrVnw2xGYALfMYAOM7pmP4DzSmfO1jijhT/K Gx+z25LZmcHh8u5MCivZR1JhFEsnxe6ZkyZFickg/cdEgm3TR10j1+uZaN2FT63w2QBE QX8L5XEzldy7b0xzBH69h/kgEzJwNYsYBCnIcXl25OLfRld/QIZW53UXOROfyxN03uhA FCQQ== MIME-Version: 1.0 X-Received: by 10.50.43.201 with SMTP id y9mr30609618igl.6.1425677824970; Fri, 06 Mar 2015 13:37:04 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.36.17.66 with HTTP; Fri, 6 Mar 2015 13:37:04 -0800 (PST) In-Reply-To: <1640664.8z9mx3EOQs@ralph.baldwin.cx> References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> Date: Fri, 6 Mar 2015 13:37:04 -0800 X-Google-Sender-Auth: IHAsQkgKQM-8D-ylGP2jjswoX8s Message-ID: Subject: Re: RFC: Simplfying hyperthreading distinctions From: Adrian Chadd To: John Baldwin Content-Type: text/plain; charset=UTF-8 Cc: Andriy Gapon , 
"freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 21:37:05 -0000

Hi!

1) I'd rather we leave them as SMT/HTT as they're slightly different things. Who knows if Intel will re-introduce this stuff in their more embedded CPU line at a future time, or add another threading type in the future. Being told about the distinction is nice.

2) I'd rather we had it more clearly defined - machdep.htt_allowed / machdep.smt_allowed. Again, I'd rather have the distinction in case Intel decides again to make their embedded things use old-style threading. (The Intel Edison/Galileo boards use P1 style cores that are low power; I can imagine a world where they reuse HTT for that.)

3) I'd like that kind of tunable setting.

And:

4) Yes, I'd also like a machdep tunable for "don't bother routing interrupts to SMT / HTTs". You have that patch in your jhbbsd tree; I don't think it's in HEAD yet?
-adrian From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 21:46:07 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 00A71492 for ; Fri, 6 Mar 2015 21:46:06 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id CC9C586F for ; Fri, 6 Mar 2015 21:46:06 +0000 (UTC) Received: from ralph.baldwin.cx (pool-173-54-116-245.nwrknj.fios.verizon.net [173.54.116.245]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 5CFFFB980; Fri, 6 Mar 2015 16:46:05 -0500 (EST) From: John Baldwin To: freebsd-arch@freebsd.org Subject: Re: RFC: Simplfying hyperthreading distinctions Date: Fri, 06 Mar 2015 16:17:37 -0500 Message-ID: <1526311.uylCbgv5VB@ralph.baldwin.cx> User-Agent: KMail/4.14.2 (FreeBSD/10.1-STABLE; KDE/4.14.2; amd64; ; ) In-Reply-To: <54FA1180.3080605@astrodoggroup.com> References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> <54FA1180.3080605@astrodoggroup.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 06 Mar 2015 16:46:05 -0500 (EST) Cc: Harrison Grundy X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 21:46:07 -0000 On Friday, March 06, 2015 12:43:44 PM Harrison Grundy wrote: > On 03/06/15 12:44, John Baldwin wrote: > > Currently we go out of our way a bit to distinguish Pentium4-era > > hyperthreading from more recent ("modern") hyperthreading. 
I > > suspect that this distinction probably results in confusion more > > than anything else. Intel's documentation does not make near as > > broad a distinction as far as I can tell. Both types of SMT are > > called hyperthreading in the SDM for example. However, we have the > > astonishing behavior that 'machdep.hyperthreading_allowed' only > > affects "old" hyperthreads, but not "new" ones. We also try to be > > overly cute in our dmesg output by using HTT for "old" > > hyperthreading, and SMT for "new" hyperthreading. I propose the > > following changes to simplify things a bit: > > > > 1) Call both "old" and "new" hyperthreading HTT in dmesg. > > > > 2) Change machdep.hyperthreading_allowed to apply to both new and > > old HTT. However, doing this means a POLA violation in that we > > would now disable modern HTT by default. Balanced against > > re-enabling "old" HTT by default on an increasingly-shrinking pool > > of old hardware, I think the better approach here would be to also > > change the default to allow HTT. > > > > 3) Possibly add a different knob (or change the behavior of > > machdep.hyperthreading_allowed) to still bring up hyperthreads, but > > leave them out of the default cpuset (set 1). This would allow > > those threads to be re-enabled dynamically at runtime by adjusting > > the mask on set 1. The original htt settings back when > > 'hyperthreading_allowed' was introduced actually permitted this via > > by adjusting 'machdep.hlt_cpus' at runtime. > > > > What do people think? > > I'm not sure of how interrupt handling works as it relates to HTT, but > wouldn't using cpuset potentially leave them active for interrupt > handling? > > Other than that question, this all makes sense to me. Interrupt handling works differently. Per my commit a few minutes ago, we do not send interrupts to hyperthreads by default (either old or new). 
However, ithreads that are not explicitly bound to a specific CPU will "float" among all the CPUs in set 1 so 3) would affect that. Eventually I want to use a separate cpuset for interrupts that ithreads inherit from (rather than belonging to set 1). -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 21:58:38 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A8771868; Fri, 6 Mar 2015 21:58:38 +0000 (UTC) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 628D4967; Fri, 6 Mar 2015 21:58:38 +0000 (UTC) Received: from slw by zxy.spb.ru with local (Exim 4.84 (FreeBSD)) (envelope-from ) id 1YU0GR-000A0J-Bz; Sat, 07 Mar 2015 00:58:35 +0300 Date: Sat, 7 Mar 2015 00:58:35 +0300 From: Slawa Olhovchenkov To: Adrian Chadd Subject: Re: RFC: Simplfying hyperthreading distinctions Message-ID: <20150306215835.GB95179@zxy.spb.ru> References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false Cc: Andriy Gapon , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 21:58:38 -0000 On Fri, Mar 06, 2015 at 01:37:04PM -0800, Adrian Chadd wrote: > Hi! > > 1) I'd rather we leave them as SMT/HTT as they're slightly different > things. 
Who knows if intel will re-introduce this stuff in their more > embedded CPU line at a future time, or add another threading type in > the future. Being told about the distinction is nice.

Maybe a diagnostic of HTT[SMT] or HTT[HTT] would be the best choice?

> 2) I'd rather we had it more clearly defind - machdep.htt_allowed / > machdep.smt_allowed . Again, I'd rather have the distinction in case > Intel decide again to make their embedded things use old-style > threading. (The intel edison/galilleo boards use P1 style cores that > are low power, I can imagine a world where they reuse HTT for that.)

I don't think this distinction is needed -- in any case this setup is per-box. If you need to disable HTT/SMT, you don't need to choose between machdep.htt_allowed and machdep.smt_allowed -- only one exists.

From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 22:01:20 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A46F0A47; Fri, 6 Mar 2015 22:01:20 +0000 (UTC) Received: from mail-ie0-x233.google.com (mail-ie0-x233.google.com [IPv6:2607:f8b0:4001:c03::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 66F4CA15; Fri, 6 Mar 2015 22:01:20 +0000 (UTC) Received: by iecrp18 with SMTP id rp18so21309295iec.10; Fri, 06 Mar 2015 14:01:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=+v95Hrd3atEhlRpWIYYrAfsOyC+lDxfpeKQNDDf2Z7k=; b=LPrWuBUjO2rRp8Fnbuv0wmvMVz8fFGcN7X4f87RCWDb7j8EIn6xOXdTM7K3nV/jJu3 sFP9Gu/Xo1tXI5xKCb9mLBK/3rO2dtSSOytBopcPAhZ25tqd1q7urqCv+hQVjTXGALpF
UafT+PiPie+8J62mTf3EKlC94o66KnpMAjUDC1vXf/8z+4raXXtVvY1MMJckcYDYSsg+ K40npnJ0DBdOl66Zzop4+PQtmkyGJ6s/Et2uGMuxdpa8KISg5cpHioa5vb3r+sbc32wY 1iRgHBA8laPr8zRs8aJx+TTy4qkpGEKZgwyIh5IlDvfGqf9VIlyCnJ6Nx5eiGHndgdGV n4zQ== MIME-Version: 1.0 X-Received: by 10.107.136.230 with SMTP id s99mr30750931ioi.8.1425679279745; Fri, 06 Mar 2015 14:01:19 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.36.17.66 with HTTP; Fri, 6 Mar 2015 14:01:19 -0800 (PST) In-Reply-To: <20150306215835.GB95179@zxy.spb.ru> References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> <20150306215835.GB95179@zxy.spb.ru> Date: Fri, 6 Mar 2015 14:01:19 -0800 X-Google-Sender-Auth: Rv4Gy8Txg8etT3XRKu56IOZS6Us Message-ID: Subject: Re: RFC: Simplfying hyperthreading distinctions From: Adrian Chadd To: Slawa Olhovchenkov Content-Type: text/plain; charset=UTF-8 Cc: Andriy Gapon , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 22:01:20 -0000

If you're looking to ship a single image that runs on a variety of platforms (say you're pfSense, just as an example) and you want to set defaults that say "hey, I don't mind doing some processing on SMT, but HTT is no thanks", then how do you achieve that?
-a From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 22:08:43 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B9B0BD60; Fri, 6 Mar 2015 22:08:43 +0000 (UTC) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 728D4A72; Fri, 6 Mar 2015 22:08:43 +0000 (UTC) Received: from slw by zxy.spb.ru with local (Exim 4.84 (FreeBSD)) (envelope-from ) id 1YU0QD-000AAB-JF; Sat, 07 Mar 2015 01:08:41 +0300 Date: Sat, 7 Mar 2015 01:08:41 +0300 From: Slawa Olhovchenkov To: Adrian Chadd Subject: Re: RFC: Simplfying hyperthreading distinctions Message-ID: <20150306220841.GC95179@zxy.spb.ru> References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> <20150306215835.GB95179@zxy.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false Cc: Andriy Gapon , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 22:08:43 -0000 On Fri, Mar 06, 2015 at 02:01:19PM -0800, Adrian Chadd wrote: > If you're looking to ship a single image that runs on a variety of > platforms (say, you're pfsense, just say) and you want to set defaults > that say "hey, i don't mind doing some processing on SMT, but HTT is > no thanks) then how do you achieve that? some platforms have SMT, some HTT, yes? 
and some one socket, some multi-socket... some 1Gbit NIC, some 2x10Gbit NIC, some 1 of 2x40Gbit NIC. All of these need individual tuning. Or smart auto-tuning. Yes, I have a similar case. I use smart auto-tuning (on first boot after setup). From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 23:14:13 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7F55891A; Fri, 6 Mar 2015 23:14:13 +0000 (UTC) Received: from st11p02mm-asmtp001.mac.com (st11p02mm-asmtpout001.mac.com [17.172.220.236]) (using TLSv1.2 with cipher DHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 50B66287; Fri, 6 Mar 2015 23:14:12 +0000 (UTC) Received: from rpaulo-dt.sj.pi-coral.com (unknown [12.218.212.178]) by st11p02mm-asmtp001.mac.com (Oracle Communications Messaging Server 7.0.5.35.0 64bit (built Dec 4 2014)) with ESMTPSA id <0NKT001VPCJJPI30@st11p02mm-asmtp001.mac.com>; Fri, 06 Mar 2015 23:14:09 +0000 (GMT) X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.13.68,1.0.33,0.0.0000 definitions=2015-03-06_07:2015-03-06,2015-03-06,1970-01-01 signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 suspectscore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=7.0.1-1412110000 definitions=main-1503060255 Content-type: text/plain; charset=us-ascii MIME-version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Subject: Re: RFC: Simplfying hyperthreading distinctions From: Rui Paulo In-reply-to: Date: Fri, 06 Mar 2015 15:14:06 -0800 Content-transfer-encoding: quoted-printable Message-id: References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> <20150306215835.GB95179@zxy.spb.ru> To: Adrian Chadd X-Mailer: Apple Mail (2.2070.6) Cc: "freebsd-arch@freebsd.org" , Andriy 
Gapon , Slawa Olhovchenkov X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 23:14:13 -0000 On 6 Mar 2015, at 14:01, Adrian Chadd wrote: > > If you're looking to ship a single image that runs on a variety of > platforms (say, you're pfsense, just say) and you want to set defaults > that say "hey, i don't mind doing some processing on SMT, but HTT is > no thanks) then how do you achieve that? Who cares? We're talking about Pentium 4 stuff. I'm with John: the distinction should go away. If Intel comes up with a new name, we'll just think about it when it happens. -- Rui Paulo From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 23:42:25 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 54BFCDFB; Fri, 6 Mar 2015 23:42:25 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2CC027A7; Fri, 6 Mar 2015 23:42:25 +0000 (UTC) Received: from ralph.baldwin.cx (pool-173-54-116-245.nwrknj.fios.verizon.net [173.54.116.245]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 66722B918; Fri, 6 Mar 2015 18:42:23 -0500 (EST) From: John Baldwin To: Adrian Chadd Subject: Re: RFC: Simplfying hyperthreading distinctions Date: Fri, 06 Mar 2015 18:41:48 -0500 Message-ID: <2092193.qt8NhEKglv@ralph.baldwin.cx> User-Agent: KMail/4.14.2 (FreeBSD/10.1-STABLE; KDE/4.14.2; amd64; ; ) In-Reply-To: References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit 
Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 06 Mar 2015 18:42:23 -0500 (EST) Cc: Andriy Gapon , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 23:42:25 -0000 On Friday, March 06, 2015 01:37:04 PM Adrian Chadd wrote: > Hi! > > 1) I'd rather we leave them as SMT/HTT as they're slightly different > things. Who knows if Intel will re-introduce this stuff in their more > embedded CPU line at a future time, or add another threading type in > the future. Being told about the distinction is nice. > 2) I'd rather we had it more clearly defined - machdep.htt_allowed / > machdep.smt_allowed . Again, I'd rather have the distinction in case > Intel decides again to make their embedded things use old-style > threading. (The Intel Edison/Galileo boards use P1 style cores that > are low power, I can imagine a world where they reuse HTT for that.) I don't think Netburst is coming back. Even the Atom stuff is based on the PIII/Core line, not Netburst (and Atom CPUs that support HTT use the newer-style HTT, not Netburst). In the SDM it mostly seems to be Netburst vs everything else where the distinction is made (and it's not always made). We are now in the odd situation where we refer to a small (and shrinking) set of CPUs that support HTT as HTT and we refer to a much larger (and growing) set of CPUs that support HTT as something else. This means that if a random user wants to see if FreeBSD supports HTT they won't see that in dmesg on a modern CPU without having some sort of magic decoder ring. 
I also think the set of folks who actually care about the slight differences in HTT is quite small and not worth the cost of looking as if we don't support HTT on the majority of processors that support it. Note that BIOS manufacturers have gotten away with labeling the two things the same without people bringing out pitchforks, so I think having FreeBSD be consistent with that (and other OS's) is less confusing to users, not more. -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Fri Mar 6 23:45:14 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E78A1EED; Fri, 6 Mar 2015 23:45:14 +0000 (UTC) Received: from mail-ie0-x236.google.com (mail-ie0-x236.google.com [IPv6:2607:f8b0:4001:c03::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id AE55B7B9; Fri, 6 Mar 2015 23:45:14 +0000 (UTC) Received: by iecrl12 with SMTP id rl12so7027715iec.5; Fri, 06 Mar 2015 15:45:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=BtJI89jCKDynm46clhhR7KnNCm95Y+U9kzqhA4uLeUI=; b=bZQFcRA4416fhgc4Whk+JuPicpL+TQyHqiTOIHkcxY/D2KtQ3pKtnqqx6q1/AqFKC1 3WZ4p7cIjVDLaCksg2WfLYGoo8QMS4xbs8UTWGeb0mCdfOBaP2xrlRWqNOA9d+8ikdlc 60TgrDTXgCCh3K18mXbSKsptpz8PJhkhUDaTfwlqiFgeNZOa24UK14nCMbt2TMS9oN55 735DHIAqLNbVqYzaj8kYQIp/dzHKB419ndtNuqFQlgq1A18hVWLwrhfDjY6iovHLQ0zl 3l2fqK0VXCngphd7Ur0nt1drk41u6dE/z2WnIsWqPbDKA/uNFAAgQWuoKxRBHdOVgc0v 8xDA== MIME-Version: 1.0 X-Received: by 10.42.109.12 with SMTP id j12mr13100643icp.22.1425685513988; Fri, 06 Mar 2015 15:45:13 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 
10.36.17.66 with HTTP; Fri, 6 Mar 2015 15:45:13 -0800 (PST) In-Reply-To: <2092193.qt8NhEKglv@ralph.baldwin.cx> References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> <2092193.qt8NhEKglv@ralph.baldwin.cx> Date: Fri, 6 Mar 2015 15:45:13 -0800 X-Google-Sender-Auth: CiJVVoVMuvwnQq8H5f0qfTlqkaE Message-ID: Subject: Re: RFC: Simplfying hyperthreading distinctions From: Adrian Chadd To: John Baldwin Content-Type: text/plain; charset=UTF-8 Cc: Andriy Gapon , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 06 Mar 2015 23:45:15 -0000 Hi! Hm, I looked at this: https://en.wikipedia.org/wiki/Bonnell_%28microarchitecture%29 .. and thought it was old-school HTT. If it's not old-school HTT then cool. -adrian From owner-freebsd-arch@FreeBSD.ORG Sat Mar 7 02:14:09 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BA5F46BF; Sat, 7 Mar 2015 02:14:09 +0000 (UTC) Received: from c.mail.sonic.net (c.mail.sonic.net [64.142.111.80]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 856787A6; Sat, 7 Mar 2015 02:14:09 +0000 (UTC) Received: from aurora.physics.berkeley.edu (aurora.Physics.Berkeley.EDU [128.32.117.67]) (authenticated bits=0) by c.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id t272E1M3030345 (version=TLSv1.2 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Fri, 6 Mar 2015 18:14:02 -0800 Message-ID: <54FA5EE9.4090305@freebsd.org> Date: Fri, 06 Mar 2015 18:14:01 -0800 From: Nathan Whitehorn User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) 
Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: John Baldwin , freebsd-arch@freebsd.org Subject: Re: RFC: Simplfying hyperthreading distinctions References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> In-Reply-To: <1640664.8z9mx3EOQs@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Sonic-CAuth: UmFuZG9tSVZCe8u/SDzcYNWz1twsPF9nWYp8mFS4hlveMjaYdazIu/VoLZatzaVCQhm6PJtvKoLhRdysuwsD680Iggr8eYg9+wWNraEnL2g= X-Sonic-ID: C;0u0rn2/E5BGBrO8Jj30JFw== M;sHVcn2/E5BGBrO8Jj30JFw== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd Cc: 'Andriy Gapon' X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 07 Mar 2015 02:14:09 -0000 On 03/06/15 12:44, John Baldwin wrote: > Currently we go out of our way a bit to distinguish Pentium4-era > hyperthreading from more recent ("modern") hyperthreading. I suspect that > this distinction probably results in confusion more than anything else. > Intel's documentation does not make near as broad a distinction as far as I > can tell. Both types of SMT are called hyperthreading in the SDM for example. > However, we have the astonishing behavior that > 'machdep.hyperthreading_allowed' only affects "old" hyperthreads, but not > "new" ones. We also try to be overly cute in our dmesg output by using HTT > for "old" hyperthreading, and SMT for "new" hyperthreading. I propose the > following changes to simplify things a bit: > > 1) Call both "old" and "new" hyperthreading HTT in dmesg. > > 2) Change machdep.hyperthreading_allowed to apply to both new and old HTT. > However, doing this means a POLA violation in that we would now disable > modern HTT by default. 
Balanced against re-enabling "old" HTT by default > on an increasingly-shrinking pool of old hardware, I think the better > approach here would be to also change the default to allow HTT. > > 3) Possibly add a different knob (or change the behavior of > machdep.hyperthreading_allowed) to still bring up hyperthreads, but leave > them out of the default cpuset (set 1). This would allow those threads > to be re-enabled dynamically at runtime by adjusting the mask on set 1. > The original htt settings back when 'hyperthreading_allowed' was > introduced actually permitted this by adjusting 'machdep.hlt_cpus' at > runtime. > > What do people think? I'm fine with whatever naming, but if we're making new sysctls, especially for the cpuset case, is there a reason to hide the behavior under machdep? We support at least three non-x86 CPUs with SMT (POWER8, Cell, and POWER5) and the relevant scheduling logic should be MI. At least POWER8 supports 8 threads per core, so you might also want more granularity than just "on" or "off". 
-Nathan From owner-freebsd-arch@FreeBSD.ORG Sat Mar 7 06:23:40 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 46EDEDEE for ; Sat, 7 Mar 2015 06:23:40 +0000 (UTC) Received: from mail-pa0-f52.google.com (mail-pa0-f52.google.com [209.85.220.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 13BBA1BC for ; Sat, 7 Mar 2015 06:23:39 +0000 (UTC) Received: by pablj1 with SMTP id lj1so51082605pab.8 for ; Fri, 06 Mar 2015 22:23:38 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:subject:mime-version:content-type:from :in-reply-to:date:cc:message-id:references:to; bh=x+EazqpqaB2gacSfJWWl9yoLLmhfdNpO9E0t/tmzyz8=; b=Yby3aNYZg7yIJI/NQBeEJ+cCCh7yOCiOKOT9RZuiFkorspMoTyFxXVjfLxMMF3Fzi+ ZnFTlI42dFu7aZTEkpTgXziB7HoLNqkWSvlYzWE9Zo94OudIKKTtFQLlHxj09YAatdnZ LrO1gW3j7v5Qox7wIUJZMRip+rSRqHKwrxW8T2vgFd7U+/CV2RsTKaMx+FvByXShH3a6 Cd6EOhJMwuS1evsznVVwxEX7eiK7ImCZRl4KoICy0a7kksYw3GjkgcWffLvG9AkIqTfB lzLp4cJWbBkVQmBhE9rno0jW4RN5WtSEhlsZYGUn1tjfIkxW0Re+PVCkkWc5yaK43tcw BHRA== X-Gm-Message-State: ALoCoQkD2hyLr20WVydyvTkx2TmOp+bdZGjhIkFS26rRb/t/+e8gdpsV8JVXV0X13stsmNhKmxrq X-Received: by 10.66.222.7 with SMTP id qi7mr32690459pac.15.1425709418345; Fri, 06 Mar 2015 22:23:38 -0800 (PST) Received: from [10.64.27.202] ([69.53.236.236]) by mx.google.com with ESMTPSA id ge7sm11273625pbc.16.2015.03.06.22.23.36 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 06 Mar 2015 22:23:37 -0800 (PST) Sender: Warner Losh Subject: Re: RFC: Simplfying hyperthreading distinctions Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) Content-Type: multipart/signed; 
boundary="Apple-Mail=_83014606-E21B-4976-A8FC-F6259E09C50F"; protocol="application/pgp-signature"; micalg=pgp-sha512 X-Pgp-Agent: GPGMail 2.5b5 From: Warner Losh In-Reply-To: <54FA5EE9.4090305@freebsd.org> Date: Fri, 6 Mar 2015 23:23:35 -0700 Message-Id: <6E129CCC-C4CD-45A4-9945-3384A20B7A31@bsdimp.com> References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> <54FA5EE9.4090305@freebsd.org> To: Nathan Whitehorn X-Mailer: Apple Mail (2.2070.6) Cc: Andriy Gapon , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 07 Mar 2015 06:23:40 -0000 --Apple-Mail=_83014606-E21B-4976-A8FC-F6259E09C50F Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=utf-8 > On Mar 6, 2015, at 7:14 PM, Nathan Whitehorn wrote: > > > On 03/06/15 12:44, John Baldwin wrote: >> Currently we go out of our way a bit to distinguish Pentium4-era >> hyperthreading from more recent ("modern") hyperthreading. I suspect that >> this distinction probably results in confusion more than anything else. >> Intel's documentation does not make near as broad a distinction as far as I >> can tell. Both types of SMT are called hyperthreading in the SDM for example. >> However, we have the astonishing behavior that >> 'machdep.hyperthreading_allowed' only affects "old" hyperthreads, but not >> "new" ones. We also try to be overly cute in our dmesg output by using HTT >> for "old" hyperthreading, and SMT for "new" hyperthreading. I propose the >> following changes to simplify things a bit: >> >> 1) Call both "old" and "new" hyperthreading HTT in dmesg. >> >> 2) Change machdep.hyperthreading_allowed to apply to both new and old HTT. >> However, doing this means a POLA violation in that we would now disable >> modern HTT by default. 
Balanced against re-enabling "old" HTT by default >> on an increasingly-shrinking pool of old hardware, I think the better >> approach here would be to also change the default to allow HTT. >> >> 3) Possibly add a different knob (or change the behavior of >> machdep.hyperthreading_allowed) to still bring up hyperthreads, but leave >> them out of the default cpuset (set 1). This would allow those threads >> to be re-enabled dynamically at runtime by adjusting the mask on set 1. >> The original htt settings back when 'hyperthreading_allowed' was >> introduced actually permitted this by adjusting 'machdep.hlt_cpus' at >> runtime. >> >> What do people think? > > I'm fine with whatever naming, but if we're making new sysctls, especially for the cpuset case, is there a reason to hide the behavior under machdep? We support at least three non-x86 CPUs with SMT (POWER8, Cell, and POWER5) and the relevant scheduling logic should be MI. At least POWER8 supports 8 threads per core, so you might also want more granularity than just "on" or "off". 
MIPS has xlr/xlp support as well, which has threads… Warner --Apple-Mail=_83014606-E21B-4976-A8FC-F6259E09C50F Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQIcBAEBCgAGBQJU+plnAAoJEGwc0Sh9sBEAawYQAL5g/isuGc8rbJMuWNj2YV8Z ct1YjZPshrrh9ZSV6ApPbeROi07Ki7d3z1SzhJ2njUBaZHj/gCNHUqEIZnW3FSll W0nx25kKNUs3opmW5CTXz0HkNgzvbXjz2zbBH7HF4fJuPlx4UF79pPSl0a+00lzO DoTypj+qhEtTGGAmnJcU+3XDojXTpLOZwIAgIJLworAuAog6SEa7Uj5pUvESDWPz ODCu04eNndTkwX/X/PkAXhunYrim24zJjjh6s+aFsyVVbQ7YdiAcbWzDNy+NBWbD 6CkjD0NkaCTlzyPVGC1Ha/kRVi+o4josZBPqpuiHd/1Z6zZc23R2j0NtdMxFa4Xq 0K6/SCmZiouDUy+kzh8xuA757ve94Ci5l/i15OEl+tDyDMiqnK/hZTnlrD6gBLEg A058xjnjamyw3lOE60rm6up3ox8JTMZj9dxkZ5mXj/WCIs9huISPuWw7MM9qtK4Z Kh3gfwVWBEl2BjTJLlpCOGEBkiqB21j9pyGjU6hfUMmlv6BPG8wAMWg/xEKyKfNg I8lLMw1MRP1EyqOBcwYXbg7nJk41akVkJ+gUuv2mX8NGBX7fgR8FQtZBZMwBWc7n lS070L6Zhew1/9xUGf9Y5XThrrqO8L8vd+onX4vAdH5GJi9ppfcBcijJPNGYBAaf dCENqrYpUNBHhE2me4qE =uUW+ -----END PGP SIGNATURE----- --Apple-Mail=_83014606-E21B-4976-A8FC-F6259E09C50F-- From owner-freebsd-arch@FreeBSD.ORG Sat Mar 7 09:43:40 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8290A8E3 for ; Sat, 7 Mar 2015 09:43:40 +0000 (UTC) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 531CF773 for ; Sat, 7 Mar 2015 09:43:40 +0000 (UTC) Received: from Julian-MBP3.local (50-196-156-133-static.hfc.comcastbusiness.net 
[50.196.156.133]) (authenticated bits=0) by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id t279hWwo031271 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO) for ; Sat, 7 Mar 2015 01:43:33 -0800 (PST) (envelope-from julian@freebsd.org) Message-ID: <54FAC83F.7020008@freebsd.org> Date: Sat, 07 Mar 2015 01:43:27 -0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-arch@freebsd.org Subject: Re: RFC: Simplfying hyperthreading distinctions References: <1640664.8z9mx3EOQs@ralph.baldwin.cx> <2092193.qt8NhEKglv@ralph.baldwin.cx> In-Reply-To: <2092193.qt8NhEKglv@ralph.baldwin.cx> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 07 Mar 2015 09:43:40 -0000 On 3/6/15 3:41 PM, John Baldwin wrote: > We are now in the odd situation where we refer to a small (and > shrinking) set of CPUs that support HTT as HTT and we refer to a > much larger (and growing) set of CPUs that support HTT as something > else. This means that if a random user wants to see if FreeBSD > supports HTT they won't see that in dmesg on a modern CPU without > having some sort of magic decoder ring. I like the HTT (SMT variant) idea. More information is better.