From owner-freebsd-amd64@freebsd.org Wed Mar 22 20:50:36 2017
From: Freddie Cash
Date: Wed, 22 Mar 2017 13:50:35 -0700
Subject: Re: FreeBSD on Ryzen
To: Don Lewis
Cc: freebsd-amd64@freebsd.org
In-Reply-To: <201703222030.v2MKUJJs026400@gw.catspoiler.org>
References: <201703222030.v2MKUJJs026400@gw.catspoiler.org>
List-Id: Porting FreeBSD to the AMD64 platform

On Wed, Mar 22, 2017 at 1:30 PM, Don Lewis wrote:

> I put together a Ryzen 1700X machine over the weekend and installed the
> 12.0-CURRENT r315413 snapshot on it a couple of days ago. The RAM is
> DDR4 2400.
>
> First impression is that it's pretty zippy. Compared to my previous
> fastest machine:
>   CPU: AMD FX-8320E Eight-Core Processor (3210.84-MHz K8-class CPU)
> make -j8 buildworld using tmpfs is a bit more than 2x faster. Since the
> Ryzen has SMT, its eight cores look like 16 CPUs to FreeBSD, and I get
> almost a 2.6x speedup with -j16 as compared to my old machine.
>
> I do see that the reported total CPU time increases quite a bit at -j16
> (~19900u) as compared to -j8 (~13600u) so it is running into some
> hardware bottlenecks that are slowing down instruction execution. It
> could be the resources shared by both SMT threads that share each core,
> or it could be cache or memory bandwidth related. The Ryzen topology is
> a bit complicated.
> There are two groups of four cores, where each group
> of four cores shares half of the L3 cache, with a slowish interconnect
> bus between the groups. This probably causes some NUMA-like issues. I
> wonder if the ULE scheduler could be tweaked to handle this better.

The interconnect, aka Infinity Fabric, runs at the speed of the memory
controller, so if you put faster RAM into the system, the fabric runs
faster, and inter-CCX latency should drop to match.

There's 2 MB of L3 cache shared between every two cores, but any core can
access data in the L3 cache of any other core. Latency for those requests
depends on whether it's within the same CCX (4-core cluster), or in the
other CCX (going across the Infinity Fabric).

There are a lot of finicky timing issues with L3 cache accesses, and with
thread migration (in-CCX vs across the fabric).

This is a whole other level of NUMA fun. And it'll get even more fun when
the server version ships where you have 4 CCXes in a single CPU, with
multiple sockets on a motherboard, and Infinity Fabric joining everything
together. :)

I feel sorry for the scheduler devs who get to figure all this out. :D
Supposedly, the Linux folks have this mostly figured out in kernel 4.10,
but I'll wait for the benchmarks to believe it. There's a bunch up on
Phoronix ... but, well, it's Phoronix. :)

-- 
Freddie Cash
fjwcash@gmail.com
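
As a rough check on the -j8 versus -j16 figures quoted above, the sketch below compares how much the total user CPU time grew with how much the build actually sped up. The numbers are the approximate ones from the quoted message; the function and constant names are made up for illustration, and the "2.0" and "2.6" speedups are rounded from "a bit more than 2x" and "almost 2.6x", so treat the result as a ballpark only.

```python
# Figures quoted in the thread (all approximate).
USER_SECONDS_J8 = 13600.0   # total user CPU time reported for make -j8
USER_SECONDS_J16 = 19900.0  # total user CPU time reported for make -j16
SPEEDUP_J8 = 2.0            # "a bit more than 2x" vs. the FX-8320E (rounded)
SPEEDUP_J16 = 2.6           # "almost 2.6x" vs. the FX-8320E (rounded)


def cpu_time_inflation(before: float, after: float) -> float:
    """Ratio of total CPU time spent on the same work, after vs. before."""
    return after / before


def relative_gain(speedup_before: float, speedup_after: float) -> float:
    """Wall-clock gain between two runs measured against the same baseline."""
    return speedup_after / speedup_before


inflation = cpu_time_inflation(USER_SECONDS_J8, USER_SECONDS_J16)
gain = relative_gain(SPEEDUP_J8, SPEEDUP_J16)

print(f"CPU time inflation -j8 -> -j16: {inflation:.2f}x")
print(f"Wall-clock gain    -j8 -> -j16: {gain:.2f}x")
```

With these inputs, doubling the job count costs roughly 1.46x the CPU time for roughly a 1.3x faster build, which fits the suspicion that the two SMT threads on each core (or the cache and memory paths) are contending for shared resources.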