From: Konstantin Belousov
To: Tobias Oberstein
Cc: freebsd-hackers@freebsd.org, Michael Fuckner, Jim Harris, Alan Somers
Date: Thu, 2 Apr 2015 13:15:20 +0300
Subject: Re: NVMe performance 4x slower than expected
Message-ID: <20150402101520.GC2379@kib.kiev.ua>
In-Reply-To: <551C702D.2070009@gmail.com>
List-Id: Technical Discussions relating to FreeBSD

On Thu, Apr 02, 2015 at 12:24:45AM +0200, Tobias Oberstein wrote:
> On 01.04.2015 at 23:23, Konstantin Belousov wrote:
> > On Wed, Apr 01, 2015 at 10:52:18PM +0200, Tobias Oberstein wrote:
> >>> > FreeBSD 11 Current with patches (DMAR and ZFS patches, otherwise the box
> >>> > doesn't boot at all .. because of 3TB RAM and the amount of periphery).
> >>>
> >>> Do you still have WITNESS and INVARIANTS turned on in your kernel
> >>> config? They're turned on by default for Current, but they do have
> >>> some performance impact. To turn them off, just build a
> >>> GENERIC-NODEBUG kernel.
> >>
> >> WITNESS is off, INVARIANTS is still on.
> > INVARIANTS are costly.
>
> ah, ok. will rebuild without this option.
>
> > I have the following patch for a long time, it allowed to increase pps
> > in iperf and similar tests when DMAR is enabled. In your case it could
> > reduce the rate of the DMAR interrupts.
>
> You mean these lines from vmstat?
>
> irq257: dmar0:qi                    22312          0
> irq259: dmar1:qi                    22652          0
> irq261: dmar2:qi                261874194       6911
> irq263: dmar3:qi                   124939          3
>
> So these dmar2 interrupts come from DMAR region 2 which is used by nvd7?
Dmar unit 2. In modern machines there are one (or sometimes two)
translation units per CPU package, which handle the devices on the PCIe
buses rooted in that socket. The interrupt stats above mean that the
load on your machine is unbalanced with respect to PCIe buses: most of
the DMA transfers were performed by devices attached to the bus(es) on
the socket where DMAR 2 is located.
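
To put a number on the imbalance, here is a minimal Python sketch that
computes each unit's share of the DMAR qi interrupts. It only uses the
four vmstat lines quoted above; the function name and the parsing rules
are assumptions made for illustration, not part of any FreeBSD tool.

```python
# The vmstat lines quoted above: interrupt name, total count, rate.
VMSTAT_LINES = """\
irq257: dmar0:qi 22312 0
irq259: dmar1:qi 22652 0
irq261: dmar2:qi 261874194 6911
irq263: dmar3:qi 124939 3
"""


def dmar_interrupt_share(text):
    """Return {unit name: fraction of all DMAR qi interrupts}.

    Hypothetical helper: assumes each line has exactly four
    whitespace-separated fields, as in the lines quoted above.
    """
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: "irqN:", "dmarK:qi", total, rate
        if len(fields) != 4 or not fields[1].startswith("dmar"):
            continue
        unit = fields[1].split(":")[0]
        totals[unit] = totals.get(unit, 0) + int(fields[2])
    grand_total = sum(totals.values())
    return {unit: count / grand_total for unit, count in totals.items()}


if __name__ == "__main__":
    for unit, share in sorted(dmar_interrupt_share(VMSTAT_LINES).items()):
        print(f"{unit}: {share:.4%}")
```

With the counts above, dmar2 accounts for well over 99% of the total,
which is what "unbalanced" means here.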
>
> From dmesg:
>
> dmar0: iomem 0xc7ffc000-0xc7ffcfff on acpi0
> dmar1: iomem 0xe3ffc000-0xe3ffcfff on acpi0
> dmar2: iomem 0xfbffc000-0xfbffcfff on acpi0
> dmar3: iomem 0xabffc000-0xabffcfff on acpi0
>
> mpr0: dmar3 pci0:4:0:0 rid 400 domain 4 mgaw 48 agaw 48 re-mapped
> mpr1: dmar2 pci0:195:0:0 rid c300 domain 2 mgaw 48 agaw 48 re-mapped
>
> nvme0: dmar0 pci0:65:0:0 rid 4100 domain 0 mgaw 48 agaw 48 re-mapped
> nvme1: dmar0 pci0:67:0:0 rid 4300 domain 1 mgaw 48 agaw 48 re-mapped
> nvme2: dmar0 pci0:69:0:0 rid 4500 domain 2 mgaw 48 agaw 48 re-mapped
> nvme3: dmar1 pci0:129:0:0 rid 8100 domain 0 mgaw 48 agaw 48 re-mapped
> nvme4: dmar1 pci0:131:0:0 rid 8300 domain 1 mgaw 48 agaw 48 re-mapped
> nvme5: dmar1 pci0:132:0:0 rid 8400 domain 2 mgaw 48 agaw 48 re-mapped
> nvme6: dmar2 pci0:193:0:0 rid c100 domain 0 mgaw 48 agaw 48 re-mapped
> nvme7: dmar2 pci0:194:0:0 rid c200 domain 1 mgaw 48 agaw 48 re-mapped
>
> unknown: dmar3 pci0:0:29:0 rid e8 domain 0 mgaw 48 agaw 48 re-mapped
> unknown: dmar3 pci0:0:26:0 rid d0 domain 1 mgaw 48 agaw 48 re-mapped
>
> ix0: dmar3 pci0:1:0:0 rid 100 domain 2 mgaw 48 agaw 48 re-mapped
> ix1: dmar3 pci0:1:0:1 rid 101 domain 3 mgaw 48 agaw 48 re-mapped
>
> ix0: Using MSIX interrupts with 49 vectors
> ix1: Using MSIX interrupts with 49 vectors
>
> --
>
> So the LSI HBAs, Intel NICs and NVMe are all using DMAR, but only the
> NICs use MSI-X?
MSI-X is a method of delivering interrupt requests to CPUs. DMARs are
engines that translate the addresses of DMA requests (and also
translate interrupts).

>
> But 2 * 49 = 98, and that is smaller than the 191 which Jim mentions.
>
> And what are those "unknown" devices on dmar3?
0:26:0 and 0:29:0 are most likely USB controllers; these b/d/f numbers
are typical for the Intel PCH. "unknown" is displayed when a PCI device
does not have a driver attached, so you probably do not have the USB
drivers loaded.
The DMAR driver still has to enable the translation contexts for the
USB controllers, since the BIOS performs transfers behind the OS's back
and instructs the DMAR driver to enable the mappings.
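
As a cross-check of the b/d/f numbers, the rid values in the dmesg lines
quoted above are consistent with the usual PCI requester ID layout: bus
in bits 15:8, device in bits 7:3, function in bits 2:0. Below is a small
Python sketch of that decoding; the function name is made up for
illustration and the rid is assumed to be printed in hex without a
prefix, as in the dmesg output.

```python
def decode_rid(rid_hex):
    """Split a PCI requester ID (hex string) into (bus, device, function).

    Assumes the standard 16-bit layout: 8 bits of bus number,
    5 bits of device number, 3 bits of function number.
    """
    rid = int(rid_hex, 16)
    bus = (rid >> 8) & 0xFF
    device = (rid >> 3) & 0x1F
    function = rid & 0x7
    return bus, device, function


if __name__ == "__main__":
    # rid values taken from the dmesg lines quoted above
    for rid in ("e8", "d0", "4100", "101"):
        print(rid, decode_rid(rid))
```

For the two "unknown" devices, rid e8 decodes to 0:29:0 and rid d0 to
0:26:0, matching the pci0:0:29:0 and pci0:0:26:0 selectors printed next
to them.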