Subject: Re: 40 cores, 48 NVMe disks, feel free to take over
From: Dennis Glatting <freebsd@pki2.com>
To: Christoph Pilka, freebsd-questions@freebsd.org
Date: Fri, 09 Sep 2016 14:14:50 -0700

On Fri, 2016-09-09 at 22:51 +0200, Christoph Pilka wrote:
> Hi,
> 
> we've just been granted a short-term loan of a server from Supermicro
> with 40 physical cores (plus HTT) and 48 NVMe drives. After a bit of
> mucking about, we managed to get 11-RC running. A couple of things
> are preventing the system from being terribly useful:
> 
> - We have to use hw.nvme.force_intx=1 for the server to boot.
>   If we don't, it panics around the 9th NVMe drive with "panic:
>   couldn't find an APIC vector for IRQ...". Increasing
>   hw.nvme.min_cpus_per_ioq gets it further, but it still panics later
>   in the NVMe enumeration/init. hw.nvme.per_cpu_io_queues=0 makes it
>   panic later still (I suspect during ixl init - the box has 4x 10Gb
>   Ethernet ports).
> 
> - zfskern seems to be the limiting factor when doing ~40 parallel "dd
>   if=/dev/zero of= bs=1m" on a zpool stripe of all 48 drives. Each
>   drive shows ~30% utilization (gstat); I can do ~14 GB/sec write and
>   ~16 GB/sec read.
> 
> - Direct writing to the NVMe devices (dd from /dev/zero) gives about
>   550 MB/sec and ~91% utilization per device.
> 
> Obviously, the first item is the most troublesome. The rest is based
> on entirely synthetic testing and may have little or no actual impact
> on the server's usability or fitness for our purposes.
> 
> There is nothing but sshd running on the server, and if anyone wants
> to play around you'll have IPMI access (remote KVM, virtual media,
> power) and root.
> 
> Any takers?

I'm curious to know which board you have. I have had FreeBSD,
including the 11 release candidates, running on SM boards without any
trouble, although some of mine are older boards. I haven't looked at
ZFS performance because my systems typically see low disk use. That
said, the IOPS on my virtual server (also an SM) are lousy, but so are
its disks.

I recently found that the Intel RAID chip on one SM isn't real RAID;
rather, it's pseudo RAID, though for a few dollars more it could have
been real RAID. :( It was killing IOPS, so I popped in an old LSI
board, rerouted the cables from the Intel chip to it, and the server
is now a happy camper. I then replaced 11-RC with Ubuntu 16.10 for the
sake of a specific application, but I am also running RAIDz2 under
Ubuntu on three trash 2.5T disks (I didn't do this for any reason
other than fun).

root@Tuck3r:/opt/bin# zpool status
  pool: opt
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        opt         ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0

> Wbr
> Christoph Pilka
> Modirum MDpay
> 
> Sent from my iPhone
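
For reference, the three knobs discussed above are loader tunables and
belong in /boot/loader.conf. A minimal sketch of the combinations
tried (the values are illustrative, not recommendations, and normally
only one of the three would be set):

    # /boot/loader.conf -- NVMe interrupt workarounds discussed above

    # Force legacy INTx interrupts instead of MSI-X; works around the
    # "couldn't find an APIC vector for IRQ" panic, at a performance
    # cost.
    hw.nvme.force_intx="1"

    # Alternative: require more CPUs per I/O queue pair, shrinking the
    # number of MSI-X vectors the 48 controllers try to allocate
    # (the value 4 is an assumption for illustration).
    #hw.nvme.min_cpus_per_ioq="4"

    # Alternative: one I/O queue pair per controller instead of per
    # CPU.
    #hw.nvme.per_cpu_io_queues="0"

The panic around the 9th drive is consistent with plain interrupt
vector exhaustion: with per-CPU queue pairs, 48 controllers on 80
logical CPUs request far more MSI-X vectors than the APICs can hand
out, and each tunable above reduces that demand.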
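
The parallel-writer test is easy to reproduce. A sketch in sh,
assuming the stripe pool is mounted at /tank (the mountpoint and file
names are hypothetical; the dd output target in the quoted message was
lost in transit):

    #!/bin/sh
    # Start 40 parallel sequential writers, one output file per
    # process, and wait for all of them; watch per-drive utilization
    # with gstat. The pool itself would be a plain stripe of all 48
    # drives, e.g.
    #   zpool create tank nvd0 nvd1 ... nvd47
    # (FreeBSD nvd device names assumed).
    mnt=/tank    # hypothetical pool mountpoint
    i=0
    while [ "$i" -lt 40 ]; do
        dd if=/dev/zero of="$mnt/ddtest.$i" bs=1m count=10240 &
        i=$((i + 1))
    done
    wait

If zfskern rather than the hardware is the bottleneck, this is exactly
the signature you would see: each drive sitting near ~30% busy in
gstat while aggregate throughput flattens out well below what 48
drives can deliver.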
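
For completeness, the three-disk pool in the status output above could
have been created with something like the following (device names as
shown there; two parity disks out of three leaves a single data disk,
which is indeed only sensible for fun):

    # Create the raidz2 pool "opt" from the three disks, then verify.
    zpool create opt raidz2 sda sdb sdc
    zpool status opt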