From: "Patrick M. Hausen"
To: freebsd-stable
Cc: mops@punkt.de
Subject: Spectre/Meltdown mitigation in 11.1-p10 bogging down zfs send/receive?
Date: Mon, 14 May 2018 17:35:21 +0200
Message-Id: <39DC78FE-D56E-4E7F-8F86-28C0ACAD761F@punkt.de>

Hey guys,

as some might know we run our hosting products in ZFS and iocage based jails.
The backup concept relies on recurring local snapshots and a copy of these
on one (more planned) central storage server.

The storage server does essentially nothing but run zfs receive for each
dataset on each hosting node.
12x spinning rust and 128G of RAM. Lots of space ;-)

In preparation for rolling out (among other patches) the Meltdown and Spectre
mitigation fixes and microcode updates, we already ran benchmarks that measured
our primary applications - the TYPO3 and Neos CMS. We did not see much of
an impact.

We updated that central storage system last Friday. Today we provisioned
a new server, meaning new hosting hardware and a couple of jails on that
one. The new system already has all the latest patches.

Part of the provisioning process is creating an initial snapshot of every
dataset and sending an initial copy to the storage server, so we can send
nightly incrementals.

That step took surprisingly long for the first of the new jails: at least
an order of magnitude longer than usual. I cannot provide exact measurements
yet, because this is all part of a rather complex Ansible task and it really
caught us by surprise.

We already received a couple of warnings from the Icinga service monitoring
the nightly replication runs - we still need to investigate this. We suspect
they ran slower than usual, too.

To narrow down the cause of the problem we tried this, in chronological order:

1. storage server (receiving end):

    Disable microcode update and hw.ibrs_active       still slow
    Disable vm.pmap.pti                               still slow

2. new jail host (sending end):

    Disable both                                      fast
    Re-enable microcode update and hw.ibrs_active     still fast
    Re-enable vm.pmap.pti                             still fast

Reboot as necessary, of course. And we double checked the current value of the
respective sysctls before running the tests.

That last step is *quite* unexpected, because it just does not make sense to me.

Does anybody know what impact the fixes, both PTI and IBRS, are *expected*
to have on bulk zfs send/receive operations from/to two different hosts?

Possibly we are on the wrong track altogether. We suspected the CPU fixes
because of the general "what did you change last" approach ...
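To get the exact measurements that are still missing, something like the following could be used outside of the Ansible run. This is only a minimal sketch in Python, not what our provisioning does: the helper names (`measure_throughput`, `format_rate`, `time_command`) are made up for illustration, and it only times how fast a command produces its output stream.

```python
import io
import subprocess
import time


def measure_throughput(stream: io.BufferedIOBase, chunk_size: int = 1 << 20):
    """Read a binary stream to EOF and return (total_bytes, elapsed_seconds)."""
    total = 0
    start = time.monotonic()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty read means EOF
            break
        total += len(chunk)
    return total, time.monotonic() - start


def format_rate(total_bytes, seconds):
    """Format a throughput as MiB/s; returns "n/a" for a zero-length interval."""
    if seconds <= 0:
        return "n/a"
    return f"{total_bytes / seconds / (1024 * 1024):.1f} MiB/s"


def time_command(argv):
    """Run argv, discard its stdout, and return (total_bytes, elapsed_seconds).

    Hypothetical helper: the data is read and thrown away, so only the
    producing side of the pipeline is measured.
    """
    with subprocess.Popen(argv, stdout=subprocess.PIPE) as proc:
        total, elapsed = measure_throughput(proc.stdout)
    # Leaving the "with" block waits for the process, so returncode is set.
    if proc.returncode != 0:
        raise RuntimeError(f"{argv[0]} exited with status {proc.returncode}")
    return total, elapsed
```

On the sending host this could be called as `time_command(["zfs", "send", "tank/jails/example@initial"])`, where the dataset name is a placeholder. Timing the send side alone, with the stream discarded, would show whether the sender is slow independently of the network and the receiving storage server.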

Thank you very much
Patrick
-- 
punkt.de GmbH           Internet - Dienstleistungen - Beratung
Kaiserallee 13a         Tel.: 0721 9109-0 Fax: -100
76133 Karlsruhe         info@punkt.de   http://punkt.de
AG Mannheim 108285      Gf: Juergen Egeling