From: Jan Bramkamp
To: freebsd-stable@freebsd.org
Subject: Re: Help! two machines ran out of swap and corrupted their zpools!
Date: Mon, 21 Nov 2016 18:50:43 +0100

On 21/11/2016 18:47, Pete French wrote:
> So, I am off sick and my colleagues decided to load test our set of five
> servers excessively. All ran out of swap. So far so irritating, but what has
> happened is that two of them now will not boot, as it appears the ZFS pool
> they are booting from has become corrupted.
>
> One starts to boot, then crashes importing the root pool.
> The other doesn't even get that far, with gptzfsboot saying it can't find the
> pool to boot from!
>
> Now I can recover these, but I am a bit worried that it got like this at
> all, as I haven't ever seen ZFS corrupt a pool like this. Anyone got any
> insights, or suggestions as to how to stop it happening again?
>
> We are swapping to a separate partition, not to the pool, by the way.

How much trust do you put in your hardware? Have you ever put the
hardware under full load for extended periods before, e.g. by running
poudriere to build pkg repos?

-- 
Jan Bramkamp