From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 13:21:24 2012
From: Mark Felder <feld@feld.me>
To: Dustin Wenz, Olivier Smedts, Steven Hartland
Cc: freebsd-fs@freebsd.org
Subject: Re: Imposing ZFS latency limits
Date: Mon, 22 Oct 2012 08:21:07 -0500
In-Reply-To: <089898A4493042448C934643FD5C3887@multiplay.co.uk>

On Tue, 16 Oct 2012 10:46:00 -0500, Steven Hartland wrote:

> Interesting, what metrics were you using that made it easy to detect?
> It would be nice to know your process there, Mark.

One reason is that our virtual machine performance gets awful, and the
hypervisor alerts us to higher-than-usual load and/or disk I/O latency.
Another thing we've implemented is watching the server's logs for certain
SCSI errors; those seem to warn us before things really get bad (a sketch
of that kind of watcher is below).

It's nice knowing ZFS is doing everything within its power to read the
data off the disk, but when the rest of the raidz is fully intact, it
should be smart enough to kick out a disk that is misbehaving.
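The watcher is really just a log tail with pattern matching. Here is a
minimal sketch in Python; the log path, the error patterns, and the
mail(1) alert hook are assumptions for illustration, not our exact
production setup:

#!/usr/bin/env python
# Minimal sketch of a SCSI/CAM error watcher for FreeBSD. Assumptions:
# errors show up in /var/log/messages with strings like "CAM status"
# or "SCSI sense", and mail(1) is an acceptable alert channel. Log
# rotation is deliberately ignored to keep the example short.
import re
import subprocess
import time

LOG_PATH = "/var/log/messages"
# Patterns typical of CAM/SCSI trouble; tune these for your hardware.
ERROR_RE = re.compile(r"CAM status|SCSI sense|Unretryable error|MEDIUM ERROR")

def alert(line):
    # Placeholder notification: mail the offending log line to root.
    p = subprocess.Popen(["mail", "-s", "SCSI error detected", "root"],
                         stdin=subprocess.PIPE)
    p.communicate(line.encode())

def watch(path):
    with open(path) as f:
        f.seek(0, 2)          # start at end of file, like tail -f
        while True:
            line = f.readline()
            if not line:      # nothing new yet; poll again shortly
                time.sleep(1)
                continue
            if ERROR_RE.search(line):
                alert(line)

if __name__ == "__main__":
    watch(LOG_PATH)

When it fires and the rest of the pool is healthy, we don't wait for ZFS
to give up on the drive; something like "zpool offline tank da3" (pool
and device names hypothetical) takes the noisy disk out of the picture
until it can be replaced.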