From: Alexander Leidinger
To: Jeremy Chadwick
Cc: Glen Barber, fs@FreeBSD.org
Subject: Re: [RFC] [patch] periodic status-zfs: list pools in daily emails
Date: Wed, 29 Jun 2011 15:15:30 +0200
Message-ID: <20110629151530.13154p1oc899fhwy@webmail.leidinger.net>
In-Reply-To: <20110629111915.GA75648@icarus.home.lan>
References: <20110628203228.GA4957@onyx.glenbarber.us> <20110629104633.26824evikzh8tgtl@webmail.leidinger.net> <4E0B006C.8050000@FreeBSD.org> <20110629111915.GA75648@icarus.home.lan>

Quoting Jeremy Chadwick (from Wed, 29 Jun 2011 04:19:15 -0700):

> At my workplace we use a heavily modified version of Netsaint, with bits
> and pieces Nagios-like created. I happened to write the perl code used
> to monitor our production Solaris systems (~2000+ servers) for ZFS pool
> status. It parses "zpool status -x" output, monitoring read, write, and
> checksum errors per pool, vdev, and device, in addition to general pool
> status. I tested too many conditions, not to mention had to deal with
> parsing pains as a result of ZFS code changes, plus supporting
> completely different revisions of Solaris 10 in production.
> And before someone asks: no, I cannot provide the source (employee
> agreements, LCA, etc...). I did have to dig through ZFS source code to
> figure out a bunch of necessary bits too, so don't be surprised if you
> have to too.
>
> My recommendation: just look for pools which are in any state other than
> ONLINE (don't try to be smart with an OR regex looking for all the
> combos; it doesn't scale when ZFS changes), and you should also handle
> situations where a device is currently undergoing manual or automatic
> device replacement (specifically regex '^[\t\s]+replacing\s+DEGRADED'),
> which will be important to people who keep spares in pools. This might
> be difficult with just standard BSD sh, but BSD awk should be able to
> handle this.

Thanks for your suggestions, but the script is intentionally dumb: it
runs "zpool status" and looks for the line "all pools are healthy". If
that line is missing, the output is marked as important (this matters if
you have configured periodic.conf to skip unimportant output). Everything
else is up to the person who reads the daily run output.

The patch under discussion just adds the output of "zpool list" to the
output of zpool status (if activated).

Bye,
Alexander.

--
"I'll reason with him." -- Vito Corleone, "Chapter 14", page 200
http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org       netchild @ FreeBSD.org  : PGP ID = 72077137
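
For illustration, the check described in the reply (look for the "all pools are healthy" line and flag anything else as important) could be sketched in plain sh roughly like this. This is not the actual periodic script: the function name zfs_status_rc is hypothetical, and the return values 0 and 3 are assumptions standing in for "nothing notable" and "important". The healthy line is the one printed by "zpool status -x".

```shell
#!/bin/sh
# Sketch only, not the real periodic script.
# zfs_status_rc is a hypothetical helper name; the return codes 0 and 3
# are assumptions for "nothing notable" and "important".

# Reads pool status text on stdin.
# Returns 0 if the healthy line is present, 3 otherwise.
zfs_status_rc() {
    if grep -q 'all pools are healthy'; then
        return 0
    fi
    return 3
}

# Possible use on a machine with ZFS (left commented out on purpose):
# out=$(zpool status -x)
# echo "$out"
# printf '%s\n' "$out" | zfs_status_rc
# rc=$?
```

The function does no parsing beyond the single grep, which keeps it independent of changes in the rest of the zpool output; interpreting anything other than the healthy line is left to whoever reads the daily mail.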