From owner-cvs-all@FreeBSD.ORG Wed Oct 17 15:05:03 2007
Message-ID: <47162487.2090408@samsco.org>
Date: Wed, 17 Oct 2007 09:04:39 -0600
From: Scott Long
To: Alexander Leidinger
Cc: src-committers@freebsd.org, John Baldwin, cvs-src@freebsd.org,
 cvs-all@freebsd.org, "Constantine A. Murenin", Poul-Henning Kamp,
 Wilko Bulte
Subject: Re: cvs commit: src/etc Makefile sensorsd.conf src/etc/defaults
 rc.conf src/etc/rc.d Makefile sensorsd src/lib/libc/gen sysctl.3
 src/sbin/sysctl sysctl.8 sysctl.c src/share/man/man5 rc.conf.5
 src/share/man/man9 Makefile sensor_attach.9 src/sys/conf f
References: <200710161702.00008.jhb@freebsd.org>
 <471537CA.9080807@FreeBSD.org> <200710170907.07832.jhb@freebsd.org>
 <20071017163102.jkl3sdzww8wkscw0@webmail.leidinger.net>
In-Reply-To: <20071017163102.jkl3sdzww8wkscw0@webmail.leidinger.net>
List-Id: CVS commit messages for the entire tree

Alexander Leidinger wrote:
> Quoting John Baldwin (from Wed, 17 Oct 2007 09:07:06 -0400):
>
>> On Tuesday 16 October 2007 06:14:34 pm Constantine A. Murenin wrote:
>>> On 16/10/2007 17:01, John Baldwin wrote:
>>>
>>> > Basically, by having so little data in hw.sensors if I had to write
>>> > a RAID monitoring daemon I would just not use hw.sensors since it's
>>> > easier for me to figure out the simple status myself based on the
>>> > other state I already have to track (unless you write an
>>> > event-driven daemon based on messages posted by the firmware in
>>> > which case again you wouldn't use hw.sensors for that either).
>>>
>>> There is no other daemon that you'd need, you'd simply use sensorsd for
>>> this. You could write a script that would be executed by sensorsd if a
>>> certain logical disc drive sensor changes state, and then this script
>>> would call the bio framework and give you additional details on why the
>>> state was changed.
>>
>> That's actually not quite good enough as, for example, I want to keep
>> yelling about a busted volume on a periodic basis until it's fixed.
>> Also, having a volume change state doesn't tell me if a drive was pulled.
>> On at least one RAID controller firmware I am familiar with, the only
>> way you can figure this out is to keep track of which drives are
>> currently present with a generation count and use that to determine
>> when a drive goes away. Even my monitoring daemon for ata-raid has to
>> do this since the ata(4) driver just detaches and removes a drive when
>> it fails and you have no way to figure out which drive died as the
>> kernel thinks that drive no longer exists.
>
> Note, talking about interaction with bio or similar is not productive
> ATM. On Sunday I had a discussion with scottl and he identified some
> things with bio which don't make it a good choice for FreeBSD.
> Unfortunately I haven't had time to take it off the ideas list so far.
> Scott also agreed to come up with a description for a similar framework
> that is usable with our RAID drivers.
>

John has the most recent experience of anyone with writing RAID
monitoring and control tools, and he's brought up some very good points
about some of the specific technical challenges. A simple sysctl tree
like hw.sensors is stateless, and that doesn't cut it for an environment
where devices can come and go. More intelligence and state are needed,
and for that you need an event component to your framework.

I still have no strong opinion on whether FreeBSD-specific APIs like
sysctl and devd are the right mechanisms for this. Maybe when it comes
to the storage side of monitoring, consolidating all information under
GEOM via /dev/geom.ctl is the right path, or maybe it isn't. But
ultimately, what works for lmsensors or CPU throttling or arbitrary
1-wire or 3-wire buses might not work for a more complex system like
storage.

Scott
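
For readers following the thread, the generation-count bookkeeping John
describes above can be sketched roughly as follows. This is only an
illustration under stated assumptions: the DriveTracker class, its poll()
method, and the drive names are hypothetical and do not correspond to any
FreeBSD, ata(4), or RAID firmware interface. It assumes the caller can
obtain, on each poll, the set of drive identifiers currently visible.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DriveTracker:
    """Hypothetical helper: remembers which drives each poll has seen."""

    generation: int = 0
    # Maps drive id -> generation number of the last poll that saw it.
    last_seen: dict[str, int] = field(default_factory=dict)

    def poll(self, present: set[str]) -> tuple[set[str], set[str]]:
        """Record one poll and return (arrived, departed) drive ids."""
        self.generation += 1
        arrived = {d for d in present if d not in self.last_seen}
        # Stamp every drive that is visible now with the new generation.
        for d in present:
            self.last_seen[d] = self.generation
        # Anything still carrying an older stamp went away since last poll.
        departed = {d for d, g in self.last_seen.items()
                    if g != self.generation}
        for d in departed:
            del self.last_seen[d]
        return arrived, departed


# Usage: the drive names here are made up for the example.
tracker = DriveTracker()
tracker.poll({"ad4", "ad6"})               # both drives arrive
arrived, departed = tracker.poll({"ad4"})  # departed == {"ad6"}
```

The point of keeping this state in the monitor is the one made in the
thread: once the kernel has detached a drive, a stateless view only shows
what exists now, so the daemon has to remember what existed before in
order to name the drive that disappeared.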