From: John Baldwin
To: "Constantine A. Murenin"
Cc: arch@freebsd.org
Date: Wed, 17 Oct 2007 12:45:36 -0400
Subject: sensors fun..
Message-Id: <200710171245.36949.jhb@freebsd.org>
List-Id: Discussion related to FreeBSD architecture

[Trying to redirect this off cvs-all & friends..]

So as I said previously, I thought about this some more offline last night / this morning and looked at the code some, and here are my thoughts:

Things I like about the current sensors code:

- I like the actual sensor object used to represent a sensor. It has a few basic things like a string for a name, a type (I would have just done a "units" for the value, but the type is basically that), and a basic alarm state. (I might have done 4 states, think green, yellow, orange, red mapped to good, warning, critical, bad. However, the 3 states in the current code are fine. 4 states might be overkill.)

- I like having the lm(4), etc. sensors report status via an object like this instead of just getting some string out of a tool like 'mbmon'. A framework should provide the data so that multiple utilities can use it.

- I'm not entirely opposed to having kernel drivers for various known sensor providers like lm(4), etc. OTOH, I could see a driver just providing ioctls to query the list of sensors and a sensor's value, similar to using requests to the IPMI BMC to request SDR data via ioctls to /dev/ipmi0 as the kernel -> userland interface.

Things I'm not a big fan of:

- Forcing all sensors to be in the kernel. A general rule in most OSes is to minimize the amount of stuff in the kernel. The kernel is easily the most, erm, "sensitive" process in the system. A segfault has much more serious consequences in the kernel (panic) than in userland (SIGSEGV). It also makes it more complicated to add "pseudo" sensors.
  For example, if I wanted a sensor for CPU usage based on the kern.cp_time sysctl, I could easily do that in userland by querying the sysctl periodically, computing the relative % busy, and setting a status based on a set of trigger points. Similarly, you could conceive of having virtual/pseudo/whatever sensors for disk space, etc. At some point you do risk duplicating SNMP traps with sensord I suppose.

  I also genuinely think it is better to keep lots of state around in userland rather than in the kernel. That is, I think the kernel should provide a way to query a sensor (RAID drivers provide ioctls to communicate with the firmware, IPMI has ioctls to allow userland to communicate with the BMC, some drivers may provide ioctl/sysctl/whatever to read sensor values directly), but I don't think we should try to store history (see the cp_time example above) or extended state (keeping track of which drives exist so you can detect a drive that goes away) in the kernel. That belongs in userland. I think it can be ok for some sensors to be completely in the kernel and just get queried directly from userland, but I don't think that is a valid design constraint to enforce on all sensors.

Things I think are dubious at best:

- That it is more secure to put code in the kernel than as root in userland. It seems odd to even have to mention this, but it should be painfully obvious that a bug in a driver has much worse consequences than a bug in a user app running as root (or even better, some dedicated non-privileged account in a group that can send ioctls to /dev/ipmi0 or other monitored devices). I guess maybe I can see a viewpoint where you hope that a driver bug always panics and doesn't just corrupt data, and so it's more likely to get a SA's attention and be less exploitable maybe?
  I can understand that 'mbmon' is untrusted code, but it seems to me that there are some less drastic measures than rewriting it all as kernel code, namely 1) audit the existing code, or 2) rewrite it all as userland code, ideally less privileged.

Other things that might be nice:

- IWBN to have a userland interface to sensors. For example, if nothing else a sensor enumerator rather than duplicating the sysctl loop as the current code does. This would make it easier to at least adjust the current artificial limit on the number of sensors since only one place in userland would have to change. (BTW, having an artificial limit on the number of sensors is lame. This is an example where using the normal way of walking a sysctl tree is superior. You can lose the entire limit.) Having a userland interface also makes it easier to have backends that are entirely in userland.

- An snmp module that uses the above userland interface to export sensors via snmp. bsnmp already has a way to load modules at runtime (at least startup time) to add new MIBs. This would allow remote monitoring of various sensors if people prefer that to having a daemon on the box post alerts. If nothing else, it lets you add mrtg or rrdtool type graphs of the history of sensors if desired.

Basically, I think there should be a "real" abstracted interface in userland that can use various backends. One backend could be to query sensors from drivers that provide them directly (lm(4), etc.). Another backend could use the existing IPMI interface to query SDR sensors via IPMI commands to the BMC. Different RAID controllers could provide backends that communicate with the firmware to maintain whatever state is needed, etc. but w/o doing all that in the device driver. People could write their own custom sensors w/o having to write a kernel module.

Maybe that's a bigger vision than you were shooting for. I'm not sure phk@ will agree with this one either fwiw. :)

--
John Baldwin
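
The userland design argued for above (a common sensor object, pluggable backends, a single enumerator, and a cp_time-based pseudo-sensor) could be sketched roughly as follows. This is only an illustration in Python, not code from the sensors framework: the names Sensor, Backend, CpuBusyBackend and enumerate_sensors, the "ok"/"warn"/"crit" states, and the 75%/90% trigger points are all made up for the example. The sketch relies on kern.cp_time reporting cumulative ticks in the order user, nice, sys, intr, idle.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol, Sequence

# Index of the idle counter in a kern.cp_time sample
# (order: user, nice, sys, intr, idle).
CP_IDLE = 4


@dataclass
class Sensor:
    name: str
    units: str
    value: float
    status: str  # hypothetical three-state alarm: "ok", "warn", "crit"


class Backend(Protocol):
    """Anything that can list its current sensors (hypothetical interface)."""

    def sensors(self) -> List[Sensor]: ...


def busy_percent(prev: Sequence[int], cur: Sequence[int]) -> float:
    """Relative % busy between two cumulative cp_time samples."""
    deltas = [c - p for p, c in zip(prev, cur)]
    total = sum(deltas)
    if total <= 0:
        return 0.0  # no ticks elapsed between the samples
    return 100.0 * (total - deltas[CP_IDLE]) / total


def classify(value: float, warn: float, crit: float) -> str:
    """Map a value onto an alarm state using two trigger points."""
    if value >= crit:
        return "crit"
    if value >= warn:
        return "warn"
    return "ok"


class CpuBusyBackend:
    """Userland pseudo-sensor; the previous sample is its only history."""

    def __init__(self, read_cp_time: Callable[[], Sequence[int]],
                 warn: float = 75.0, crit: float = 90.0) -> None:
        self._read = read_cp_time
        self._warn = warn
        self._crit = crit
        self._prev = list(read_cp_time())

    def sensors(self) -> List[Sensor]:
        cur = list(self._read())
        pct = busy_percent(self._prev, cur)
        self._prev = cur
        status = classify(pct, self._warn, self._crit)
        return [Sensor("cpu.busy", "percent", pct, status)]


def enumerate_sensors(backends: Sequence[Backend]) -> List[Sensor]:
    """One enumerator over every backend, with no fixed sensor limit."""
    return [s for b in backends for s in b.sensors()]
```

The reader callable is injected so that the history (the previous sample) lives entirely in the userland object. On FreeBSD it could, for instance, parse the output of `sysctl -n kern.cp_time`; an IPMI or RAID backend would implement the same sensors() method and be passed to enumerate_sensors() alongside it.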