From owner-cvs-src@FreeBSD.ORG  Tue Oct 16 16:34:28 2007
Return-Path: <owner-cvs-src@FreeBSD.ORG>
Delivered-To: cvs-src@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8BAEA16A419;
	Tue, 16 Oct 2007 16:34:28 +0000 (UTC)
	(envelope-from netchild@freebsd.org)
Received: from redbull.bpaserver.net (redbullneu.bpaserver.net
	[213.198.78.217])
	by mx1.freebsd.org (Postfix) with ESMTP id CE24713C465;
	Tue, 16 Oct 2007 16:34:27 +0000 (UTC)
	(envelope-from netchild@freebsd.org)
Received: from outgoing.leidinger.net (p54A54BF1.dip.t-dialin.net
	[84.165.75.241])
	by redbull.bpaserver.net (Postfix) with ESMTP id 9798B2E145;
	Tue, 16 Oct 2007 18:34:17 +0200 (CEST)
Received: from webmail.leidinger.net (webmail.Leidinger.net [192.168.1.102])
	by outgoing.leidinger.net (Postfix) with ESMTP id E14615B480D;
	Tue, 16 Oct 2007 18:33:11 +0200 (CEST)
Received: (from www@localhost)
	by webmail.leidinger.net (8.14.1/8.13.8/Submit) id l9GGXBbW066103;
	Tue, 16 Oct 2007 18:33:11 +0200 (CEST)
	(envelope-from netchild@FreeBSD.org)
Received: from pslux.cec.eu.int (pslux.cec.eu.int [158.169.9.14]) by
	webmail.leidinger.net (Horde MIME library) with HTTP; Tue, 16 Oct 2007
	18:33:11 +0200
Message-ID: <20071016183311.lu97hbwzggsk4ow4@webmail.leidinger.net>
X-Priority: 3 (Normal)
Date: Tue, 16 Oct 2007 18:33:11 +0200
From: Alexander Leidinger <netchild@FreeBSD.org>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
References: <1947.1192477927@critter.freebsd.dk>
In-Reply-To: <1947.1192477927@critter.freebsd.dk>
MIME-Version: 1.0
Content-Type: text/plain;
	charset=UTF-8;
	DelSp="Yes";
	format="flowed"
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
User-Agent: Internet Messaging Program (IMP) H3 (4.1.4) / FreeBSD-7.0
X-BPAnet-MailScanner-Information: Please contact the ISP for more information
X-BPAnet-MailScanner: Found to be clean
X-BPAnet-MailScanner-SpamCheck: not spam, SpamAssassin (not cached,
	score=3.377, required 8, BAYES_50 2.50, J_CHICKENPOX_27 0.60,
	RDNS_DYNAMIC 0.10, SARE_FROM_SPAM_WORD3 0.10, TW_SK 0.08)
X-BPAnet-MailScanner-SpamScore: sss
X-BPAnet-MailScanner-From: netchild@freebsd.org
X-Spam-Status: No
Cc: Wilko Bulte <wb@freebie.xs4all.nl>, src-committers@FreeBSD.org,
	cvs-all@FreeBSD.org, cvs-src@FreeBSD.org
Subject: Re: cvs commit: src/etc Makefile sensorsd.conf src/etc/defaults
	rc.conf src/etc/rc.d Makefile sensorsd src/lib/libc/gen sysctl.3
	src/sbin/sysctl sysctl.8 sysctl.c src/share/man/man5 rc.conf.5
	src/share/man/man9 Makefile sensor_attach.9 src/sys/conf files ...
X-BeenThere: cvs-src@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: CVS commit messages for the src tree <cvs-src.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-src>
List-Post: <mailto:cvs-src@freebsd.org>
List-Help: <mailto:cvs-src-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 16 Oct 2007 16:34:28 -0000

Quoting Poul-Henning Kamp <phk@phk.freebsd.dk> (from Mon, 15 Oct 2007 =20
19:52:07 +0000):

> In message <20071015210909.1b6b693b@deskjail>, Alexander Leidinger writes:
>> Quoting "Poul-Henning Kamp" <phk@phk.freebsd.dk> (Mon, 15 Oct 2007  =20
>> 15:01:47 +0000):
>
>>> >And I responded that we need to have a way to export data from the
>>> >kernel to userland in an unified way.
>>>
>>> I can't seem to find the supposed requirement for unification here,
>>
>> You didn't comment on what I wrote about reducing the work of
>> reinventing a way to export the data we want to export again and again
>> and again and again...
>
> Yes, that is the abstract argument, but the very same argument can
> be made for every other single kind of entity which consumes or
> produces bytes, from fingerprint readers to 9-track tape stations.

Why do we have a common linked list API? It's easy enough to do it =20
again and again and again... We have it because we don't want to do it =20
again and again... And with the sensors API we gain something similar. =20
Sure, we can do it with sysctls in every place. But with the sensors =20
API we reduce the code to write for such things like with the linked =20
lists API.

> My beef with this code is that it unifies a lot of non-alike
> measurements and indications in a new, and pretty bogus IMO, api,
> but adding no value above a plain old sysctl while doing so.

It adds meta-data which can be used in an automated way. This is done =20
with a consistent and documented API. Sure, we can do it with sysctls =20
by hand, but see above.

>>> and in fact, it is exactly that a lot of crap gets bundled up
>>> under the wildcard term of "sensor" that I object to.
>>
>> Exporting temperature readings / voltage measurement, system status is
>> not crap.
>
> Actually it is, if it can be done as well (or better) from userland
> by exporting the underlying communications paths.

See below for the mbmon example. And for some things we already had =20
the export of this via sysctl (temperature of Intel core processors), =20
the sensors API "just" groups this together so that people don't have =20
to hunt such info down, adds meta data to it, so that tools can do =20
some automation, and unifies the API to export this info from the =20
kernel.

>> And when you write some monitoring probes in a large server
>> farm, you don't want to hunt down every interesting data value in a lot
>> of places.
>
> Ahh, so this brings me to the next problem with sensors.
>
> If by "monitoring" you mean "plot MRTG graphs", then I guess sensors
> could possibly make sense, although, I'd have to point out that all
> that can be done just as well from ports/sysutils/whatever.

With elevated privileges for those tools. See below.

> But if you're talking real world monitoring, then sensors are
> pointless, because there is no way to derive necessary machine
> readable contextual information, such as alarm limits, resolution,
> calibration constants, hysteresis, polling periods etc.

Normally you configure this in the tools which do the monitoring, not =20
in the tools which acquire the data. I'm not talking about doing this =20
in the sensors framework, but I talk about to get rid of the need to =20
hunt down such information from everywhere. I makes it easier to write =20
monitoring probes. It is not supposed to make the monitoring itself =20
easier.

> Or, for that matter, machine-readable information about what is
> actually being measured and where it is being measured and what it
> means.

A human being still has to interprete the measurements. No doubts. But =20
with the framework you don't have to hunt down where to read the =20
sensor data, and how to name it. You can write a probe which takes =20
everything in the the sensors mib and let it produce names and values =20
for the probed things automatically.

> There are frameworks for doing that sort of stuff in professional
> server hardware, but since OpenBSD doesn't support those, IPMI,
> ACPI etc, they have come up with this VAX-mentality junkheap instead.

IPMI is intended to handle situations when the OS is not booted or the =20
system is not powered on. Yes, it allows some features when the OS is =20
booted too. Now... how much hardware out there supports IPMI, or =20
better... how much in production use doesn't use IPMI? I would say =20
quite a lot (we all know the managers, "do everything with no money at =20
all"). The sensors framework allows to monitor those systems in an =20
easy (because you don't need to invest that much work with the sensors =20
framework) way. Have you identified some parts in the sensors =20
framework which make it impossible to play together with IPMI? If yes, =20
would you please so kind and tell us what prevents them to play =20
together?

Regarding ACPI... Nate (our ACPI committer) asked after the commit why =20
acpi_thermal was not modified to play together with the sensors =20
framework (the reason: ACPI was a little bit too much to do in the SoC =20
for Constantine, Nate agreed to work together (after some other work =20
in ACPI he wants to finish first) if desired (put on hold from the =20
sensors side until this discussion comes tp an conclusion), and also =20
suggested some improvements for the sensors framework). He didn't told =20
us about something in the framework, which prevents the use of it =20
together with ACPI.

>>> >I already told you last time
>>> >that the current way (access to the i2c or smbus) needs more access
>>> >rights than using the userland parts of the sensors framework.
>>>
>>> More rights than what exactly ?
>>
>> One popular userland temperature/voltage reading tool (as it supports a
>> lot of popular devices) is mbmon. It is currently a SUID root
>> application.
>
> Let me get this straight, you're telling me:
>
> =09"I'm worried about this code running as root, so I'm putting
> =09it in the kernel instead."
>
> Can anybody here spot the logic flaw of this argument ?
>
> Anybody ?
>
> Right, I thought so.  Lets pretend you didn't say that.

You missed the point. In our architecture we can not do it without =20
executing some code with elevated privileges. What I suggested was to =20
replace SUID root code with unknown security heritage and potential =20
complete access to the machine (direct access to ISA I/O ports and the =20
smbus), not only with code (which runs with elevated privileges) from =20
a known security heritage (from people which strongly care about =20
security), but also with an access method which doesn't need any =20
elevated privileges at all. I assume you think that for example =20
expoting some predefined numbers with "sysctl hw.sensors >outout" is =20
not something we need to care that much about from a security point of =20
view. Feel free to tell me if I'm wrong with my assumption. I at least =20
feel safer to export an int over sysctl, than to run the SUID root =20
mbmon tool.

There may be other ways to accomplish the same. Maybe a daemon which =20
runs with elevated privileges and just outputs the temperature/voltage =20
as a number to a tool which runs without elevated privileges and =20
collects the data. So far we don't have such a tool, and in the =20
beginning I'm not sure we can trust such a tool to the same extend as =20
we can trust sysctl.

And this would only affect sensors like lm (which is not part of the =20
sensors framework, but uses the sensors framework to export it's data, =20
and was used as an example to show how the sensors framework can be =20
used). What about other such data which lives in subsystems we don't =20
want to let an userland application mess around with? I don't know if =20
we want to let an userland application give that much access to ACPI. =20
ACPI already exports some data with sysctl. ATM it does it by hand. =20
The sensors framework provides an unified API to not do everything by =20
hand anymore.

>>> No, I have yet to see why we need this framework.
>>
>> Several committers which voted for this project in the soc webinterface
>> seem to think otherwise, else they wouldn't have voted for it.
>
> I repeat: The SoC interface is not the gateway to -current.

It provides an idea in what people are interested in. And several =20
committers here in the thread also showed interest in this framework =20
(maybe not in the current implementation, but at least in the idea =20
behind it). Just because you do not see how such a framework can be =20
useful to you (so far I have the impression from your mails, that you =20
object to the idea of this framework), it doesn't mean other people =20
are not interested in it. So it is not a gateway for concrete =20
implementations into -current, but it shows what people are interested =20
in for -current. The current implementation of the sensors framework =20
may not use the best way of doing something in some places, but it is =20
not as bad as you give people the impression. In the mean time =20
Constantine had a look at the constructive critics we got so far, and =20
thinks that moving to the taskqueue API is not a bug problem. For the =20
newbus suggestion so far he asked for more information, as the man =20
pages didn't gave him enough input. As far as I know, the man pages of =20
newbus are known to be in a suboptimal shape. But as you seem to be =20
not interested in the idea of the sensors framework, one can have the =20
impression, that you are not interested to hear about improvements of =20
the implementation.

>>> >>    C) How it integrates into FreeBSD and for the places where
>>> >>       it does not (newbus ?) why it doesn't.
>>> >
>>
>> We're discussing it right now. And I was willing to discuss this even
>> back when you talked about it the first time.
>
> It should have been dicussed and fixed before being committed to
> -current.  Current is not where you start, it is where you end, these

We where willing to listed to your concerns. You stopped talking about =20
them back then.

> kind of things are not details to be hashed out in CVS.

Constantine asked for review several times on -current. He got some =20
reviews several times for commits to perforce. He incorporated =20
suggestions from those reviews, or explained why it is like it is and =20
why he can not switch (with no replies with suggestions how to solve =20
the problems he sees with the suggestions). Now you come and ask why =20
nobody pointed out some flaws before (without telling us which =20
technical flaws you talk about).

> Ten years ago when we didn't have P4 and the _extensive_ infrastructure
> for making it easy for people to work out of the tree, we had to do
> stuff like that, but there is no excuse for it today.

Nobody is perfect. There will always be some bugs when something is =20
committed to -current. You don't talk about obvious problems here. =20
There's no destabilized system, there are no panics. You talk about =20
not using an underdocumented API and not using a generic framework for =20
creating tasks (so basically we talk about doing something by hand =20
what can be done with less work by using an API... sounds like what =20
the sensors framework tries to do for some kinds of data export).

Note: the commit of the code from the sensors project was done in 3 =20
steps on purpose, the sensors framework, 2 temperature/voltage chip =20
drivers as examples how to use the sensors framework (and the benefit =20
of getting rid of a SUID root application in the userland), and =20
converting the existing CPU temperature sysctl (for some specific =20
Intel CPUs) from doing a sysctl by hand to the sensors framework (as =20
an example how to convert an existing part to the sensors framework). =20
So far we mixed the discussion about various parts of this together =20
under the umbrella of "sensors framework", which is not ok.

>>> >The code we have currently doesn't harm the system, [...]
>>>
>>> This is where I disagree: it does harm the system.
>>>
>>> It dumps a load of badly thought out code into our source tree, and
>>> that will effectively be in the way of any well thought out solution
>>> that might ever appear.
>>>
>>> This stuff should be backed out, and forced to live outside the tree
>>> until it satisfies our quality and architectural requirements.
>>
>> How does this compare to your attitude of bringing stuff into the tree
>> and letting other people fixup collateral damage in the past? But I
>> drift away from the topic... so back to the sensors framework.
>
> It does not compare, please answer the question instead of trying

Which question? I don't see a question from you in the quoted part =20
above. And without mentioning the bad code, we can not proceed =20
discussing about it. The constructive responses so far don't give the =20
impression that there code is that bad as you suggest here.

> to insult me, because you're no good at it.  I have been insulted

I don't try to insult you.

>> To be able to address our quality and architectural requirements, we
>> need first to know which quality and architectural requirements are
>> violated by which part of the code in question. As you seem to have
>> identified them, would you please so kind and share your concrete
>> findings?
>
> Would you be so kind as to read the emails I've sent you ?  Maybe
> this time expending some effort to understand what people tell you in
> them, rather than pretending they contained nothing ?

So far I see from you that you are not interested in the idea of the =20
framework. And I see mails where people are interested in the idea of =20
the framework. And I see private mails which point out some =20
improvements. According specific quality problems, I haven't seen a =20
concrete mail from you, like saying "you use X, but this is should be =20
done with Y, see at place A how to do it better". I have seen such =20
mails from other people, but not from you.

>>> Forcing it to stay out of -current, means that the motivational
>>> pressure is on the people who thing we badly need this featureset,
>>> and gives them reason to improve it.
>>>
>>> Leaving it in the tree, is the sure fire way for it to become an
>>> unmaintained lump of code that slowly rots away.
>>
>> You did a psycho-analysis of Constantine so that you are able to tell
>> that he doesn't fix the issues while he agreed to improve the framework?
>
> No, it's far worse.
>
> I have 15 years of experience with this kind of code being slammed
> into the tree in an unfinished and badly thought out shape.
>
> And I have wasted far more time cleaning it up afterwards, than you
> or constantine will have spent on FreeBSD five years from now.

So would you please share your experience and tell us where parts are =20
wrong because of what reason and what kind of technique is better =20
suited to do it? So far I've only seen that you don't like the idea =20
itself and that you for this reason haven't looked at concrete problems.

If someone reading this does not share this point of view, please, =20
tell me about it. So far I haven't got any mail to my similar question =20
in a previous mail, so please tell me about it if you think that I'm =20
wrong when I say that Poul gives the impression he doesn't like the =20
idea (while several other committers see benefit in it) and that it =20
seems that he didn't provided concrete pointers to problems in the =20
implementation.

Bye,
Alexander.

--=20
http://www.Leidinger.net  Alexander @ Leidinger.net: PGP ID =3D B0063FE7
http://www.FreeBSD.org     netchild @ FreeBSD.org  : PGP ID =3D 72077137
Who on earth would eat a charred caterpillar!?
No, no, you SINGE 'em!  You SINGE 'em and eat 'em!