From: Gustau Pérez
Date: Wed, 03 Feb 2010 19:50:26 +0100
To: Mikolaj Golub
Cc: freebsd-stable@freebsd.org
Subject: Re: bsnmpd returns incorrect hrProcessorLoad values
Message-ID: <4B69C572.1020601@entel.upc.edu>
In-Reply-To: <86ljfg7hl3.fsf@kopusha.onet>

Mikolaj Golub wrote:
> On Fri, 29 Jan 2010 12:37:52 +0100 Gustau Pérez wrote:
>
>> Hi,
>>
>> I'm using cacti to monitor some servers running FreeBSD. I was using
>> 7.2 with SCHED_4BSD. With this configuration, bsnmpd+bsnmp-ucd was
>> returning the right values for the cores' load.
>>
>> I recently updated the servers (via csup) to RELENG_8 and bsnmpd is
>> returning negative values for the cores' load. If I try something
>> like this on a 4-core system:
>>
>> snmpwalk -v 2c -c community server .1.3.6.1.2.1.25.3.3.1
>>
>> what I get is:
>>
>> .1.3.6.1.2.1.25.3.3.1.1.6 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.1.10 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.1.14 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.1.18 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.2.6 = INTEGER: -182
>> .1.3.6.1.2.1.25.3.3.1.2.10 = INTEGER: -182
>> .1.3.6.1.2.1.25.3.3.1.2.14 = INTEGER: -182
>> .1.3.6.1.2.1.25.3.3.1.2.18 = INTEGER: -182
>>
>> I tried an old bsnmpd-ucd (0.2.1, which works fine on a 7.2 system)
>> on an 8.0 system. Same wrong results. And it seems bsnmpd in
>> /usr/src/contrib has not changed between 7.2 and 8.0.
>>
>> Any ideas? I'm not an expert, but with tcpdump I see different
>> results. Against an old 7.2 system, the field related to each core's
>> load gives the right value. Against an 8.0 system, those fields show
>> (in hex) values like fd 4b. What I don't know is how bsnmp-ucd
>> retrieves those values and how it constructs the UDP response packet.
>>
>
> bsnmpd-ucd has nothing to do with HOST-RESOURCES-MIB. These MIBs are
> provided by the snmp_hostres(3) module (/usr/lib/snmp_hostres.so). So
> something is wrong there (I suppose it is not in sync with some recent
> changes in the kernel or libkvm).
>

You are right. I checked
usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c. I think it
has something to do with the processor_getpcpu function (line 122).
The code is:

> if (ccpu == 0 || fscale == 0)
>         return (0.0);
>
> #define fxtofl(fixpt) ((double)(fixpt) / fscale)
> return (100.0 * fxtofl(ki_p->ki_pctcpu) /
>     (1.0 - exp(ki_p->ki_swtime * log(fxtofl(ccpu)))));

I checked it on a 4-core SCHED_ULE system, and ccpu is always 0 (sysctl
kern.ccpu gives 0 too), so this routine always returns 0.0. That makes
the save_sample routine fill e->samples[#cpu] with 100.

If I comment out the ccpu == 0 check (yes, I know, I changed the code),
I see strange values. With some printfs, I see that the value returned
when bsnmpd starts is 98~99, but then it goes up to 350~400, which is
strange. With some more printfs I saw that when the daemon starts it
returns 98~99 for each processor and ki_pctcpu is 2026 (in my case).
The next time bsnmpd refreshes its values, it returns wrong values and
ki_pctcpu has gone up fourfold. So the function returns nearly 400% idle
time for each processor...

I then checked with SCHED_4BSD on an 8-core system. Same behaviour, but
this time ki_pctcpu increased eightfold.

Now I'm stuck here. I think the kinfo_proc info is obtained by using
kvm_getprocs. Do you have any idea why it returns those values?

Regards,

Gus

-- 
PGP KEY : http://www-entel.upc.edu/gus/gus.asc