Date: Wed, 03 Feb 2010 19:50:26 +0100
From: Gustau Pérez <gperez@entel.upc.edu>
To: Mikolaj Golub <to.my.trociny@gmail.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: bsnmpd returns incorrect hrProcessorLoad values
Message-ID: <4B69C572.1020601@entel.upc.edu>
In-Reply-To: <86ljfg7hl3.fsf@kopusha.onet>
References: <4B62C890.3020802@entel.upc.edu> <86ljfg7hl3.fsf@kopusha.onet>

Mikolaj Golub wrote:
> On Fri, 29 Jan 2010 12:37:52 +0100 Gustau Pérez wrote:
>
>> Hi,
>>
>> I'm using cacti to monitor some servers running FBSD. I was using 7.2
>> with SCHED_4BSD. With this configuration, bsnmpd+bsnmp-ucd was
>> returning the right values for the cores' load.
>>
>> I recently updated the servers (via csup) to RELENG_8 and bsnmpd is
>> returning negative values for the cores' load. If I try something like
>> this on a 4-core system:
>>
>> snmpwalk -v 2c -c community server .1.3.6.1.2.1.25.3.3.1
>>
>> what I get is:
>>
>> .1.3.6.1.2.1.25.3.3.1.1.6 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.1.10 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.1.14 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.1.18 = OID: .0.0
>> .1.3.6.1.2.1.25.3.3.1.2.6 = INTEGER: -182
>> .1.3.6.1.2.1.25.3.3.1.2.10 = INTEGER: -182
>> .1.3.6.1.2.1.25.3.3.1.2.14 = INTEGER: -182
>> .1.3.6.1.2.1.25.3.3.1.2.18 = INTEGER: -182
>>
>> I tried an old bsnmp-ucd (0.2.1, works fine on a 7.2 system) with an
>> 8.0 system. Same wrong results. And it seems bsnmpd in /usr/src/contrib
>> has not changed between 7.2 and 8.0.
>>
>> Any ideas? I'm not an expert, but with tcpdump I see different
>> results. Against an old 7.2 system, the field related to each core's load
>> gives the right value. Against an 8.0 system, those fields show
>> (in hex) values like fd 4b. What I don't know is how bsnmp-ucd retrieves
>> those values and how it constructs the UDP response packet.
>>
>
> bsnmp-ucd has nothing to do with HOST-RESOURCES-MIB. These MIBs are provided
> by the snmp_hostres(3) module (/usr/lib/snmp_hostres.so). So something is
> wrong there (I suppose it is not in sync with some recent changes in the
> kernel or libkvm).
>

You are right. I checked
usr.sbin/bsnmpd/modules/snmp_hostres/hostres_processor_tbl.c. I think it
has something to do with the processor_getpcpu function (line 122).
The code is:

        if (ccpu == 0 || fscale == 0)
                return (0.0);

    #define fxtofl(fixpt) ((double)(fixpt) / fscale)
        return (100.0 * fxtofl(ki_p->ki_pctcpu) /
            (1.0 - exp(ki_p->ki_swtime * log(fxtofl(ccpu)))));

I checked it on a 4-core SCHED_ULE system, and ccpu is always 0 (sysctl
kern.ccpu gives 0 too), so this routine always returns 0.0. That makes the
save_sample routine fill e->samples[#cpu] with 100.

If I comment out the ccpu == 0 check, I see strange values. (I know, I
changed the code.) With some printfs, I see that the returned value when
bsnmpd starts is 98~99, but then it goes up to 350~400 (strange). I added
some more printfs and saw that when the daemon starts it returns 98~99 for
each processor and ki_pctcpu is 2026 (in my case). The next time bsnmpd
refreshes its values, it returns wrong values and ki_pctcpu is four times
larger. So the function returns nearly 400% idle time for each processor...

I then checked it with SCHED_4BSD on an 8-core system. Same behaviour, but
this time ki_pctcpu increased eight times.

Now I'm stuck here. I think the kinfo_proc info is obtained by using
kvm_getprocs. Do you have any idea why it returns those values?

Regards,

Gus

-- 
PGP KEY : http://www-entel.upc.edu/gus/gus.asc