Date:      Fri, 05 Apr 2013 11:42:22 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Alfred Perlstein <alfred@ixsystems.com>
Cc:        arch@FreeBSD.org
Subject:   collecting statistics / metrics
Message-ID:  <515E8E6E.4030706@FreeBSD.org>
In-Reply-To: <515C68B5.2010006@ixsystems.com>
References:  <20130401115128.GZ76816@FreeBSD.org> <20130402232606.GC1810@garage.freebsd.pl> <20130403002846.GB15334@onelab2.iet.unipi.it> <20130403100401.GA1349@garage.freebsd.pl> <515C68B5.2010006@ixsystems.com>

on 03/04/2013 20:36 Alfred Perlstein said the following:
> Hey folks, sorry for the top post here, but I just came into this thread.
> 
> Here at iXsystems we've just developed a set of scripts to scrape the various
> FreeBSD user land utilities (sysctl, netstat, nfsstat, vmstat, etc, etc) and put
> them into graphs based on time.
> 
> The goal is to be able to line up all these metrics with whatever benchmark we
> are currently running and be able to see what may be causing issues.
> 
> Potentially you should be able to scroll through the graphs and see things like
> "ran out of mbufs @time", "vm system began paging at @time", "buffer daemon
> went nuts @time"
> 
> Then we can take the information back and leverage it to make tuning decisions,
> or potentially change kernel algorithms.

This is very, very useful!

> The only problem we have is that every user land tool has its own format, so
> along with my team we have written some shell to coerce the output from the
> various programs into pseudo-CSV (key/value pair) which can then be post
> processed by tools to convert to CSV which can then be put into something like
> open office, or put through an R program to graph it.
> 
> I'm hoping to have something shortly.
> 
> What I was hoping to do over the next few days was discuss with people how we
> can (or should we even) fix the user land statistics tools to output machine
> readable output that can be easily parsed.
> 
> Example: netstat -m  (hard to parse) versus 'vmstat -z | grep mbuf' easy to parse.
> 
> The idea of outputting xml is good, CSV is OK, however CSV is problematic as in
> the case of sysctl, if new nodes appear, then we can't begin to emit them, we
> must either ignore them, or abort, or log them to auxiliary files.  Anything
> that makes life easier is good.
> 
> I should be able to share our scripts within the next couple of days.
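
On the CSV concern above, one possible way around "new nodes appear" is to
log one row per (time, key, value) instead of one column per key, and only
build the wide table at post-processing time.  The following is a minimal
sketch under that assumption; the function names append_samples and to_wide
and the sample keys are made up for illustration and are not taken from the
iXsystems scripts.

```python
import csv
import io


def append_samples(out, timestamp, samples):
    """Write one CSV row per key/value pair for a single sampling instant."""
    writer = csv.writer(out, lineterminator="\n")
    for key in sorted(samples):
        writer.writerow([timestamp, key, samples[key]])


def to_wide(long_csv_text):
    """Pivot long-format rows into a header plus one row per timestamp.

    Keys that did not exist yet at a given timestamp get an empty cell.
    """
    rows = list(csv.reader(io.StringIO(long_csv_text)))
    keys = sorted({key for _, key, _ in rows})
    by_time = {}
    for ts, key, value in rows:
        by_time.setdefault(ts, {})[key] = value
    header = ["time"] + keys
    table = [[ts] + [vals.get(k, "") for k in keys]
             for ts, vals in by_time.items()]
    return header, table


if __name__ == "__main__":
    buf = io.StringIO()
    append_samples(buf, 0, {"a": 1})
    # A new key "b" shows up in the second sample; nothing has to be
    # ignored, aborted, or logged to an auxiliary file.
    append_samples(buf, 1, {"a": 2, "b": 5})
    print(to_wide(buf.getvalue()))
```

The trade-off is a larger log file in exchange for never having to rewrite
the header when the set of keys changes.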

Just an alternative idea...
I think gathering all this information via plugins to, e.g., collectd could be
more flexible and less processing- and parsing-intensive.  It would avoid
unnecessary formatting and re-parsing, and the data could be stored in a
convenient format.  Ideally, there would be an umbrella library on top of
sysctl, devstat, etc. that exposes the various stats in a convenient form.
Another convenience would be the ability to know which sysctls are actually
stats.  I think that you have already done work towards this goal.
Certain heuristics may help to distinguish stats from knobs, constants, etc.,
but an explicit "this is a metric" marker should be used.  Of course,
it would take a lot of work to properly mark all the sysctls.
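
To illustrate the heuristics-versus-explicit-marking point, here is a rough
sketch.  It assumes the usual "name: value" text output of sysctl, and it
uses one particular heuristic of my own choosing (an integer value that
changed between two samples is probably a stat).  The function names, the
"marked" set, and the sample values are all hypothetical; no such marking
exists in the sysctl tree today.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, skipping lines that do not match."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep and name and " " not in name:
            values[name] = value.strip()
    return values


def is_int(s):
    """Return True if the string parses as an integer."""
    try:
        int(s)
    except ValueError:
        return False
    return True


def likely_metrics(before, after, marked=frozenset()):
    """Guess which names are stats.

    Heuristic (an assumption): integer-valued in both samples and changed
    in between.  Names in the hypothetical 'marked' set are always included.
    """
    guessed = {
        name for name, value in after.items()
        if name in before and is_int(value) and is_int(before[name])
        and value != before[name]
    }
    explicit = {name for name in marked if name in after}
    return sorted(guessed | explicit)


# Made-up sample values for illustration only.
SAMPLE_T0 = """kern.ostype: FreeBSD
kern.ipc.nmbclusters: 25600
vm.stats.vm.v_swapin: 10
vm.stats.vm.v_swapout: 0
"""
SAMPLE_T1 = """kern.ostype: FreeBSD
kern.ipc.nmbclusters: 25600
vm.stats.vm.v_swapin: 14
vm.stats.vm.v_swapout: 0
"""

if __name__ == "__main__":
    t0, t1 = parse_sysctl(SAMPLE_T0), parse_sysctl(SAMPLE_T1)
    print(likely_metrics(t0, t1))
    print(likely_metrics(t0, t1, marked={"vm.stats.vm.v_swapout"}))
```

In this sample the heuristic alone misses v_swapout, because a counter that
happens to stay idle looks just like a knob.  That is the kind of case where
an explicit marker would be needed.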

Just thinking out loud.

-- 
Andriy Gapon


