Date: Fri, 5 Apr 2013 07:59:50 -0700
From: Alfred Perlstein <alfred@ixsystems.com>
To: Andriy Gapon <avg@FreeBSD.org>
Cc: "arch@FreeBSD.org" <arch@FreeBSD.org>
Subject: Re: collecting statistics / metrics
Message-ID: <471D2765-393A-473F-A17C-FE1B77D15A6B@ixsystems.com>
In-Reply-To: <515E8E6E.4030706@FreeBSD.org>
References: <20130401115128.GZ76816@FreeBSD.org> <20130402232606.GC1810@garage.freebsd.pl> <20130403002846.GB15334@onelab2.iet.unipi.it> <20130403100401.GA1349@garage.freebsd.pl> <515C68B5.2010006@ixsystems.com> <515E8E6E.4030706@FreeBSD.org>

On Apr 5, 2013, at 1:42 AM, Andriy Gapon <avg@FreeBSD.org> wrote:

> on 03/04/2013 20:36 Alfred Perlstein said the following:
>> Hey folks, sorry for the top post here, but I just came into this thread.
>>
>> Here at iXsystems we've just developed a set of scripts to scrape the various
>> FreeBSD user land utilities (sysctl, netstat, nfsstat, vmstat, etc, etc) and put
>> them into graphs based on time.
>>
>> The goal is to be able to line up all these metrics with whatever benchmark we
>> are currently running and be able to see what may be causing issues.
>>
>> Potentially you should be able to scroll through the graphs and see things like
>> "ran out of mbufs @time", "vm system began paging at @time", "buffer daemon
>> went nuts @time"
>>
>> Then we can take the information back and leverage it to make tuning decisions,
>> or potentially change kernel algorithms.
>
> This is very very useful!
>
>> The only problem we have is that every user land tool has its own format, so
>> along with my team we have written some shell to coerce the output from the
>> various programs into pseudo-CSV (key/value pair) which can then be post
>> processed by tools to convert to CSV which can then be put into something like
>> open office, or put through an R program to graph it.
>>
>> I'm hoping to have something shortly.
>>
>> What I was hoping to do over the next few days was discuss with people how we
>> can (or should we even) fix the user land statistics tools to output machine
>> readable output that can be easily parsed.
>>
>> Example: netstat -m (hard to parse) versus 'vmstat -z | grep mbuf' easy to parse.
>>
>> The idea of outputting xml is good, CSV is OK, however CSV is problematic as in
>> the case of sysctl, if new nodes appear, then we can't begin to emit them, we
>> must either ignore them, or abort, or log them to auxiliary files. Anything
>> that makes life easier is good.
>>
>> I should be able to share our scripts within the next couple of days.
>
> Just an alternative idea...
> I think gathering all this information via plugins to e.g. collectd could be
> more flexible and less processing / parsing intensive. That would allow to
> avoid unnecessary formatting and re-parsing and to store the data in a
> convenient format. Ideally it would be great to have an umbrella library on top
> of sysctl, devstat, etc that would expose various stats in a convenient form.
> Another thing of convenience would be an ability to know which sysctls are
> actually stats. I think that you have already done work towards this goal.
> There are certain heuristics that may help to distinguish stats from knobs,
> constants, etc, but the explicit "this is a metric" should be used. Of course,
> it would take a lot of work to properly mark all the sysctls.
>
> Just thinking out loud.

I'm going to bring these suggestions to my team and I think we can incorporate some of these ideas for sure.

-Alfred
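
To make the CSV concern from the quoted text concrete (a sysctl node that appears partway through a run has no column to land in), here is a rough Python sketch of one possible post-processing step. It is not the iXsystems shell scripts, which are not shown in this thread. The function names are hypothetical, the input is assumed to be single-line "name: value" pairs as printed by plain sysctl (multi-line values would need extra handling), and the sample values are made up. It sidesteps the problem by collecting all samples first and using the union of keys as the header, which only works when the CSV is written after the run rather than streamed.

```python
import csv
import io


def parse_key_values(text):
    """Parse 'name: value' lines into a dict (hypothetical helper).

    Lines without a colon are skipped. Only the first colon splits,
    so values that themselves contain colons are kept whole.
    """
    sample = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        sample[key.strip()] = value.strip()
    return sample


def samples_to_csv(samples):
    """Render (timestamp, dict) samples as CSV text (hypothetical helper).

    The header is the union of all keys in first-seen order, so a key
    that first appears in a later sample still gets its own column and
    earlier rows are left blank for it.
    """
    columns = {}
    for _, sample in samples:
        for key in sample:
            columns.setdefault(key, None)
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["time"] + list(columns),
        restval="",
        lineterminator="\n",
    )
    writer.writeheader()
    for timestamp, sample in samples:
        writer.writerow({"time": timestamp, **sample})
    return out.getvalue()


if __name__ == "__main__":
    # Made-up values; the second sample has a key the first one lacks.
    first = parse_key_values("kern.ipc.nmbclusters: 25600\n")
    second = parse_key_values(
        "kern.ipc.nmbclusters: 25600\nvm.stats.vm.v_swapout: 7\n"
    )
    print(samples_to_csv([(0, first), (10, second)]), end="")
```

The trade-off is that nothing can be emitted until the run ends. A streaming collector would still have to ignore, abort on, or separately log new keys, as described above.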
