From: symbolics@gmx.com
To: freebsd-current@freebsd.org
Subject: Re: freebsd perf testing
Date: Sat, 9 Nov 2013 13:37:54 +0000

On Fri, Nov 08, 2013 at 12:28:02PM -0800, Julian Elischer wrote:
> On 11/8/13, 1:54 AM, Olivier Cochard-Labbé wrote:
> > On Fri, Nov 8, 2013 at 3:02 AM, Julian Elischer wrote:
> >
> > Some time ago someone showed some FreeBSD performance graphs
> > graphed against time.
> > He had them up on a website that was updated each day or so.
> >
> > I think they were network perf tests but I'm not sure.
> > He indicated that he was going to continue the daily testing
> > but I've not seen any mention of them since.
> >
> > If you know who that was or how to find him, let me (or gnn) know...
> >
> > Hi Julian,
> >
> > Perhaps you are referring to my network performance graphs in this
> > thread:
> > http://lists.freebsd.org/pipermail/freebsd-current/2013-April/041323.html
>
> Yes, you are the person we are looking for.
> In yesterday's 'vendor summit' we were discussing performance
> monitoring and your methodology was cited as one worth looking at.
>
> The idea of graphing the output of various performance tests against
> svn commit number is a very good one.
> I think it might even be worth doing these tests daily, and putting
> the output onto a web site, showing the last month, the last year and
> the whole range.
> It would even be interesting to put out 'xplot' files so that people
> can zoom in and out using xplot to see exactly which revision was
> responsible for regressions or problems.
>
> George, this is what we mentioned at the meeting yesterday.
>
> Julian

As it happens, I've been thinking over a design for something along these
lines recently.
It's just some ideas at the moment, but it might be of interest to others.
Forgive me; it's a long E-mail and it gets a bit pie-in-the-sky too.

I was prompted to think about the problem in the first place because I read
commit mail and see performance-related changes going into the tree from
time to time. These changes often do not come with any specific data, and
when they do it's normally quite narrow in focus. For instance, an
organisation contributes performance improvements specific to their own
workloads, without interest in anyone else's (fair enough). Initially, what
I wanted was a way of viewing how performance changed for a number of
workloads on a commit-by-commit basis. This sounds very much like what you
are after.

Anyway, after thinking about this for some time, it occurred to me that much
of the infrastructure required to do performance testing could be
generalised to all sorts of software experiments: software builds,
regression tests, and so on. So, my first conclusion was: build an
experimentation framework within which performance is one aspect.

Having decided this, I thought about the scope of experiments I wanted to
make. For instance, it would be good to test at least every supported
platform. On top of that, I would like to be able to vary the relevant
configuration options too. Taking the product of commit, platform, and
n configuration options (not to mention compilers, etc.), you start to get
some pretty big numbers. The numbers grow far too fast, and no person or
even organisation could feasibly cover the hardware resources required to
test every permutation. This led me to my next conclusion: build a
distributed system that allows anyone to contribute their hardware to the
cause. Collectively, the project, vendors, and users could tackle a big
chunk of this.

My rough sketch for how this would work is as follows. A bootable USB image
would be made for all platforms. This would boot up, connect to the
network, and check out a repository. The first phase of the process would
be to profile what the host can offer. For example, we might have
experiments that require four identical hard drives, or a particular CPU
type, and so on. Shell scripts or short programs would be written, e.g.
"has-atom-cpu", with each returning either 1 or 0 (I've sketched one
below). The results of this profiling would be submitted to a service.

The service matches the host with available experiments based on its
particular capabilities and the current experimental priorities laid down
by the developers. A priority system would allow the process to be
controlled precisely. If, for instance, major work is done to the VM
subsystem, relevant experiments could be prioritised over others for a
period.

Once a decision on the experiment to conduct has been made, the relevant
image must be deployed to the system. Free space on the USB device would be
used as a staging area, with a scripted installation occurring after
reboot. The images would need to be built somewhere, since it doesn't make
sense to rebuild the system endlessly, especially if we're including
low-powered embedded devices (which we should be). One possible solution
would be to use more powerful contributed hosts to cross-build images and
make them available for download.

Finally, the experiments would be conducted. Data produced would be
submitted back to the project using another service, where it could be
processed and analysed.
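To make the profiling phase concrete, a probe could be as simple as the
following sketch. The name "has-atom-cpu" and the 1-or-0 convention come
from the description above; hw.model is a real sysctl, but the script
itself is purely illustrative, not part of anything that exists:

    #!/bin/sh
    # has-atom-cpu: report whether this host has an Intel Atom CPU.
    # Follows the proposed convention of printing 1 (capability
    # present) or 0 (absent).
    case "$(sysctl -n hw.model)" in
    *Atom*) echo 1 ;;
    *)      echo 0 ;;
    esac

A small harness could then walk a directory of such probes and build the
capability profile to submit to the service (the probe directory here is a
made-up path):

    #!/bin/sh
    # Run every probe and emit a name=value capability profile.
    for p in /usr/local/libexec/probes/*; do
        printf '%s=%s\n' "${p##*/}" "$("$p")"
    done > /tmp/capabilities.txt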
To keep things flexible, the submitted data would just consist of a bunch
of flat files, rather than trying to find some standardised,
one-size-fits-all format. Statistics and graphics could be produced from
the data with R/Julia/etc. In particular, I imagined DTrace scripts being
attached to experiments so that specific data can be collected. If
something warranting further investigation is found, the experiment could
be amended with additional scripts, allowing developers to drill down into
issues.

After some time the process repeats, with a new image deployed and new
experiments conducted. I envisage some means of identifying individual
hosts so that a developer could repeat the same experiment on the same host
if desired.

Among the many potential problems with this plan, a big one is: how would
we protect contributors' privacy and security whilst still having a
realistic test environment? I guess the only way to do this would be to
(1) tell users that they should treat the system as if it's hacked and put
it on its own network, and (2) prevent the experiment images from accessing
anything besides FreeBSD.org. In relation to network performance this might
not be much good, since multiple hosts might be necessary. It might be
possible to build that into the design too, but it's already more than
complicated enough.

Anyhow, I think such a facility could be an asset if it could be built. I
may try to put this together, but I've recently committed myself to too
many things already to take this any further at the moment. I'd be
interested to hear what people think, naturally.

--sym
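P.S. To make the DTrace idea a little more concrete, below is roughly the
sort of wrapper I imagine an experiment shipping. Everything in it is a
stand-in I made up for illustration (the experiment name, the directory
layout, benchmark.sh); only the dtrace invocation itself is real:

    #!/bin/sh
    # Run one benchmark under a DTrace script and leave the results
    # behind as flat files, ready for submission.
    exp=syscall-profile
    dir=/var/experiments/$exp/$(date -u +%Y%m%dT%H%M%SZ)
    mkdir -p "$dir"

    # Record the environment alongside the results.
    uname -a > "$dir/uname.txt"
    sysctl hw > "$dir/sysctl-hw.txt"

    # Count the benchmark's system calls while it runs; dtrace -c runs
    # the command and dumps the aggregation when it exits.
    dtrace -q -n 'syscall:::entry /pid == $target/ { @[probefunc] = count(); }' \
        -c 'sh ./benchmark.sh' -o "$dir/syscalls.txt"

Drilling down would then just mean attaching a different script to the same
experiment and re-running it on the same host.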