From: Alan Somers
To: symbolics@gmx.com
Cc: FreeBSD CURRENT <freebsd-current@freebsd.org>
Date: Sat, 9 Nov 2013 10:37:58 -0700
Subject: Re: freebsd perf testing

On Sat, Nov 9, 2013 at 6:37 AM, <symbolics@gmx.com> wrote:
> On Fri, Nov 08, 2013 at 12:28:02PM -0800, Julian Elischer wrote:
>> On 11/8/13, 1:54 AM, Olivier Cochard-Labbé wrote:
>> > On Fri, Nov 8, 2013 at 3:02 AM, Julian Elischer wrote:
>> >
>> >     Some time ago someone showed some freebsd performance graphs
>> >     graphed against time.
>> >     He had them up on a website that was updated each day or so.
>> >
>> >     I think they were network perf tests but I'm not sure.
>> >     He indicated that he was going to continue the daily testing
>> >     but I've not seen any mention of them since.
>> >
>> >     If you know who that was or how to find him let me (or gnn) know...
>> >
>> > Hi Julian,
>> >
>> > Perhaps you are referring to my network performance graphs in this
>> > thread:
>> > http://lists.freebsd.org/pipermail/freebsd-current/2013-April/041323.html
>>
>> Yes, you are the person we are looking for.
>> In yesterday's 'vendor summit' we were discussing performance
>> monitoring and your methodology was cited as one worth looking at.
>>
>> The idea of graphing the output of various performance tests against
>> svn commit number is a very good one.
>> I think it might even be worth doing these tests daily, and putting
>> the output onto a web site, showing the last month, the last year and
>> the whole range.
>> It would even be interesting to put out 'xplot' files so that people
>> can zoom in and out using xplot to see exactly which revision was
>> responsible for reversions or problems.
>>
>> George... this is what we mentioned at the meeting yesterday.
>>
>> Julian
>>
>
> As it happens I've been thinking over a design for something along these
> lines recently. It's just some ideas at the moment but it might be of
> interest to others. Forgive me; it's a long e-mail and it gets a bit
> pie-in-the-sky too.
>
> I was prompted to think about the problem in the first place because I
> read commit mail and I see performance-related changes going into the
> tree from time to time. These changes often do not come with any
> specific data, and when they do it's normally quite narrow in focus. For
> instance, an organisation contributes performance improvements specific
> to their workloads and without interest in anyone else's (fair enough).
>
> Initially, what I wanted was a way of viewing how performance changed
> for a number of workloads on a commit-by-commit basis. This sounds very
> much like what you are after.
>
> Anyway, after thinking about this for some time it occurred to me that
> much of the infrastructure required to do performance testing could be
> generalised to all sorts of software experiments, e.g. software builds,
> regression tests, and so on. So my first conclusion was: build an
> experimentation framework within which performance is one aspect.
>
> Having decided this, I thought about the scope of experiments I wanted
> to make. For instance, it would be good to test at least every supported
> platform. On top of that I would like to be able to vary the relevant
> configuration options too. Taking the product of commit, platform, and
> n configuration options (not to mention compilers, etc.) you start to
> get some pretty big numbers. The numbers grow far too fast, and no person
> or even organisation could feasibly cover the hardware resources
> required to test every permutation. This led me to my next conclusion:
> build a distributed system that allows anyone to contribute their
> hardware to the cause. Collectively the project, vendors, and users
> could tackle a big chunk of this.
>
> My rough sketch for how this would work is as follows. A bootable USB
> image would be made for all platforms. This would boot up, connect to
> the network and check out a repository. The first phase of the process
> would be to profile what the host can offer. For example, we might have
> experiments that require four identical hard drives, or a particular CPU
> type, and so on. Shell scripts or short programmes would be written,
> e.g. "has-atom-cpu", with these returning either 1 or 0.
>
> The results of this profiling would be submitted to a service. The
> service matches the host with available experiments based on its
> particular capabilities and the current experimental priorities laid
> down by the developers. A priority system would allow the system to be
> controlled precisely. If, for instance, major work is done to the VM
> subsystem, relevant experiments could be prioritised over others for a
> period.
>
> Once a decision on the experiment to conduct has been made, the relevant
> image must be deployed to the system.
> Free space on the USB device would
> be used as a staging area, with a scripted installation occurring after
> reboot. The images would need to be built somewhere, since it doesn't
> make sense to rebuild the system endlessly, especially if we're
> including low-powered embedded devices (which we should be). One
> possible solution to this would be to use more powerful contributed
> hosts to cross-build images and make them available for download.
>
> Finally, the experiments would be conducted. Data produced would be
> submitted back to the project using another service, where it could be
> processed and analysed. To keep things flexible this would just consist
> of a bunch of flat files, rather than trying to find some standardised,
> one-size-fits-all format. Statistics and graphics could be produced from
> the data with R/Julia/etc. In particular I imagined DTrace scripts being
> attached to experiments so that specific data can be collected. If
> something warranting further investigation is found, the experiment could
> be amended with additional scripts, allowing developers to drill down
> into issues.
>
> After some time the process repeats, with a new image deployed and new
> experiments conducted. I envisage some means of identifying individual
> hosts so that a developer could repeat the same experiment on the same
> host if desired.
>
> Among the many potential problems with this plan, a big one is how we
> would protect contributors' privacy and security whilst still having a
> realistic test environment. I guess the only way to do this would be to
> (1) tell users that they should treat the system as if it's hacked and
> put it in its own network, and (2) prevent the experiment images from
> accessing anything besides FreeBSD.org.
>
> In relation to network performance, this might not be much good, since
> multiple hosts might be necessary. It might be possible to build that
> into the design too, but it's already more than complicated enough.
>
> Anyhow, I think such a facility could be an asset if it could be built.
> I may try to put this together, but I've committed myself to enough
> things recently that I can't take this any further at the moment. I'd
> be interested to hear what people think, naturally.
>

This sounds exactly like the Phoronix Test Suite and its web-based
reporting platform, openbenchmarking.org.  It already has a large
number of benchmarks to choose from, and it runs on FreeBSD.  The
downsides are that it can't do anything involving multiple hosts, and
it doesn't have a good interface to query results vs machine
parameters, e.g. how does the score of benchmark X vary with the
amount of RAM?  But it's open-source, and I'm sure that patches are
welcome ;)

-Alan
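
As a rough illustration of the capability probes described in the
proposal above: the name "has-atom-cpu" is the proposal's own example,
but everything else here (the use of sysctl hw.model and the
exit-status convention) is an assumption, since the message leaves the
exact interface unspecified. A minimal sketch in sh might look like:

    #!/bin/sh
    # has-atom-cpu: hypothetical capability probe.  Exits 0 if the host
    # CPU model string contains "Atom", 1 otherwise; the proposal only
    # says such probes return "either 1 or 0", so the exact convention
    # is assumed here.
    case "$(sysctl -n hw.model)" in
    *Atom*) exit 0 ;;
    *)      exit 1 ;;
    esac

The matching service could then run a directory of such probes on each
host and record their exit statuses to build that host's capability
profile.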
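
In the same spirit, attaching a DTrace script to an experiment and
saving its output alongside the benchmark results as flat files might
look roughly like the following. This is only a sketch: the probe
clause, the output file names, and the "run-benchmark.sh" workload
name are all assumptions, not part of the original proposal.

    #!/bin/sh
    # Count system calls made while the experiment's workload runs,
    # writing the DTrace aggregation and the benchmark output as two
    # flat files for later analysis with R/Julia/etc.
    # Requires root and DTrace support in the running kernel.
    dtrace -n 'syscall:::entry { @calls[probefunc] = count(); }' \
        -o syscalls.out -c ./run-benchmark.sh > benchmark.out 2>&1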