From owner-freebsd-hackers Tue Sep 10 11:24:19 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3)
          id LAA03714 for hackers-outgoing; Tue, 10 Sep 1996 11:24:19 -0700 (PDT)
Received: from yangtze.cs.UMD.EDU (yangtze.cs.umd.edu [128.8.128.118])
          by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id LAA03704
          for ; Tue, 10 Sep 1996 11:24:15 -0700 (PDT)
Received: by yangtze.cs.UMD.EDU (8.7.5/UMIACS-0.9/04-05-88)
          id OAA03213; Tue, 10 Sep 1996 14:22:51 -0400 (EDT)
From: fwmiller@cs.UMD.EDU (Frank W. Miller)
Message-Id: <199609101822.OAA03213@yangtze.cs.UMD.EDU>
Subject: Re: kernel performance
To: terry@lambert.org (Terry Lambert)
Date: Tue, 10 Sep 1996 14:22:50 -0400 (EDT)
Cc: fwmiller@cs.UMD.EDU (Frank W. Miller), freebsd-hackers@FreeBSD.ORG
In-Reply-To: <199609101743.KAA03054@phaeton.artisoft.com> from "Terry Lambert"
             at Sep 10, 96 10:43:00 am
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> There are performance figures you can get via gprof.  These are
> statistical in nature, so it will be impossible to make a reasonable
> distinction between cache/non-cache cases (which is why statistical
> profiling sucks).

Statistical profiling should be fine for my purposes.  I am interested
in the performance trends for a large number of reads from disk followed
by writes to UDP.  By a large number, I mean tens of millions of
operations, i.e., moving a half-hour MPEG from disk to network.

I have made some very coarse measurements at the system call level (a
sketch of the measurement loop is appended below).  As you might expect,
the numbers show dramatic variance in latency and jitter.  I am now
trying to break these measurements down further to determine which parts
of the kernel contribute most to that variance.

Thanks to all who responded.  It looks like configuring the gprof
support in the kernel is going to be a good place to start; as I
understand it, that means building a profiled kernel with ``config -p''
and using kgmon(8) to control profiling and dump the data for gprof(1).

> I have non-statistical profiling data starting from the VFS consumer
> layer and working its way down through the supporting code, but
> excluding some VM and driver effects...  It was collected on Win95
> using the Pentium instruction clock, using highly modified gprof code
> and compiler-generated function entry points plus stack hacking to get
> function exit counters.  The Win95 code had all of the gross
> architectural modifications I've been discussing for the past two
> years, so some functional bottlenecks were removed.  The data is
> proprietary to my employer.

Fortunately, I'm working with FreeBSD and BSD/OS.  :P

> Statistical profiling operates by dividing the address space into
> "count buckets" and sampling the PC at intervals.  This is not a
> highly reliable mechanism, but barring a lot of hacking, you will
> probably not be able to easily get more useful numbers.

We'll see how much hacking my advisor wants me to do.  ;)

Later,
FM

--
Frank W. Miller                    Department of Computer Science
fwmiller@cs.umd.edu                University of Maryland, College Park
http://www.cs.umd.edu/~fwmiller    College Park, Maryland 20742
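
For concreteness, here is a minimal sketch of the syscall-level
measurement loop mentioned above: read a block from disk, push it out
a UDP socket, and timestamp each cycle with gettimeofday(2).  The file
name, block size, destination address, and port are placeholders, not
the actual test setup.

	/*
	 * Sketch: measure per-cycle latency of read(2) + sendto(2).
	 * All names and constants here are placeholders.
	 */
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>
	#include <fcntl.h>
	#include <sys/types.h>
	#include <sys/time.h>
	#include <sys/socket.h>
	#include <netinet/in.h>
	#include <arpa/inet.h>

	#define BLKSZ 8192	/* placeholder block size */

	int
	main(void)
	{
		char buf[BLKSZ];
		struct sockaddr_in sin;
		struct timeval t0, t1;
		ssize_t n;
		long usec;
		int fd, s;

		if ((fd = open("/tmp/movie.mpg", O_RDONLY)) < 0) {
			perror("open");
			exit(1);
		}
		if ((s = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
			perror("socket");
			exit(1);
		}
		memset(&sin, 0, sizeof(sin));
		sin.sin_family = AF_INET;
		sin.sin_port = htons(9999);			/* placeholder port */
		sin.sin_addr.s_addr = inet_addr("10.0.0.2");	/* placeholder host */

		for (;;) {
			gettimeofday(&t0, NULL);
			if ((n = read(fd, buf, sizeof(buf))) <= 0)
				break;
			if (sendto(s, buf, n, 0,
			    (struct sockaddr *)&sin, sizeof(sin)) < 0) {
				perror("sendto");
				break;
			}
			gettimeofday(&t1, NULL);
			usec = (t1.tv_sec - t0.tv_sec) * 1000000 +
			    (t1.tv_usec - t0.tv_usec);
			printf("%ld\n", usec);		/* one latency sample */
		}
		close(fd);
		close(s);
		return (0);
	}

Looking at the distribution of the printed samples gives the
latency/jitter picture at the syscall boundary; everything below that
boundary is what the kernel profiling is supposed to break down.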
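And a toy version of the "count buckets" mechanism Terry describes, to
make the sampling idea concrete: the profiled text range is divided
into fixed-size buckets, and the profiling clock interrupt increments
the bucket covering the interrupted PC.  The addresses, bucket size,
and function name are assumptions for illustration only; the real
machinery lives in the kernel's profiling clock handler and is driven
through kgmon(8).

	/*
	 * Toy model of PC-sampling profile buckets; all constants are
	 * illustrative, not the kernel's actual values.
	 */
	#define TEXT_BASE	0x00100000UL	/* assumed start of profiled text */
	#define TEXT_SIZE	0x00200000UL	/* assumed size of profiled text */
	#define BUCKET_SHIFT	4		/* 16 bytes of text per bucket */
	#define NBUCKETS	(TEXT_SIZE >> BUCKET_SHIFT)

	static unsigned long kcount[NBUCKETS];

	/*
	 * Imagine this called from the profiling clock interrupt with
	 * the PC that was interrupted.  Each tick adds one to the
	 * bucket containing that PC; over many ticks the histogram
	 * approximates where CPU time is spent.  A tick landing in a
	 * function stalled on a cache miss looks identical to one
	 * landing there during ordinary execution, which is exactly
	 * the cache/non-cache limitation noted above.
	 */
	void
	profile_tick(unsigned long pc)
	{
		if (pc >= TEXT_BASE && pc < TEXT_BASE + TEXT_SIZE)
			kcount[(pc - TEXT_BASE) >> BUCKET_SHIFT]++;
	}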