In-Reply-To: <20070615000320.GA94458@rot13.obsecurity.org>
References: <20070614084817.GA81087@rot13.obsecurity.org>
	<449EAA15-A4BC-4AAE-B3ED-B65E7A079877@mac.com>
	<20070615000320.GA94458@rot13.obsecurity.org>
Mime-Version: 1.0 (Apple Message framework v752.2)
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
Message-Id: <7A845D91-435E-4F1C-A05A-270A04DAC20E@mac.com>
Content-Transfer-Encoding: 7bit
From: Chuck Swiger <cswiger@mac.com>
Date: Thu, 14 Jun 2007 17:26:05 -0700
To: Kris Kennaway <kris@obsecurity.org>
Cc: smp@FreeBSD.org, performance@FreeBSD.org, current@FreeBSD.org
Subject: Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0
List-Id: Performance/tuning <freebsd-performance.freebsd.org>

On Jun 14, 2007, at 5:03 PM, Kris Kennaway wrote:
>> It's at least arguable that doing queries against a data set
>> including a bunch of repeats is "skewed" in a more realistic
>> fashion. :-)  A quick look at some of the data sources I have handy
>> such as http access logs or Squid proxy logs suggests that (for
>> example) out of a database of 17+ million requests, there were only
>> 46000 unique IPs involved.
>
> There were still lots of repeats, just some of them were repeated
> hundreds of thousands of times - I stripped about a dozen of those
> (googlebots, I'm looking at you ;-), leaving a distribution that was
> less biased to the top end.

Heh, yes, it's surprising how happy a webspider is to crawl around a  
heavily-interlinked site.  :-)

Perhaps someone ought to add a:

   Crawl-delay: 600

...statement to http://www.freebsd.org/robots.txt...?
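As a rough illustration of how a well-behaved crawler would pick up such a directive, here is a minimal Python sketch using the standard library's urllib.robotparser.  The robots.txt content and the "ExampleBot" user agent below are hypothetical (this is not the actual freebsd.org file), and Crawl-delay is a non-standard extension that not every crawler honors.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only;
# this is NOT the real http://www.freebsd.org/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 600
"""


def crawl_delay_for(robots_txt, user_agent="*"):
    """Return the Crawl-delay (in seconds) that applies to user_agent,
    or None if the file does not specify one."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    # "ExampleBot" is a made-up crawler name; it falls under the "*" entry.
    print(crawl_delay_for(ROBOTS_TXT, "ExampleBot"))
```

A crawler that respects the directive would then sleep that many seconds between successive requests to the site.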

>> You might find it interesting to compare doing queries against your
>> raw and filtered datasets, just to see what kind of difference you
>> get, if any.
>
> Cached queries perform much better, as you might expect.  As an
> estimate I was getting query rates exceeding 120000 qps when serving
> entirely out of cache, and I don't think I had reached the upper bound yet.

Sure, anything cached or anything the nameserver is authoritative for  
is going to be directly answerable without having to do an external  
recursive query.

>> What was the external network connectivity in terms of speed?  The
>> docs suggest you need something like 16 Mb/s up / 8 Mb/s down of
>> connectivity in order to get up to 50K requests/sec....
>
> I wasn't seeing anything close to this, so I guess it depends how much
> data is being returned by the queries (I was doing PTR lookups).  I
> forget the exact numbers but it wasn't exceeding about 10Mbit in both
> directions, which should have been well within link capacity.  Also
> the lock profiling data bears out the interpretation that it was BIND
> that was becoming saturated and not the hardware.

OK, thanks for the info.  Maybe I'll get a chance to run some numbers  
of my own, if I can free up some time from WWDC....
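For a sense of scale, the bandwidth figures quoted above can be turned into a per-query byte budget.  This is only back-of-the-envelope arithmetic, assuming the figures are megabits per second (1 Mbit = 1,000,000 bits) and taking the 50K requests/sec rate at face value; the function name is made up for the example.

```python
def bytes_per_query(link_mbit_per_s, queries_per_s):
    """Average bytes available per query on a link of the given speed.

    Assumes 1 Mbit = 1,000,000 bits and ignores any protocol overhead
    beyond what is already counted in the link figure.
    """
    bits_per_s = link_mbit_per_s * 1_000_000
    return bits_per_s / 8 / queries_per_s


if __name__ == "__main__":
    # Figures from the quoted text: 16 up / 8 down at 50K requests/sec.
    print(bytes_per_query(16, 50_000))
    print(bytes_per_query(8, 50_000))
```

Since the budget scales inversely with the query rate, the same link supports proportionally more traffic per query at lower rates, which fits the observation that the bandwidth needed depends on how much data each query returns.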

>> [ ... ]
>>> It would be interesting to test BIND performance when acting as an
>>> authoritative server, which probably has very different performance
>>> characteristics; the difficulty there is getting access to a suitably
>>> interesting and representative zone file and query data.
>>
>> I suppose you could also set up a test nameserver which claims to be
>> authoritative for all of in-addr.arpa, and set up a bunch (65K?) /16
>> reverse zone files, and then test against real unmodified IPs, but it
>> would be easier to do something like this:
>>
>> Set up a nameserver which is authoritative for 1.10.in-addr.arpa (ie,
>> the reverse zone for 10.1/16), and use a zonefile with the $GENERATE
>> directive to populate your PTR records:
>>
>> [ ...zonefile snipped for brevity... ]
>>
>> ...and then feed it a query database consisting of PTR lookups.  If
>> you wanted to, you could take your existing IP database, and glue the
>> last two octets of the real IPs onto 10.1 to produce a reasonable
>> assortment of IPs to perform a reverse lookup upon.
>
> I could construct something like this but I'd prefer a more
> "realistic" workload (i.e. an uneven distribution of queries against
> different subsets of the data).  I don't have a good idea what
> "realistic" means here, which makes it hard to construct one from
> scratch.  Fortunately I have an offer from someone for access to a
> real large zone file and a large sample of queries.

Ah, very good, then.

While I expect there to be quite a difference between recursive  
queries vs. authoritative/locally answerable queries (after all, that  
seems to be why both dnsperf and resperf were created as distinct  
programs), I'm not convinced that there is too much difference  
between doing reverse lookups for one set of IPs versus another if  
those IPs are all in zones the server is authoritative for.
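For what it's worth, the "glue the last two octets onto 10.1" idea mentioned earlier could be sketched roughly as below.  This is a minimal sketch, not a tested benchmark tool: the function names are made up, the sample addresses are documentation-range placeholders, and the "name PTR" one-query-per-line output is an assumption about the input format the query tool expects.  Repeated source IPs stay repeated, so whatever skew the original data has is preserved.

```python
import ipaddress


def to_test_ptr_name(real_ip, prefix=(10, 1)):
    """Map a real IPv4 address onto prefix[0].prefix[1].x.y, keeping the
    last two octets, and return the matching in-addr.arpa name."""
    octets = ipaddress.IPv4Address(real_ip).packed
    third, fourth = octets[2], octets[3]
    # Reverse-lookup names list the octets in reverse order.
    return f"{fourth}.{third}.{prefix[1]}.{prefix[0]}.in-addr.arpa"


def query_lines(real_ips):
    """Yield one 'name PTR' line per input IP (assumed query-file format).
    Duplicates in the input are kept, preserving the query distribution."""
    for ip in real_ips:
        yield f"{to_test_ptr_name(ip)} PTR"


if __name__ == "__main__":
    # Placeholder addresses standing in for an existing IP database.
    sample = ["192.0.2.77", "198.51.100.9", "192.0.2.77"]
    for line in query_lines(sample):
        print(line)
```

Note that different real IPs sharing the same last two octets collapse onto the same test address, so the mapped set has at most 65536 distinct names.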

-- 
-Chuck