From owner-freebsd-current Mon Sep 22 10:55:37 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id KAA04399 for current-outgoing; Mon, 22 Sep 1997 10:55:37 -0700 (PDT)
Received: from pluto.plutotech.com (root@mail.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id KAA04393 for ; Mon, 22 Sep 1997 10:55:34 -0700 (PDT)
Received: from narnia.plutotech.com (narnia.plutotech.com [206.168.67.130]) by pluto.plutotech.com (8.8.5/8.8.5) with ESMTP id LAA25129; Mon, 22 Sep 1997 11:55:22 -0600 (MDT)
Message-Id: <199709221755.LAA25129@pluto.plutotech.com>
X-Mailer: exmh version 2.0zeta 7/24/97
To: Nate Williams
cc: "Justin T. Gibbs" , Bruce Evans , current@freebsd.org
Subject: Re: cvs commit: src/sys/conf files src/sys/dev/vx if_vx.c if_vxreg.h src/sys/i386/apm apm.c src/sys/i386/conf GENERIC files.i386 src/sys/i386/eisa 3c5x9.c aha1742.c aic7770.c bt74x.c eisaconf.c eisaconf.h if_fea.c if_vx_eisa.c src/sys/i386/i386 autoconf.c ...
In-reply-to: Your message of "Mon, 22 Sep 1997 10:30:40 MDT." <199709221630.KAA01072@rocky.mt.sri.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Mon, 22 Sep 1997 11:55:12 -0600
From: "Justin T. Gibbs"
Sender: owner-freebsd-current@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>Except that processing a tick now runs O(n), where n is the # of
>elements in the callout wheel (which gets pretty big when you have 630
>callouts in your example above, since the initial size of the callout
>wheel is set), so I still don't see a big win.  I'm not saying that it's
>any worse/better than the existing scheme, in fact I don't see a win.

Let's be more precise.

n = number of active callouts
h = length of a hash chain in a call wheel bucket
x = number of callouts expired on a given tick.
Old implementation:

	timeout   = O(n)
	untimeout = O(n)
	softclock = O(x) ~= O(1)

New implementation:

	timeout   = O(1)
	untimeout = O(1)
	softclock = O(h)

Now, depending on how you size your callwheel, h << n.  With a MAXUSERS
of 32, the callwheel is 256 entries big if I recall correctly.

But running time isn't the only thing to consider.  As I mentioned
before, timeout/untimeout are often called from an interrupt context,
and the old algorithm caused an indeterminate delay in this scenario,
potentially causing problems for that device and any that share the
same interrupt.  You also have to consider that timeout/untimeout calls
occur at indeterminate rates, but softclock runs at a fixed rate,
meaning that the amount of work it performs scales better than if that
work were pushed off into either timeout or untimeout.

>Finally, you'll notice that I'm not proposing that we rip out the new
>solution (since I have nothing better to propose, although the original
>solution could be sped up by adding the setcallout()/unsetcallout()
>additions w/out the call wheel).

That only fixes untimeout.

>I just want to understand *why* it's
>making things better for the average user, since the average user
>doesn't have > 10 drives running tags.  Penalizing the 'extreme' user
>doesn't seem to be a big win in my book.

The typical user will have perhaps 30 callouts outstanding.  In this
scenario, both algorithms work fairly well.  Although PHK's BB profiling
did show that the new implementation was faster, he didn't perceive a
difference in performance.  Since he was using IDE devices, he had even
fewer timeouts active than if he were using SCSI disks.  The main
difference between the two approaches is that one algorithm scales
better than the other.

>In summary, all of the operations are called 'often'.
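To make the comparison concrete, here is a minimal sketch of a hashed
timing wheel along the lines discussed above.  All names here are
illustrative, not the actual FreeBSD kernel interfaces: timeout hashes
on the expiry tick and pushes onto a bucket in O(1), untimeout unlinks
in O(1) via a back pointer, and softclock walks only one bucket per
tick, i.e. O(h).

```c
#include <stdlib.h>

#define CALLWHEEL_SIZE 256              /* power of two; mask instead of mod */
#define CALLWHEEL_MASK (CALLWHEEL_SIZE - 1)

struct callout {
	struct callout	 *next;		/* linked so untimeout needs no search */
	struct callout	**prevp;	/* back pointer for O(1) unlink */
	unsigned long	  expire;	/* absolute tick of expiry */
	void		(*func)(void *);
	void		 *arg;
};

static struct callout *callwheel[CALLWHEEL_SIZE];
static unsigned long   ticks;		/* current tick count */

/* timeout: O(1) -- hash on expiry tick, push onto that bucket. */
static void
timeout_sketch(struct callout *c, void (*func)(void *), void *arg,
    unsigned long delta)
{
	struct callout **bucket;

	c->func = func;
	c->arg = arg;
	c->expire = ticks + delta;
	bucket = &callwheel[c->expire & CALLWHEEL_MASK];
	c->next = *bucket;
	if (c->next != NULL)
		c->next->prevp = &c->next;
	c->prevp = bucket;
	*bucket = c;
}

/* untimeout: O(1) -- unlink via the back pointer, no list traversal. */
static void
untimeout_sketch(struct callout *c)
{
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
}

/* softclock: O(h) -- walk one bucket, firing entries due on this tick. */
static void
softclock_sketch(void)
{
	struct callout *c, *next;

	ticks++;
	for (c = callwheel[ticks & CALLWHEEL_MASK]; c != NULL; c = next) {
		next = c->next;
		if (c->expire == ticks) {  /* skip entries due a lap later */
			untimeout_sketch(c);
			c->func(c->arg);
		}
	}
}

/* Example callback (illustrative): just counts invocations. */
static int fired_count;
static void
fire_counter(void *arg)
{
	(void)arg;
	fired_count++;
}
```

Note that entries whose expiry hashes to the same bucket but lies a
full wheel revolution away stay on the chain; that is exactly the 'h'
that softclock pays for per tick.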
>
>1) clock ticks
>2) registering timeouts
>3) de-registering timeouts
>
>I would like to see some #'s that show us the # & amount of time it
>takes in each section on a 'normal' system, using both approaches.  If
>someone has some good ideas how to do that, I'm willing to learn
>something and go do it.

Allocate an array of ints of size ncallout.  In softclock, increment
the array entry corresponding to the number of entries traversed in a
softclock run.  By periodically scanning this array, you'll get a good
idea of the value of 'h' for your system.  Add three more ints that
count the number of invocations of softclock, untimeout, and timeout,
and you should be able to draw conclusions from that.

You may also want to look at the other paper I mention in the timeout.9
man page.  It may have some of the statistics you want, but I don't
know for sure that they ever implemented their schemes in a real
system.

>Nate

--
Justin T. Gibbs
===========================================
  FreeBSD: Turning PCs into workstations
===========================================
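[The instrumentation suggested above could be sketched as follows.
All names are illustrative, not actual kernel code: a histogram of
chain lengths seen per softclock run, plus the three invocation
counters, from which the average 'h' falls out directly.]

```c
#define NCALLOUT 256			/* size this to your ncallout */

/* [i] = number of softclock runs that traversed i entries */
static unsigned int chain_histogram[NCALLOUT];
static unsigned int softclock_calls;
static unsigned int timeout_calls;	/* bump in timeout() (not shown) */
static unsigned int untimeout_calls;	/* bump in untimeout() (not shown) */

/* Call from softclock with the number of entries walked this run. */
static void
record_softclock(unsigned int traversed)
{
	softclock_calls++;
	if (traversed < NCALLOUT)
		chain_histogram[traversed]++;
}

/* Average chain length h = total entries traversed / softclock runs. */
static double
average_h(void)
{
	unsigned long total = 0;
	unsigned int i;

	for (i = 0; i < NCALLOUT; i++)
		total += (unsigned long)i * chain_histogram[i];
	return (softclock_calls ? (double)total / softclock_calls : 0.0);
}
```

Periodically dumping chain_histogram[] and the three counters would
give both the distribution of 'h' and the relative call rates of the
three operations on a 'normal' system.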