Date:      Thu, 17 Dec 2009 19:10:25 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        Harti Brandt <harti@FreeBSD.org>, John Baldwin <jhb@FreeBSD.org>, freebsd-arch@FreeBSD.org
Subject:   Re: network statistics in SMP
Message-ID:  <20091217181553.Q36492@delplex.bde.org>
In-Reply-To: <20091217021211.O35780@delplex.bde.org>
References:  <20091215103759.P97203@beagle.kn.op.dlr.de> <200912150812.35521.jhb@freebsd.org> <20091215183859.S53283@beagle.kn.op.dlr.de> <200912151313.28326.jhb@freebsd.org> <20091217021211.O35780@delplex.bde.org>

On Thu, 17 Dec 2009, Bruce Evans wrote:

> ...
> Actually, you can do better with a generation count.  The generation count
> would at least tell you if you lost a race.  The generation count should
> only be maintained while summing other counts, since it must be global and
> incremented by atomic ops (to avoid the races without even more costly
> locking which would make the generation count irrelevant) so maintaining
> it all the time would more than defeat the point of having per-CPU counters
> (all CPUs would compete for it at the same address).  ...

Actually3, the generation count can be per-CPU and accessed without atomic
ops (provided reads of it on other CPUs return a consistent possibly-stale
value).

> Simple version:
> - bloat PCPU_INC(var) to do something like the following:
> 	if (PCPU_GET(counter_summing_mode))
> 		atomic_add_int(&counter_gen, 1);
> 	OLD_PCPU_INC(var);
> - set PCPU_GET(counter_summing_mode) while summing.  Needs heavyweight
>  synchronization (IPIs?) to set and clear the flag on other CPUs.  Must
>  also make all other CPUs flush pending writes (so that a 64-bit counter
>  cannot be half-written at the beginning of the summing), but this will
>  happen automatically with any heavyweight synchronization.
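
For illustration, the quoted simple version might look roughly like the sketch below in portable C11. This is a single-process model, not kernel code: the arrays indexed by CPU stand in for per-CPU data, and simple_pcpu_inc(), set_summing_mode() and NCPU are hypothetical names. A real kernel would need IPIs or similar to set the flag on other CPUs.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPU 4	/* hypothetical CPU count for this model */

static atomic_uint counter_gen;			/* global, shared by all CPUs */
static bool counter_summing_mode[NCPU];		/* stand-in for a per-CPU flag */
static uint64_t var[NCPU];			/* stand-in for a per-CPU counter */

/* Model of the bloated PCPU_INC(var) from the simple version. */
static void
simple_pcpu_inc(int cpu)
{
	/* Pay for the contended atomic op only while a sum is in progress. */
	if (counter_summing_mode[cpu])
		atomic_fetch_add(&counter_gen, 1);
	var[cpu]++;
}

/*
 * Model of setting or clearing the flag on every CPU.  In a kernel this
 * is the step that needs heavyweight synchronization; here it is a loop.
 */
static void
set_summing_mode(bool on)
{
	for (int cpu = 0; cpu < NCPU; cpu++)
		counter_summing_mode[cpu] = on;
}
```

Outside summing mode the increment touches only per-CPU data, which is the point of keeping the generation count conditional.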

Better version:
- bloat PCPU_INC(var) to do something like the following:
 	OLD_PCPU_INC(counter_gen);
 	OLD_PCPU_INC(var);
- sum all PCPU_GET(counter_gen) before summing the subset of ordinary
   counters of interest.  This gives a value <= the unracy current sum
   of the generation counters, by reading consistent possibly-stale
   values.

   Then sync all counters as in the simple version.  Note that the order
   of the above increments would be backwards if we used write ordering
   instead of a full sync: with only write ordering, the sum of the
   generation counts would be too high here if we happened to read it on
   one of the CPUs between the two increments.  I chose this order
   because I don't want 2 increments of counter_gen in the above, or
   further complications and bloat.  There must be some order, and this
   one works right later.

   Then sum selected ordinary counters.

   Then sync the generation counters (or all counters, or arrange for
   write ordering) as above.

   Then sum the generation counters.  This gives a value >= the unracy
   current sum at the end of summing the selected counters.
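
The steps above can be modelled in a small single-threaded C sketch. Everything here is an assumption made for illustration: struct pcpu_sim, sim_pcpu_inc(), sim_sync_all(), sim_racer() and sum_checked() are hypothetical names, the sync is a no-op because there is no real concurrency, and the final comparison of the two generation sums is one reading of how the count "would at least tell you if you lost a race".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCPU 4	/* hypothetical CPU count for this model */

/* Stand-in for per-CPU data; a kernel would keep this in each CPU's pcpu area. */
struct pcpu_sim {
	uint64_t counter_gen;	/* bumped before every counter increment */
	uint64_t var;		/* an ordinary statistics counter */
};

static struct pcpu_sim pcpu[NCPU];

/* Model of the bloated PCPU_INC(var): generation first, then the counter. */
static void
sim_pcpu_inc(int cpu)
{
	pcpu[cpu].counter_gen++;
	pcpu[cpu].var++;
}

/* Placeholder for the heavyweight sync (IPIs or similar); nothing to do here. */
static void
sim_sync_all(void)
{
}

static uint64_t
sum_gen(void)
{
	uint64_t sum = 0;

	for (int cpu = 0; cpu < NCPU; cpu++)
		sum += pcpu[cpu].counter_gen;
	return (sum);
}

static uint64_t
sum_var(void)
{
	uint64_t sum = 0;

	for (int cpu = 0; cpu < NCPU; cpu++)
		sum += pcpu[cpu].var;
	return (sum);
}

/* Simulated activity on another CPU, used to provoke a lost race. */
static void
sim_racer(void)
{
	sim_pcpu_inc(2);
}

/*
 * Sum var into *sump.  Returns true if the generation sums taken before
 * and after agree, i.e. no increment was observed while summing.
 * racer, if non-NULL, runs after the counter sum to simulate a race.
 */
static bool
sum_checked(uint64_t *sump, void (*racer)(void))
{
	uint64_t gen_before, gen_after;

	gen_before = sum_gen();		/* <= true generation sum */
	sim_sync_all();
	*sump = sum_var();
	if (racer != NULL)
		racer();
	sim_sync_all();
	gen_after = sum_gen();		/* >= true sum at end of summing */
	assert(gen_after >= gen_before);
	return (gen_after == gen_before);
}
```

A caller could retry while sum_checked() returns false, or simply report that the sum may be inconsistent. The increment path itself stays free of atomic ops and shared cache lines.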

Bruce


