Date: Thu, 17 Dec 2009 19:10:25 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Harti Brandt <harti@FreeBSD.org>, John Baldwin <jhb@FreeBSD.org>, freebsd-arch@FreeBSD.org
Subject: Re: network statistics in SMP
Message-ID: <20091217181553.Q36492@delplex.bde.org>
In-Reply-To: <20091217021211.O35780@delplex.bde.org>
References: <20091215103759.P97203@beagle.kn.op.dlr.de>
    <200912150812.35521.jhb@freebsd.org>
    <20091215183859.S53283@beagle.kn.op.dlr.de>
    <200912151313.28326.jhb@freebsd.org>
    <20091217021211.O35780@delplex.bde.org>

On Thu, 17 Dec 2009, Bruce Evans wrote:

> ...
> Actually, you can do better with a generation count.  The generation count
> would at least tell you if you lost a race.  The generation count should
> only be maintained while summing other counts, since it must be global and
> incremented by atomic ops (to avoid the races without even more costly
> locking which would make the generation count irrelevant) so maintaining
> it all the time would more than defeat the point of having per-CPU counters
> (all CPUs would compete for it at the same address).  ...

Actually3, the generation count can be per-CPU and accessed without
atomic ops (provided reads of it on other CPUs return a consistent
possibly-stale value).

> Simple version:
> - bloat PCPU_INC(var) to do something like the following:
>       if (PCPU_GET(counter_summing_mode))
>           atomic_add_int(&counter_gen, 1);
>       OLD_PCPU_INC(var);
> - set PCPU_GET(counter_summing_mode) while summing.  Needs heavyweight
>   synchronization (IPIs?) to set and clear the flag on other CPUs.  Must
>   also make all other CPUs flush pending writes (so that a 64-bit counter
>   cannot be half-written at the beginning of the summing), but this will
>   happen automatically with any heavyweight synchronization.

Better version:
- bloat PCPU_INC(var) to do something like the following:
      OLD_PCPU_INC(counter_gen);
      OLD_PCPU_INC(var);
- sum all PCPU_GET(counter_gen) before summing the subset of ordinary
  counters of interest.  This gives a value <= the unracy current sum of
  the generation counters, by reading consistent possibly-stale values.
  Then sync all counters as above.  Note that the order of the above
  increments would be backwards if we used write ordering instead of
  a full sync -- with only write ordering the sum of the generation
  counts would be too high here if we happened to read it on 1 of the
  CPUs in between the above increments.  This order is chosen since I
  don't want to have 2 increments of counter_gen in the above and/or
  further complications and bloat, so there must be some order, and
  the above order works right later.
  Then sum selected ordinary counters.
  Then sync the generation counters (or all counters, or arrange for write
  ordering) as above.
  Then sum the generation counters.  This gives a value >= the unracy
  current sum at the end of summing the selected counters.

Bruce
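
The "better version" above can be sketched in user-space C roughly as follows.  This is only an illustration under stated assumptions, not kernel code: NCPU, struct pcpu_counters, sync_all_cpus(), pcpu_inc_var(), sum_gen(), sum_var() and racing_increment() are hypothetical names invented here, the sync is a no-op because the sketch is single-threaded, and a callback stands in for increments that race with the summing.  Comparing the two generation sums for equality is one reading of how the lower and upper bounds would be used to detect a lost race.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCPU 4	/* hypothetical CPU count for this sketch */

/* Hypothetical per-CPU storage; a kernel would keep this in its per-CPU area. */
struct pcpu_counters {
	uint64_t counter_gen;	/* bumped before every ordinary increment */
	uint64_t var;		/* one ordinary statistics counter */
};

static struct pcpu_counters pcpu[NCPU];

/*
 * Stand-in for the heavyweight synchronization (IPIs?) from the message.
 * A no-op here because the sketch runs on a single thread.
 */
static void
sync_all_cpus(void)
{
}

/* The "bloated" increment: generation count first, then the counter. */
static void
pcpu_inc_var(unsigned cpu)
{
	assert(cpu < NCPU);
	pcpu[cpu].counter_gen++;
	pcpu[cpu].var++;
}

/* Sum the per-CPU generation counts (possibly-stale reads in real life). */
static uint64_t
sum_gen(void)
{
	uint64_t total = 0;

	for (unsigned cpu = 0; cpu < NCPU; cpu++)
		total += pcpu[cpu].counter_gen;
	return (total);
}

/*
 * Sum 'var' over all CPUs.  Returns true if the generation sums taken
 * before and after agree, i.e. no increment was seen during the summing.
 * 'during' is a hypothetical hook that simulates a racing increment.
 */
static bool
sum_var(uint64_t *sum, void (*during)(void))
{
	uint64_t gen_before, gen_after, total = 0;

	gen_before = sum_gen();		/* <= unracy sum at this point */
	sync_all_cpus();
	for (unsigned cpu = 0; cpu < NCPU; cpu++)
		total += pcpu[cpu].var;
	if (during != NULL)
		during();		/* simulated activity on another CPU */
	sync_all_cpus();
	gen_after = sum_gen();		/* >= unracy sum at end of summing */

	*sum = total;
	return (gen_before == gen_after);
}

/* Simulated racing update on CPU 2, for demonstration only. */
static void
racing_increment(void)
{
	pcpu_inc_var(2);
}
```

A caller would presumably retry sum_var() while it returns false, or report the sum as approximate.  Passing racing_increment as the hook makes the two generation sums differ, so the lost race is detected even though the ordinary sum itself looks plausible.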