Date: Tue, 16 Mar 2004 12:31:42 +0100 (CET)
From: Harti Brandt <brandt@fokus.fraunhofer.de>
To: Brooks Davis <brooks@one-eyed-alien.net>
Cc: Max Laier <max@love2party.net>
Subject: Re: Byte counters reset at ~4GB
Message-ID: <20040316122840.E28777@beagle.fokus.fraunhofer.de>
In-Reply-To: <20040316014206.GA12382@Odin.AC.HMC.Edu>
References: <2650.192.168.0.200.1079393908.squirrel@192.168.0.1> <2662.192.168.0.200.1079396323.squirrel@192.168.0.1> <2697.192.168.0.200.1079398101.squirrel@192.168.0.1> <20040316014206.GA12382@Odin.AC.HMC.Edu>

On Mon, 15 Mar 2004, Brooks Davis wrote:

BD>On Mon, Mar 15, 2004 at 07:48:21PM -0500, Mike Jakubik wrote:
BD>> Max Laier said:
BD>>
BD>> > Sure, you measure it ;) ... no, of course it is more expensive to update a
BD>> > 64bit counter on a 32bit arch, but the key (once again) is decision:
BD>> > While (almost) all of the pf counters are 64bit types you can configure it
BD>> > not to use the loginterface or whatsoever more. So it's up to you: You need
BD>> > 64bit counters? You shall have them! You need *fast* 64bit counters? AMD
BD>> > sells nice processors (they say)! ... you get the idea.
BD>>
BD>> Got it. I'm just curious though... realistically, how big of an impact on
BD>> performance is this on a modern CPU? Is it not simply the original 32bit
BD>> calculation x 2?
BD>
BD>No, you have to do overflow handling so that adds some to the cost.
BD>
BD>I was curious what the actual overhead was so I ran the following
BD>program with both uint32_t and uint64_t counters. With 64-bit counters,
BD>it was a bit over four times slower on the dual 2.2GHz Xeon (~2sec vs
BD>~8.4sec). On a dual opteron, the 32-bit math had a slight edge, but
BD>not much. Interestingly, runtime was longer than on the Xeon (~3.1s for
BD>32-bit and ~3.8 for 64-bit.)
BD>
BD>If you do this test, be sure not to use any optimizer flags or the whole
BD>loop gets optimized out.
BD>
BD>-- Brooks
BD>
BD>#include <stdio.h>
BD>#include <stdint.h>
BD>
BD>int
BD>main (int argc, char **argv)
BD>{
BD>	uint32_t j = 0;	/* change to uint64_t for the 64-bit run */
BD>
BD>	for (j = 0; j < 1000000000; j++) {}
BD>	/* %ju with a uintmax_t cast prints either counter width correctly */
BD>	printf("%ju\n", (uintmax_t)j);
BD>	return 0;
BD>}

Isn't the actual problem the required atomicity? While on 32-bit
architectures you can increment a 32-bit value without taking a lock, you
need a lock to increment a 64-bit value.

harti
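
To make the two costs discussed above concrete, here is a minimal sketch in C11. It is not taken from pf or the FreeBSD kernel; the names split_counter, split_add, split_read, byte_counter, atomic_add_bytes and counter_is_lock_free are all hypothetical. The first half shows the carry ("overflow") handling that a 64-bit add needs when built from 32-bit words, and why a concurrent reader could see a torn value. The second half uses <stdatomic.h> so that the toolchain picks a lock-free instruction or a lock, depending on the target.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit counter stored as two 32-bit words, roughly what a
 * 32-bit CPU has to work with. */
struct split_counter {
	uint32_t lo;
	uint32_t hi;
};

/* Add n to the counter. This is two separate stores: a reader running
 * between them can see the new low word paired with the old high word,
 * which is why some form of locking or atomic operation is needed. */
void
split_add(struct split_counter *c, uint32_t n)
{
	uint32_t old = c->lo;

	c->lo += n;		/* wraps modulo 2^32 */
	if (c->lo < old)	/* wrapped, so carry into the high word */
		c->hi++;
}

uint64_t
split_read(const struct split_counter *c)
{
	return ((uint64_t)c->hi << 32) | c->lo;
}

/* Same counter as a C11 atomic. Assumption: on some 32-bit targets this
 * compiles to a compare-and-swap loop or a libatomic call rather than a
 * single add instruction. */
_Atomic uint64_t byte_counter;

/* Atomically add n and return the new total. */
uint64_t
atomic_add_bytes(uint64_t n)
{
	return atomic_fetch_add(&byte_counter, n) + n;
}

/* Nonzero if the implementation updates the counter without a lock. */
int
counter_is_lock_free(void)
{
	return atomic_is_lock_free(&byte_counter);
}
```

Calling counter_is_lock_free() on a given machine is one way to check whether the 64-bit update there needs a lock at all; the answer depends on the architecture and compiler flags, so treat it as something to measure rather than assume.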