Date: Sun, 16 Nov 2003 12:48:14 +1100
From: Peter Jeremy <peterjeremy@optushome.com.au>
To: Andre Oppermann <oppermann@pipeline.ch>
Cc: cvs-all@freebsd.org
Subject: Re: cvs commit: src/sys/netinet in_var.h ip_fastfwd.c ip_flow.c ip_flow.h ip_input.c ip_output.c src/sys/sys mbuf.h src/sys/conf files src/sys/net if_arcsubr.c if_ef.c if_ethersubr.c if_fddisubr.c if_iso88025subr.c if_ppp.c
Message-ID: <20031116014814.GB74756@server.vk2pj.dyndns.org>
In-Reply-To: <3FB60181.4256A519@pipeline.ch>
References: <200311142102.hAEL2Nen073186@repoman.freebsd.org> <20031114153145.A54064@xorpc.icir.org> <3FB593F5.1053E7E2@pipeline.ch> <20031115002921.B68056@xorpc.icir.org> <3FB60181.4256A519@pipeline.ch>
On Sat, Nov 15, 2003 at 11:35:45AM +0100, Andre Oppermann wrote:
> To put this more into perspective wrt counter wrapping, on
> my interfaces I have a byte counter wrap every 40 minutes or so.
> So the true ratio is probably even far less than one percent and
> more in the region of one per mille. The wrapping looks really ugly
> on MRTG and RRDtool graphs. Interface counters should be 64-bit or
> they become useless with today's traffic levels...

A perennial favourite. Atomically incremented 64-bit counters are
_very_ expensive on i386 and the consensus is that the cost is
unjustified in the general case. Feel free to supply patches to
optionally (at build time) allow selection of 32-bit or 64-bit
counters.

A work-around would be to simulate the top 32 bits by counting
rollovers in the bottom 32 bits (though this requires co-operation
from all consumers that want to see 64-bit values, as well as a
background process). I notice that even DEC/Compaq/HP Tru64 uses
32-bit counters for network stats.

>> i am pretty sure that in any non-trivial case you will end up having
>> both the slow path and the fast path conflicting for the instruction
>> cache. Merging them might help -- i have seen many cases where
>> inlining code as opposed to explicit function calls makes things
>> slower for this precise reason.
>
> I will try to measure that with more precision. You did have
> code which was able to record and timestamp events several
> thousand times per second. Do you still have that code somewhere?
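[A minimal sketch of the rollover work-around mentioned above: a
background task re-reads the wrapping 32-bit counter often enough
(at least once per wrap interval, every ~40 minutes at the traffic
levels Andre describes) and accumulates into a 64-bit total. The
struct and function names are made up for illustration; unsigned
32-bit subtraction handles a single wrap for free.]

	#include <stdint.h>
	#include <stdio.h>

	struct counter64 {
		uint32_t last;   /* last raw 32-bit reading */
		uint64_t total;  /* accumulated 64-bit value */
	};

	/* Call at least once per wrap interval of the raw counter. */
	static uint64_t
	extend_counter(struct counter64 *c, uint32_t raw)
	{
		/* unsigned subtraction is correct across one wrap */
		c->total += (uint32_t)(raw - c->last);
		c->last = raw;
		return (c->total);
	}

	int
	main(void)
	{
		struct counter64 c = { 0, 0 };
		uint64_t v;

		extend_counter(&c, 4000000000U);	/* near the 32-bit limit */
		v = extend_counter(&c, 100);		/* raw counter has wrapped */
		printf("%llu\n", (unsigned long long)v);
		return (0);
	}

The consumer co-operation requirement is exactly this: anyone who
wants the 64-bit view has to read the software total, not the raw
register, and miss no more than one wrap between samples.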
I've done similar things a couple of times using circular buffers
along the following lines:

	#define RING_SIZE (1 << some_suitable_value)

	int next_entry;
	struct entry {
		some_time_t	now;
		foo_t		event;
	} ring[RING_SIZE];

	void __inline
	insert_event(foo_t event)
	{
		int ix;

		/* following two lines need to be atomic to make this re-entrant */
		ix = next_entry;
		next_entry = (ix + 1) & (RING_SIZE - 1);
		ring[ix].now = read_time();
		ring[ix].event = event;
	}

In userland, mmap(2) next_entry and ring to unload the events. Pick
RING_SIZE and the time types to suit requirements. The TSC has the
lowest overhead but worst jitter.

Peter
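[For what it's worth, the "two lines need to be atomic" comment can
be satisfied with a single atomic fetch-and-add on the index; the
mask then keeps the slot number inside the ring. The sketch below
uses C11 atomics, which is an anachronism for 2003-era FreeBSD (in
the kernel you'd use atomic_fetchadd_int(9) or similar), and it
replaces the timestamped struct with a plain int just to keep the
example self-contained.]

	#include <stdatomic.h>
	#include <stdio.h>

	#define RING_SIZE (1 << 4)	/* 16 slots, must be a power of two */

	static _Atomic unsigned next_entry;
	static int ring[RING_SIZE];	/* stands in for struct entry */

	static void
	insert_event(int event)
	{
		/* one atomic op reserves a slot; safe against concurrent callers */
		unsigned ix = atomic_fetch_add(&next_entry, 1) & (RING_SIZE - 1);

		ring[ix] = event;
	}

	int
	main(void)
	{
		int i;

		for (i = 0; i < 20; i++)	/* 20 inserts wrap a 16-slot ring */
			insert_event(i);
		printf("%u %d\n", atomic_load(&next_entry) & (RING_SIZE - 1),
		    ring[3]);
		return (0);
	}

Note the unmasked counter is what gets incremented, so next_entry
also tells the userland reader how many events have ever been logged;
only the array index is masked.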
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20031116014814.GB74756>