Date: Sun, 24 Oct 2010 19:50:26 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: Lawrence Stewart <lstewart@freebsd.org>
Cc: freebsd-net@freebsd.org, Andre Oppermann <andre@freebsd.org>, Sriram Gorti <gsriram@gmail.com>
Subject: Re: Question on TCP reassembly counter
Message-ID: <alpine.BSF.2.00.1010241948240.90390@fledge.watson.org>
In-Reply-To: <4CC2254C.7070104@freebsd.org>
References: <AANLkTikWWmrnBy_DGgSsDbh6NAzWGKCWiFPnCRkwoDRi@mail.gmail.com> <4CA5D1F0.3000307@freebsd.org> <4CA9B6AC.20403@freebsd.org> <4CBB6CE9.1030009@freebsd.org> <AANLkTinvt4kCQNkf1ueDw0CFaYE9SELsBK8nR2yQKytZ@mail.gmail.com> <4CC2254C.7070104@freebsd.org>
On Sat, 23 Oct 2010, Lawrence Stewart wrote:

>> One observation though: net.inet.tcp.reass.cursegments was non-zero (it
>> was just 1) after 30 rounds, where each round is (as earlier) 15
>> concurrent instances of netperf for 20s. This was on the netserver side.
>> And, it was zero before the netperf runs. On the other hand, Andre told
>> me (in a separate mail) that this counter is not relevant anymore -- so,
>> should I just ignore it?
>
> It's relevant, just not guaranteed to be 100% accurate at any given point
> in time. The value is calculated from synchronised access to the UMA zone
> stats and unsynchronised access to the UMA per-CPU zone stats. The latter
> is safe, but can make the overall result inaccurate because stale data may
> be used. The accuracy vs. overhead tradeoff was deemed worthwhile for
> informational counters like this one.
>
> That being said, I would not expect the value to remain persistently at 1
> after all TCP activity has finished on the machine. It won't affect
> performance, but I'm curious to know whether the calculation method has a
> flaw. I'll try to reproduce locally, but can you please confirm whether
> the value stays at 1 even after many minutes of no TCP activity?

It's possible we should revisit the current synchronisation model for
per-CPU caches in this regard. We switched to soft critical sections when
the P4 Xeon was a popular CPU line -- it had extortionately expensive atomic
operations, even when the cache line was already in the local cache. If we
were to move back to mutexes for per-CPU caches, then we could acquire all
the locks in sequence and get an atomic snapshot across them all (if
desired). This isn't a hard technical change, but it would require very
careful performance evaluation.

Robert
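[As an illustration of the tradeoff described above, here is a minimal
userspace C sketch -- not the actual UMA code; pcpu_bucket, NCPU_FAKE and
the dummy counts are invented for the example -- contrasting an unlocked
sum over per-CPU buckets, which may observe stale values, with a sum that
takes every per-CPU mutex in sequence to obtain a consistent snapshot.]

    /*
     * Illustrative sketch only (not FreeBSD's UMA implementation).
     * count_unlocked() mirrors the cheap, possibly-stale aggregation used
     * for informational counters; count_locked() mirrors the mutex-based
     * "acquire all the locks in sequence" snapshot idea.
     */
    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>

    #define NCPU_FAKE 4              /* invented stand-in for mp_ncpus */

    struct pcpu_bucket {
        pthread_mutex_t lock;        /* stand-in for a per-CPU cache lock */
        uint64_t        nsegments;   /* stand-in for a per-CPU item count */
    };

    static struct pcpu_bucket buckets[NCPU_FAKE];

    /* Unsynchronised aggregation: no locks taken, so readers may see
     * stale per-CPU values, but the read path costs almost nothing. */
    static uint64_t
    count_unlocked(void)
    {
        uint64_t total = 0;

        for (int i = 0; i < NCPU_FAKE; i++)
            total += buckets[i].nsegments;
        return (total);
    }

    /* Locked aggregation: holding every per-CPU lock while summing gives
     * a consistent snapshot, at the cost of one lock acquisition per CPU
     * on every read (and blocking writers meanwhile). */
    static uint64_t
    count_locked(void)
    {
        uint64_t total = 0;

        for (int i = 0; i < NCPU_FAKE; i++)
            pthread_mutex_lock(&buckets[i].lock);
        for (int i = 0; i < NCPU_FAKE; i++)
            total += buckets[i].nsegments;
        for (int i = 0; i < NCPU_FAKE; i++)
            pthread_mutex_unlock(&buckets[i].lock);
        return (total);
    }

    int
    main(void)
    {
        for (int i = 0; i < NCPU_FAKE; i++) {
            pthread_mutex_init(&buckets[i].lock, NULL);
            buckets[i].nsegments = i;    /* dummy data for the demo */
        }
        printf("unlocked: %ju  locked: %ju\n",
            (uintmax_t)count_unlocked(), (uintmax_t)count_locked());
        return (0);
    }

[The unlocked variant is why a sysctl such as net.inet.tcp.reass.cursegments
can briefly report a value that lags reality; the locked variant is the kind
of change whose per-read cost would need the careful performance evaluation
Robert mentions.]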