Date: Tue, 25 Jun 2013 13:24:32 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Bruce Evans
Cc: Konstantin Belousov, svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org,
    Gleb Smirnoff, src-committers@FreeBSD.org
Subject: Re: svn commit: r252032 - head/sys/amd64/include
Message-ID: <20130625130051.W1285@besplex.bde.org>
In-Reply-To: <20130625102023.K899@besplex.bde.org>

On Tue, 25 Jun 2013, I wrote:

> My current best design:
> - use ordinary mutexes to protect counter fetches in non-per-CPU contexts.
> - use native-sized or always 32-bit counters.  Counter updates are done
>   by a single addl on i386.  Fix pcpu.h on arches other than amd64 and
>   i386 and use the same method as there.
> - counter fetches add the native-sized pcpu counters to 64-bit non-pcpu
>   counters, either when the native-sized counters are in danger of
>   overflowing or always, under the mutex.  Transferring data uses an
>   ordinary atomic_cmpset.  To avoid ifdefs, always use u_int pcpu
>   counters.  The 64-bit non-pcpu counters can easily be changed to pcpu
>   counters if the API is extended to support pcpu data.
> - run a daemon every few minutes to fetch all the counters, so that
>   the native-sized counters are in no danger of overflowing on systems
>   that don't run statistics programs often enough to fetch the counters
>   themselves.

There is at least 1 counter decrement (add -1) in tcp, so the native
counters need to be signed.

> ...
> With my design:
>
> 	extern struct mtx counter_locks[];
> 	extern uint64_t counters[];

This is pseudo-code.  The extra structure must be dynamically allocated
with each counter.  I'm not sure how to do that.  uint64_t_pcpu_zone is
specialized for pcpu counters, and a non-pcpu part is needed.
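Perhaps something like the following for the non-pcpu part (a sketch
only; the struct, the allocator and the M_COUNTER malloc type are
invented names, not existing API):

	/* Hypothetical non-pcpu companion structure, one per counter. */
	struct counter_extra {
		struct mtx	lock;	/* protects the 64-bit accumulator */
		uint64_t	count;	/* accumulates transferred values */
	};

	static struct counter_extra *
	counter_extra_alloc(void)
	{
		struct counter_extra *ce;

		/* The pcpu part would still come from a pcpu zone. */
		ce = malloc(sizeof(*ce), M_COUNTER, M_WAITOK | M_ZERO);
		mtx_init(&ce->lock, "counter fetch", NULL, MTX_DEF);
		return (ce);
	}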
> 	uint64_t r;
> 	volatile u_int *p;
> 	u_int v;

Change these to int.

> 	int cnum;
>
> 	cnum = ctonum(c);
> 	mtx_lock(&counter_locks[cnum]);	/* might not need 1 per counter */
> 	r = counters[cnum];
> 	for (i = 0; i < mp_ncpus; i++) {
> 		p = (u_int *)((char *)c + sizeof(struct pcpu) * i);

Change the cast to int *.

> 		v = *p;			/* don't care if it is stale */
> 		if (v >= 0x80000000) {

Change the single critical level to 2 critical levels: 0x40000000 for
positive values and -0x40000000 for negative values.

> 			/* Transfer only when above critical level. */
> 			while (atomic_cmpset_rel_int(p, v, 0) == 0)
> 				v = *p;	/* still don't care if it is stale */
> 			counters[cnum] += v;

Even though full counter values are not expected to become negative, the
native counters can easily become negative when a decrement occurs just
after a transfer resets them to 0.

> 		}
> 		r += v;
> 	}
> 	mtx_unlock(&counter_locks[cnum]);
> 	return (r);
>
> Mutexes give some slowness in the fetching code, but fetches are
> expected to be relatively rare.
>
> I added the complication (to usually avoid atomic ops) at the last
> minute.  The complication might not be worth it.

The complication is to test v so as to normally avoid doing the transfer
and its atomic ops (and to not use atomic ops for loading v).  The
complication is larger with 2 thresholds.

If we always transferred, then *p would usually be small and often 0, so
that decrementing it would often make it -1.  This -1 must be transferred
by adding -1, not by adding 0xffffffff.  Changing the type of the native
counter to int gives this.

Bruce
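PS: For concreteness, the fetch with the above changes folded in (int
native counters, i declared, 2 critical levels) might look like this.
It is still a sketch against the pseudo-code above: ctonum(),
counter_locks[] and counters[] are the invented names from that code,
not existing API.

	uint64_t r;
	volatile int *p;
	int cnum, i, v;

	cnum = ctonum(c);
	mtx_lock(&counter_locks[cnum]);	/* might not need 1 per counter */
	r = counters[cnum];
	for (i = 0; i < mp_ncpus; i++) {
		p = (int *)((char *)c + sizeof(struct pcpu) * i);
		v = *p;			/* don't care if it is stale */
		if (v >= 0x40000000 || v <= -0x40000000) {
			/* Transfer only when near a critical level. */
			/* atomic_cmpset_rel_int() takes u_int; same bits. */
			while (atomic_cmpset_rel_int((volatile u_int *)p,
			    v, 0) == 0)
				v = *p;
			counters[cnum] += v;
		}
		r += v;
	}
	mtx_unlock(&counter_locks[cnum]);
	return (r);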