From: Pawel Jakub Dawidek
To: Luigi Rizzo
Cc: arch@FreeBSD.org, Gleb Smirnoff
Date: Wed, 3 Apr 2013 12:04:01 +0200
Subject: Re: [CFR][CFT] counter(9): new API for faster and raceless counters
Message-ID: <20130403100401.GA1349@garage.freebsd.pl>
In-Reply-To: <20130403002846.GB15334@onelab2.iet.unipi.it>

On Wed, Apr 03, 2013 at 02:28:46AM +0200, Luigi Rizzo wrote:
> On Wed, Apr 03, 2013 at 01:26:07AM +0200, Pawel Jakub Dawidek wrote:
> > On Mon, Apr 01, 2013 at 03:51:28PM +0400, Gleb Smirnoff wrote:
> > > Hi!
> > >
> > > Together with Konstantin Belousov (kib@) we developed a new API that is
> > > initially purposed for (but not limited to) collecting statistical
> > > data in kernel.
> >
> > Is there any plan to implement universal way of exporting those
> > statistics out of the kernel?
> >
> > Solaris has a framework for in-kernel statistics, which are exported via
> > kstat tool. For ZFS I export them via sysctl. If you have ZFS loaded you
> > can try 'sysctl kstat'.
> >
> > It would be nice for counter_u64_alloc() to take additional argument
> > 'name' and to create sysctl for the counter automatically. We could then
> > slowly start migrating userland tools to use sysctls (or some wrapper
> > userland API), but we immediately make those statistics available for
> > use in scripts.
>
> that is an interesting idea but i believe it can be effectively
> built as a wrapper on top of the counter_u64_alloc() routine:
>
>     name_counter(counter_t c, const char *fmt, ...);
>     free_named_counter(counter_t c);
>
> After all the name->counter mapping is unidirectional,
> and possibly not even necessary on every single counter
> (think of ipfw dynamic rules, created on packet arrivals, so
> the counter alloc/dealloc needs to be fast).

Right, although I'd optimize API naming and usage for the common case.
Even though we do sometimes want to be able to alloc/free counters quickly,
most of the time we don't care about alloc/free speed and we would like
to have a name. Having a name argument that could be NULL for short-lived
counters would allow calling only one allocation function in the common
case (actually in every case).

> It might be useful for the name_counter() routine to support
> a printf-style argument to make it easy to build names.

Indeed.
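
To make the "name may be NULL" idea concrete, here is a rough userland
sketch. It is not the counter(9) implementation and none of these names
exist in the tree: struct named_counter, named_counter_alloc() and the
linked-list registry are hypothetical stand-ins (the registry plays the
role that sysctl export would play in the kernel), and the real per-CPU
machinery is ignored entirely.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Userland stand-in for a kernel counter; NOT the real counter(9) type. */
struct named_counter {
	int64_t value;              /* plain integer, no per-CPU storage */
	char *name;                 /* NULL for anonymous counters */
	struct named_counter *next; /* registry link, used only when named */
};

/* Hypothetical registry standing in for sysctl-style export. */
static struct named_counter *registry_head = NULL;

/*
 * Hypothetical single allocation entry point.  fmt == NULL yields an
 * anonymous counter and skips name formatting and registration;
 * otherwise the printf-style name is built and the counter published.
 */
static struct named_counter *
named_counter_alloc(const char *fmt, ...)
{
	struct named_counter *c = calloc(1, sizeof(*c));
	va_list ap;
	int len;

	if (c == NULL || fmt == NULL)
		return (c);

	/* First pass: measure the formatted name. */
	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0 || (c->name = malloc((size_t)len + 1)) == NULL) {
		free(c);
		return (NULL);
	}

	/* Second pass: write the name, then publish the counter. */
	va_start(ap, fmt);
	vsnprintf(c->name, (size_t)len + 1, fmt, ap);
	va_end(ap);
	c->next = registry_head;
	registry_head = c;
	return (c);
}

static void
named_counter_add(struct named_counter *c, int64_t inc)
{
	c->value += inc;
}

static int64_t
named_counter_fetch(const struct named_counter *c)
{
	return (c->value);
}

/* Look a counter up by name, as a script-facing exporter might. */
static struct named_counter *
named_counter_lookup(const char *name)
{
	struct named_counter *c;

	for (c = registry_head; c != NULL; c = c->next)
		if (strcmp(c->name, name) == 0)
			return (c);
	return (NULL);
}

/* Unpublish (if named) and release the counter. */
static void
named_counter_free(struct named_counter *c)
{
	struct named_counter **pp = &registry_head;

	if (c->name != NULL) {
		while (*pp != NULL && *pp != c)
			pp = &(*pp)->next;
		if (*pp == c)
			*pp = c->next;
		free(c->name);
	}
	free(c);
}
```

With this shape a caller would write something like
named_counter_alloc("net.if%d.ipackets", unit) for a long-lived statistic
and named_counter_alloc(NULL) for a short-lived one such as an ipfw
dynamic rule; the name string here is made up for illustration. Only the
named path pays for formatting and registration.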

> > > o Tiny API for counter(9):
> > >
> > >     counter_u64_t
> > >     counter_u64_alloc(int wait);
> > >
> > >     void
> > >     counter_u64_free(counter_u64_t cnt);
> > >
> > >     void
> > >     counter_u64_add(counter_u64_t cnt, uint64_t inc);
> > >
> > >     uint64_t
> > >     counter_u64_fetch(counter_u64_t cnt);
> >
> > Do you really expect other types in the future? If so, could we at least
> > create generic counter_t that internally keeps the type?
>
> I read the u64 in the name mostly as a reminder to users
> of the counter size.

Should the users care? As a user of this KPI I'd prefer to have a simpler
name and just assume the counter is big enough.

> It might actually make sense is to change the type to s64.
> This way we could have counters that go negative,
> and also use them to accumulate sbintime_t values.

Agreed, int64_t seems better.

> But otherwise i am not sure that we want other types.
>
> u32/s32 might save atomic/critical_enter ops on some archs,
> but they saturate so quickly that probably are a bad idea.
> And 63/64 bits are quite large already.

Right, I don't think 32-bit counters are needed at all and I can't find
any use for 128-bit counters either.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl