From owner-svn-src-all@FreeBSD.ORG Sun Aug 3 17:52:27 2014 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3255E875; Sun, 3 Aug 2014 17:52:27 +0000 (UTC) Received: from mail109.syd.optusnet.com.au (mail109.syd.optusnet.com.au [211.29.132.80]) by mx1.freebsd.org (Postfix) with ESMTP id B917E2A43; Sun, 3 Aug 2014 17:52:26 +0000 (UTC) Received: from c122-106-147-133.carlnfd1.nsw.optusnet.com.au (c122-106-147-133.carlnfd1.nsw.optusnet.com.au [122.106.147.133]) by mail109.syd.optusnet.com.au (Postfix) with ESMTPS id D727BD61C2D; Mon, 4 Aug 2014 03:52:18 +1000 (EST) Date: Mon, 4 Aug 2014 03:52:18 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Xin LI Subject: Re: svn commit: r269466 - in head/sys/cddl: compat/opensolaris/sys contrib/opensolaris/uts/common/fs/zfs In-Reply-To: <53de0547.5601.7c60e05f@svn.freebsd.org> Message-ID: <20140804035125.A858@besplex.bde.org> References: <53de0547.5601.7c60e05f@svn.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=dZS5gxne c=1 sm=1 tr=0 a=7NqvjVvQucbO2RlWB8PEog==:117 a=PO7r1zJSAAAA:8 a=f4jRgUPOQhAA:10 a=zWNf4g-b35QA:10 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=6I5d2MoRAAAA:8 a=PP_50AfKc3DqIB_g6tMA:9 a=CjuIK1q_8ugA:10 Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Aug 2014 17:52:27 -0000 On Sun, 3 Aug 2014, Xin LI wrote: > Log: > Revert r269404 and use cpu_ticks() for dbuf allocation. > > Encode CPU's number by XOR'ing the CPU ID against the 64-bit cpu_ticks(). > > Reviewed by: mav, gibbs > Differential Revision: https://phabric.freebsd.org/D521 > MFC after: 2 weeks This is still broken. > Modified: head/sys/cddl/compat/opensolaris/sys/time.h > ============================================================================== > --- head/sys/cddl/compat/opensolaris/sys/time.h Sun Aug 3 09:40:50 2014 (r269465) > +++ head/sys/cddl/compat/opensolaris/sys/time.h Sun Aug 3 09:47:51 2014 (r269466) > @@ -60,17 +60,6 @@ gethrtime(void) { > struct timespec ts; > hrtime_t nsec; > > - nanouptime(&ts); > - nsec = (hrtime_t)ts.tv_sec * NANOSEC + ts.tv_nsec; > - return (nsec); > -} The in-between version wasn't broken. nanouptime() is not too slow to use provided it is properly implemented. Unfortunately, this includes proper implementation in the hardware. > - > -static __inline hrtime_t > -gethrtime_waitfree(void) { > - > - struct timespec ts; > - hrtime_t nsec; > - > getnanouptime(&ts); > nsec = (hrtime_t)ts.tv_sec * NANOSEC + ts.tv_nsec; > return (nsec); The existence of getnanouptime() is a bug. Someone might use it. > @@ -78,6 +67,7 @@ gethrtime_waitfree(void) { > > #define gethrestime_sec() (time_second) > #define gethrestime(ts) getnanotime(ts) > +#define gethrtime_waitfree() gethrtime() > > extern int nsec_per_tick; /* nanoseconds per clock tick */ > > > Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c > ============================================================================== > --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Sun Aug 3 09:40:50 2014 (r269465) > +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Sun Aug 3 09:47:51 2014 (r269466) > @@ -70,7 +70,11 @@ dbuf_cons(void *vdb, void *unused, int k > cv_init(&db->db_changed, NULL, CV_DEFAULT, NULL); > refcount_create(&db->db_holds); > > +#if defined(illumos) || !defined(_KERNEL) > db->db_creation = gethrtime(); > +#else > + db->db_creation = cpu_ticks() ^ ((uint64_t)CPU_SEQID << 48); > +#endif > > return (0); > } cpu_ticks() is a hack to work around nanouptime() being too slow with some hardware. It is much harder to use than nanouptime(). There are no correct uses of it, including its main use. Frequency changes are hard to handle correctly. Here the bug is that it wraps after just 65536+ seconds (18+ hours) at 4GHz, even on x86 where ticks have 64 bits. This gives non-uniqeness in another way. The wrap is increased by wastefully supporting 65536 CPUs. This leaves only 48 bits before wrap occurs (the high bits are XORed. This helps for uniqueness but doesn't guarantee it. E.g., the value of 2**48 on CPU0 is indistingishable from the value of 0 on CPU1. These values occur just 65536+ seconds apart at 4GHz). There is likely to be noise in the lower bits, but this doesn't guarantee uniqueness. On other arches, ticks may have many fewer than 64 bits. Their main use assumes that they have enough bits to represent uptimes of several years, so the above probably has no additional problems. Arches without a TSC mostly use nanouptime() for ticks anyway. These tend to be the slowest arches. gibbs' original suggestion of: db->db_creation = pcup_counter++ ^ ((uint64_t)CPU_SEQID << 48); probably works. Now the counter doesn't race off to infinity at 4GHz so it is obviously unique for 2**48 creations per CPU. It increases in a different unscaled, often nonlinear timescale with much longer ticks. This is also much faster, especially on slower arches without a TSC that emulate cpu_ticks() using nanoptime(). Bruce