Date: Tue, 31 May 2011 09:48:26 -0700 From: Artem Belevich <art@freebsd.org> To: Andriy Gapon <avg@freebsd.org> Cc: freebsd-fs@freebsd.org Subject: Re: ZFS: arc_reclaim_thread running 100%, 8.1-RELEASE, LBOLT related Message-ID: <BANLkTinQkS36SQKpZhsavx3C_ad838DG=g@mail.gmail.com> In-Reply-To: <4DE50811.5060606@FreeBSD.org> References: <0EFD28CD-F2E9-4AE2-B927-1D327EC99DB9@bitgravity.com> <BANLkTikVq0-En7=4Dy_dTf=tM55Cqou_mw@mail.gmail.com> <4DE50811.5060606@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, May 31, 2011 at 8:24 AM, Andriy Gapon <avg@freebsd.org> wrote: >>> =A0 =A0 =A0 =A0 65 =A0 =A0 =A0 =A0 nsec =3D (hrtime_t)ts.tv_sec * NANOS= EC + ts.tv_nsec; >> >> Yup. This would indeed overflow in ~106.75 days. > > Have you referred to the LBOLT above? > gethrtime() should have several hundred years before overflowing. hrtime_t is 64-bit. NANOSEC=3D1000000000. When it's time to use LBOLT, we further multiply number of seconds by HZ: >> 41 #define LBOLT ((gethrtime() * hz) / NANOSEC) In the end we want 64-bit scalar to hold number of seconds times 10e12. 0x7fffffffffffffff/1000000000000 =3D 9223372 # number of seconds before signed overflow 9223372/(24*60*60) --> 106 # .. or about 106 days >> The side effect is that it limits bolt resolution to hz units. With >> HZ=3D100, that will be 10ms. Whether it's good enough or too coarse I > > Nope, I think you did your your math wrong here. > As shown above it limits the resolution to hz ticks, i.e. hz * 1/hz secon= ds :) Point taken. > >> have no idea. Perhaps we can compromise and update lbolt in >> microseconds. That should give us few hundred years until the >> overflow. > > Well, we can either use the ticks variable, since we are not switching to= tickless > mode yet. =A0But we would need to make it 64-bit to avoid early overflows= . > Or, perhaps, to be somewhat future-friendly we could do approximately wha= t > OpenSolaris [upstream :-)] code does: > > gethrtime() / (NANOSEC / hz) > > Or, given that NANOSEC is constant and hz is invariant, we could apply ex= tended > invariant division by multiplication approach to get precise and fast res= ult > without overflowing. =A0But likely that's an overkill here. =A0Though we = definitely > should pre-calculate, store and re-use (NANOSEC / hz) value just like Ope= nSolaris > does it. This should work. > >>> clock_t will still need the typedef'ed to 64bit to still address the l2= arc usage of LBOLT. > > Hm, I see that OpenSolaris defines clock_t to long, while we use int32_t. > So, I think that it means two things: > - i386 OpenSolaris (if it exists) should be affected by the problem as we= ll > - we should not use our native clock_t definition for ported OpenSolaris = code > > Maybe we should fix our clock_t to be something wider at least for 64-bit > platforms. =A0But I am not prepared to discuss this. > > To summarize: > 1. We must fix ddi_get_lbolt*() to avoid unnecessary overflow in multipli= cation Agreed. > 2. We could fix 31-bit clock_t overflow by using Solaris-compatible defin= ition of > it (long), but that still won't help on i386. =A0Maybe we should bring up= this issue > to the attention of upstream ZFS developers. =A0Universally using ddi_get= _lbolt64() > seems like a safer bet. FYI, we've already changed clock_t for opensolaris code to int64_t in r218169 regardless of $MACHINE. --Artem
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?BANLkTinQkS36SQKpZhsavx3C_ad838DG=g>