Date: Wed, 19 Sep 2007 14:55:35 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Julian Elischer <julian@elischer.org>
Cc: Andre Oppermann <andre@freebsd.org>, arch@freebsd.org
Subject: Re: 64bit ticks, was Re: Changing p_swtime and td_slptime to ticks
Message-ID: <20070919133525.E78370@besplex.bde.org>
In-Reply-To: <46F0656A.7030802@elischer.org>
References: <20070917165657.B558@10.0.0.1> <46EF644E.9050207@elischer.org> <20070918012555.G558@10.0.0.1> <46EFE4BD.4030505@freebsd.org> <20070918142115.C558@10.0.0.1> <20070918153536.D558@10.0.0.1> <46F05F88.5060809@errno.com> <46F0656A.7030802@elischer.org>
On Tue, 18 Sep 2007, Julian Elischer wrote:

> Sam Leffler wrote:
>> Jeff Roberson wrote:
>>> On Tue, 18 Sep 2007, Jeff Roberson wrote:
>>>
>>>> On Tue, 18 Sep 2007, Andre Oppermann wrote:
>>>>> ...
>>>>> ticks is 2^31 on x86 and at HZ=1000 it wraps within a reasonably
>>>>> short uptime.  You have to make sure that your code handles that
>>>>> correctly or you run into lots of strange effects which are almost
>>>>> impossible to reproduce.  In TCP we've been bitten by that.
>>>>
>>>> Thanks Andre, this is a good point.  For td_slptime I don't think
>>>> it's of practical concern.  However, for swtime I think I will
>>>> convert it to seconds from boot.

IIRC, the patch only uses deltas of `ticks', so its wraparound doesn't
matter, provided you actually use it often enough.  Often enough is at
least once every 497 days with HZ = 100, or just once every 49.7 days
with HZ = 1000.

>>> Is there a good reason for not making ticks 64bit?

I think this would be only a small pessimization, unless accesses to
`ticks' are fully locked or made atomic.  The clock interrupt handler
can interrupt almost anything, but almost nothing locks out clock
interrupts.  The lock for the write access to `ticks' in hardclock()
is (rather bogusly) callout_lock, and softclock() uses this lock for
read accesses without really trying, but generic code can't be
expected to know about this and in fact doesn't know about it --
`ticks' seems to be used mainly in netinet, where there is no
callout_lock'ing in sight.  Without locking: with a 32-bit `ticks',
read accesses not using atomic_read() are probably atomic, so the race
would just give an out-of-date value, and this shouldn't matter -- the
reader cannot tell the difference between a value that is out of date
when it is read and one that becomes out of date 1 instruction after
it is read; with a 64-bit `ticks' on 32-bit machines, read accesses
would not be atomic and the value would sometimes be garbage.
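[The delta idiom above can be sketched in C.  The helper name below is
hypothetical, not FreeBSD code; the point is that unsigned subtraction
is defined modulo 2^32, so a delta of an `old' snapshot from the
current counter stays correct across the wrap, as long as the real
elapsed time is under 2^32 ticks (~497 days at HZ = 100).]

```c
#include <assert.h>

typedef unsigned int u_int;	/* as in <sys/types.h> */

/*
 * Hypothetical helper: elapsed ticks between a snapshot `then' and
 * the current counter `now'.  u_int subtraction wraps modulo 2^32,
 * so the result is correct even when `now' has wrapped past zero
 * while `then' has not, provided the true elapsed time is < 2^32
 * ticks.  This is why code that only uses deltas of `ticks' does
 * not care about the wraparound.
 */
static u_int
ticks_elapsed(u_int now, u_int then)
{
	return (now - then);
}
```

[A signed `int' version of the same subtraction relies on overflow
being benign, which the C standard does not guarantee; the unsigned
form is the well-defined spelling of the same idea.]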
>>> math involving this value is relatively infrequent.  Bruce?  Any
>>> comments?  It'd sure let us forget all of these counter wrapping
>>> problems.
>>
>> ticks is used a lot.  I'd rather set hz back to 100 by default.  This
>> approach is a perfect example of ignoring low-end platforms.

I agree.  However, one variable isn't going to make much difference.
fs code is full of unnecessary 64-bit calculations, but no one notices
the real-time pessimization from this, at least on non-low-end
platforms (I only notice the code bloat).

> but it still needs to work on systems with high hz values.

Just changing its type won't fix all such problems, but will cause new
ones.  E.g., in tcp_timer():

% 	int tick = ticks;

This will truncate `ticks'.  Truncation is the right thing to do in
most places that only need a small delta, but...

% 	...
% 	CTR6(KTR_NET, "%p %s inp %p active %x tick %i nextc %i",
% 	    tp, __func__, inp, tt->tt_active, tick, tt->tt_nextc);

Easy-to-detect breakage of the printf format.  Old printf format
errors: %p for types other than void *.

% 	...
% 	if (tick < tt->tt_nextc)
% 		goto rescan;

Comparison of truncated value, OK (since both tick and tt_nextc have
type int and int overflow is benign on all supported machines).  The
types should be u_int to avoid undefined behaviour on int overflow.

% 	...
% 	if (tp->t_state != TCPS_TIME_WAIT &&
% 	    (ticks - tp->t_rcvtime) <= tcp_maxidle)
% 		tcp_timer_activate(tp, TT_2MSL, tcp_keepintvl);

This is already broken on all supported machines, since t_rcvtime has
type u_long, which is incompatible with the type of both `ticks' and
tcp_maxidle (both int).  All the types should be truncated to a common
unsigned type to get defined behaviour that is not too hard to
understand.  I think the current behaviour is benign overflow --
t_rcvtime is set to an earlier value of `ticks', and the difference
never grows very large.
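[The fix suggested above, truncating to a common unsigned type before
the subtraction, might look like the following sketch.  The function
and parameter names are hypothetical stand-ins, not the actual netinet
code; only the type handling is the point.]

```c
#include <assert.h>

typedef unsigned int u_int;
typedef unsigned long u_long;

/*
 * Hypothetical rewrite of the tcp_timer() idle test with all three
 * operands truncated to u_int before the subtraction.  Signed int
 * overflow is undefined behaviour in C; u_int subtraction is defined
 * modulo 2^32, so the delta is correct even after `ticks' has
 * wrapped, as long as the connection has not been idle for ~2^32
 * ticks.  Returns nonzero if the connection is still within the
 * idle limit.
 */
static int
idle_within(int ticks_now, u_long t_rcvtime, int tcp_maxidle)
{
	u_int delta = (u_int)ticks_now - (u_int)t_rcvtime;

	return (delta <= (u_int)tcp_maxidle);
}
```

[The explicit casts make the common type obvious to the reader, rather
than leaving it to the mixed int/u_long conversion rules.]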
However, changing `ticks' to 64 bits breaks this on non-64-bit
machines -- `ticks' may grow large, but the copy of it in t_rcvtime is
limited to u_long = 32 bits on non-64-bit machines, so after about
2^32 ticks of uptime, (ticks - tp->t_rcvtime) will always exceed
tcp_maxidle.

Bruce
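[The failure mode in the last paragraph can be illustrated with a
small model.  This is an editor's sketch with hypothetical names; it
assumes a 32-bit u_long, as on non-64-bit machines, modelled here with
uint32_t so the truncation is explicit and portable.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the breakage: a 64-bit tick counter whose snapshot was
 * stored in a 32-bit field (u_long on a 32-bit machine).  The
 * snapshot is zero-extended back to 64 bits for the subtraction,
 * but its top 32 bits were already lost at store time, so once
 * uptime exceeds 2^32 ticks the delta is off by a multiple of 2^32
 * and will always exceed any sane tcp_maxidle.
 */
static int64_t
bad_delta(int64_t ticks64, uint32_t t_rcvtime)
{
	return (ticks64 - (int64_t)t_rcvtime);
}
```

[Before 2^32 ticks of uptime the delta is still correct, which is why
the bug would only appear after about 49.7 days of uptime at
HZ = 1000, exactly the kind of strange, hard-to-reproduce effect Andre
describes.]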