Date:      Wed, 17 Oct 2012 21:58:35 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Andrey Zonov <zont@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, Maxim Sobolev <sobomax@FreeBSD.org>, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: svn commit: r241625 - head/usr.sbin/cron/cron
Message-ID:  <20121017203303.S4043@besplex.bde.org>
In-Reply-To: <507E651D.2060809@FreeBSD.org>
References:  <201210170044.q9H0iZHo055977@svn.freebsd.org> <507E651D.2060809@FreeBSD.org>

On Wed, 17 Oct 2012, Andrey Zonov wrote:

> On 10/17/12 4:44 AM, Maxim Sobolev wrote:
>> Log:
>>   o Use nanosleep(2) to sleep exact amount of time till the next second,
>>   not multiple of 1 second, which results in actual time to drift back
>>   and forth every run within 1 second of the actual action has
>>   been set for.
>> ...
>> +static int
>
> This should be type of time_t.

Actually void, since the value that it returns is especially useless for
the only use of this function, and is in fact not used.  The only use of
this function needs more than seconds resolution.

>
>> +timeval_subtract(struct timespec *result, struct timeval *x, struct timeval *y)
>> +{
>> +	int nsec;
>
> And this should be suseconds_t.
>
>> +
>> +	/* Perform the carry for the later subtraction by updating y. */
>> +	if (x->tv_usec < y->tv_usec) {
>> +		nsec = (y->tv_usec - x->tv_usec) / 1000000 + 1;
>> +		y->tv_usec -= 1000000 * nsec;
>> +		y->tv_sec += nsec;
>> +	}
>> +	if (x->tv_usec - y->tv_usec > 1000000) {
>> +		nsec = (x->tv_usec - y->tv_usec) / 1000000;
>> +		y->tv_usec += 1000000 * nsec;
>> +		y->tv_sec -= nsec;
>> +	}
>> +
>> +	/* tv_nsec is certainly positive. */
>> +	result->tv_sec = x->tv_sec - y->tv_sec;
>> +	result->tv_nsec = (x->tv_usec - y->tv_usec) * 1000;
>> +
>> +	/* Return difference in seconds */
>> +	return (x->tv_sec - y->tv_sec);
>> +}
>
> May be it's better to use timersub() and TIMEVAL_TO_TIMESPEC()?

I don't like them, but they somehow do things in half the space of the
above (only 7 lines for timersub()).

They are kernel internals that were originally intentionally left out
in userland.  Now they are NetBSD/OpenBSD compatibility cruft.  They
only support timevals, and timevals should never be used after POSIX
standardized timespecs and left out timevals in 1990.  POSIX was broken
in 2001 to support all the old timeval cruft.  TIMEVAL_TO_TIMESPEC()
should have been left out in the kernel too.  cron is fairly portable
and belong[ed] in contrib and shouldn't use unportable bad extensions.

Applications can use floating point (FP), and taking the difference
of timespecs is especially simple in FP:

/* Time difference of timespecs in nanoseconds. */
double
difftimespec(struct timespec *x, struct timespec *y)
{
	return ((x->tv_sec - y->tv_sec) * 1e9 + x->tv_nsec - y->tv_nsec);
}

Converting this back to a timespec may take a bit more work than working
throughout with timespecs, but no more than the TIMEVAL_TO_TIMESPEC()
mistake, especially when it does the same amount of error checking (none).
It could easily be faster than TIMESPEC_TO_TIMEVAL(), since it only needs
FP multiplication while TIMESPEC_TO_TIMEVAL() needs integer division.

struct timespec
nanosecondstotimespec(double nsec)
{
	struct timespec ts;

	ts.tv_sec = nsec * 1e-9;	/* Only support nsec >= 0. */
	ts.tv_nsec = nsec - ts.tv_sec * 1e9;
	return (ts);
}

>>  static void
>>  cron_sleep(db)
>>  	cron_db	*db;
>>  {
>> -	int	seconds_to_wait = 0;
>> +	int seconds_to_wait;
>> +	int rval;
>> +	struct timeval ctime, ttime;
>> +	struct timespec stime, remtime;
>>
>>  	/*
>>  	 * Loop until we reach the top of the next minute, sleep when possible.
>>  	 */
>>
>>  	for (;;) {
>> -		seconds_to_wait = (int) (TargetTime - time((time_t*)0));
>> +		gettimeofday(&ctime, NULL);

It's just foot-shooting to use an old timeval API and then have to convert
to timespecs to use the "new" nanosleep() API.  It was new in 1990.

Someone mentioned using clock_gettime(CLOCK_REALTIME_BROKEN).  I might
complain about that separately :-).

>> +		ttime.tv_sec = TargetTime;
>> +		ttime.tv_usec = 0;
>> +		timeval_subtract(&stime, &ttime, &ctime);

This is especially easy in FP.  I wouldn't make difftimespec() an
actual function or obfuscate it as a macro, but would just open-code
its 1 line.  The timespec doesn't need to be initialized for this, and
the 1 line is especially simple because 1 of the tv_nsec's is 0:

 	/* stime now double. */
 	stime = (TargetTime - ctime.tv_sec) * 1e9 - ctime.tv_nsec;

>>
>>  		/*
>>  		 * If the seconds_to_wait value is insane, jump the cron
>>  		 */
>>
>> -		if (seconds_to_wait < -600 || seconds_to_wait > 600) {
>> +		if (stime.tv_sec < -600 || stime.tv_sec > 600) {

Again simpler in FP using fabs(), but the above could have used abs():

 		if (fabs(stime) > 600e9)

But maybe all negative stime's are insane.

>>  			cron_clean(db);
>>  			cron_sync();
>>  			continue;
>>  		}
>>
>> +		seconds_to_wait = (stime.tv_nsec > 0) ? stime.tv_sec + 1 : stime.tv_sec;
>> +

Again simpler in FP, using standard rounding functions:

 		seconds_to_wait = ceil(stime * 1e-9);

but probably, stime should be kept in nanoseconds throughout.  It could
also be in seconds, but keeping it in nanoseconds ensures that it is
an integer and reduces rounding problems, and allows it to be copied to
an int64_t or intmax_t.

>>  		Debug(DSCH, ("[%d] TargetTime=%ld, sec-to-wait=%d\n",
>>  			getpid(), (long)TargetTime, seconds_to_wait))
>>

seconds_to_wait is now (in the FP version) used mainly in debugging
printfs.  These could print nanoseconds to avoid converting stime,
and it is now a bug (in the non-FP version) for them to not print the
nanoseconds part, since that part is what you might need to debug this
change.  Nanoseconds is really too much info though.  I would try printing
the value in microseconds in %.6f format.

This is an ancient program which assumes that time_t is representable as
a long.  At least it casts TargetTime to long, so it doesn't assume that
time_t is long here.  Careful use of time_t and suseconds_t would mainly
give more things to cast.  The above assumes that seconds_to_wait is an
int.  More precisely the assumption was in the old code quoted above.
It first uses a subtraction of time_t's instead of the technically
correct difftime(), and then uses a bogus cast to int to break warnings
about overflow.  Then it checks if seconds_to_wait is sane.  An insane
difference would have overflowed before the check.

In the FP pseudo-code, I use the fact that 53 bits is enough for anyone
to avoid overflow checks.  2**53 nanoseconds is actually only 104 days,
so a 53-bit integer would not be enough.  But FP also has exponents
and infinities.  Insane times will result in a difference that is
either an infinity or not in the range of -600 to +600 seconds, so the
insanity will be detected.  Then the 53 bits are plenty for exact
nanoseconds over 600 seconds.

Another bug in the timeval and timespec APIs is that they don't
automatically avoid overflow problems like FP does.  Overflow just
occurs, and there are no infinities or fenv flags to record it.  (Types
in timespecs are correctly not required to be unsigned, so overflow
gives undefined behaviour which may be to trap so as to report the
problem.)  Neither the timeval compatibility cruft APIs nor your code
above have the complications to detect overflow.  There are kernel
bugs in nanosleep() from this -- pass any time interval which when
added to the current time will give an unrepresentable time.
nanosleep(9) will then overflow and do collaterally wrong things.
Older timeval APIs like setitimer() used to avoid overflow by not
permitting intervals of more than 100 million seconds (this works for
current times up to 100 million seconds before the maximum time
representable by a time_t, which is good enough until the current time
reaches 2035).  nanosleep() never had this restriction, and it was
removed in the older timer APIs, although it is still documented for
them.

>> @@ -380,13 +416,19 @@ cron_sleep(db)
>>  		 * to run, break
>>  		 */
>>
>> -		if (seconds_to_wait <= 0)
>> +		if (stime.tv_sec < 0)

Similar in FP.

>>  			break;
>>  		if (job_runqueue() == 0) {
>>  			Debug(DSCH, ("[%d] sleeping for %d seconds\n",
>>  				getpid(), seconds_to_wait))
>>
>> -			sleep(seconds_to_wait);
>> +			for (;;) {
>> +				rval = nanosleep(&stime, &remtime);

The only thing that is not simpler in FP -- we have to convert our
scalar variable back to a struct to use oldish kernelish APIs like
nanosleep().

>> +				if (rval == 0 || errno != EINTR)
>> +					break;
>> +				stime.tv_sec = remtime.tv_sec;
>> +				stime.tv_nsec = remtime.tv_nsec;
>> +			}
>>  		}
>>  	}
>>  }

My interval-handling timeout code in ping doesn't use FP and is about as
messy as the above :-).  It has to use timevals to interface with select
timeouts, and it uses gettimeofday() since clock_gettime() wasn't implemented
when it was written.  Oops, even nanosleep() is too new to have been in POSIX
in 1990.  ping has arbitrary intervals which take a bit more work to handle
than seconds intervals.

Bruce
