Date:      Sat, 20 Oct 2012 20:49:24 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Marcel Moolenaar <marcel@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r241752 - head/share/mk
Message-ID:  <20121020190530.P1781@besplex.bde.org>
In-Reply-To: <201210192013.q9JKD8si069344@svn.freebsd.org>
References:  <201210192013.q9JKD8si069344@svn.freebsd.org>

On Fri, 19 Oct 2012, Marcel Moolenaar wrote:

> Log:
>  Improve upon the previous commit to fix the yacc rule.

Sorry I didn't complete replying to the previous thread on this.

You never said what behaviour in bmake causes more problems than
before.

There are certainly race cases, but the only ones that I'm sure
are not just bugs in make(1) are for exotic cases which were just
completely broken without my rules.  One of these cases is when
someone deletes y.tab.h (where it is used without renaming it).
Then make sees y.tab.c (or parse.c etc.) and y.tab.h both depending
on parse.y.  y.tab.c is up to date, so make doesn't rebuild it,
and may run a command that uses it.  y.tab.h is out of date since
it doesn't exist, so make rebuilds it.  This rebuilding clobbers
y.tab.c, unknown to make, and any concurrent use of y.tab.c sees
the clobbered copy.  The ordering requirement is
'.ORDER: y.tab.c y.tab.h' so it is satisfied by y.tab.c being
up to date before y.tab.h is built.  The reverse order might
work accidentally, but would cause the same problem if someone
deletes y.tab.c instead of y.tab.h.  y.tab.o should depend on
y.tab.h via .depend, and this should prevent the 'cc -c' concurrent
use of y.tab.c in some cases, so even this exotic case will usually
not race.
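
The deleted-y.tab.h case above can be sketched as a toy model.  This
is Python, not make, and it is not how make(1) is implemented: the
helper names out_of_date() and run_yacc() and the integer timestamps
are made up for illustration.  It only shows that an up-to-date
y.tab.c gets rewritten when the missing y.tab.h is rebuilt.

```python
# Toy model of the deleted-y.tab.h case.  All names and times here are
# hypothetical; this is not make(1)'s algorithm.

def out_of_date(mtimes, target, source):
    """Missing targets, and targets older than their source, need rebuilding."""
    return target not in mtimes or mtimes[target] < mtimes[source]


def run_yacc(mtimes, contents, now):
    """yacc rewrites both outputs, whichever one make asked for."""
    for name in ("y.tab.c", "y.tab.h"):
        mtimes[name] = now
        contents[name] = "generated at t=%d" % now


mtimes = {"parse.y": 10, "y.tab.c": 20}       # y.tab.h has been deleted
contents = {"y.tab.c": "generated at t=20"}

c_stale = out_of_date(mtimes, "y.tab.c", "parse.y")   # make leaves y.tab.c alone
h_stale = out_of_date(mtimes, "y.tab.h", "parse.y")   # make reruns yacc

seen_by_cc = contents["y.tab.c"]    # what a concurrent 'cc -c' started reading
if h_stale:
    run_yacc(mtimes, contents, now=30)

# The file changed underneath the concurrent reader.
clobbered = contents["y.tab.c"] != seen_by_cc
```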

>  1.  Have the resulting C file depend on the resulting H
>      file as it should be. Touch the C file to make sure

As it shouldn't be.  The C file doesn't depend on the H file.

>      the C file is newer than the H file to keep make
>      happy.

The .ORDER sort of asks for the reverse.  It says that y.tab.c must
be built before y.tab.h.  If they were built separately then it
would always be older and timestamps might be fine-grained enough
to see the difference.  When they are built together, their
timestamps are determined by the order in which yacc writes them
and the order in which make or something else stat()s them.  The
order could be anything.

Perhaps this is the problem.  Old make seems to still only use mtimes
in seconds.  Any make that uses mtimes with finer granularity would
find the times different more often and then do bad things if they are
different in a certain order.  This depends on the filesystem actually
supporting mtimes in finer granularity than seconds.  Some versions
of this support are broken and the breakage may matter here (see below).
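
To illustrate the granularity point, here is a small sketch in Python.
The two timestamps are invented, and compare_mtimes() is a
hypothetical helper, not anything taken from make.  It only shows that
the same pair of files can compare equal at seconds granularity and
different at nanoseconds granularity.

```python
# Hypothetical helper; the timestamps below are invented for illustration.
NS_PER_SEC = 1_000_000_000


def truncate(t_ns, granularity_ns):
    """Round a nanosecond timestamp down to the given granularity."""
    return t_ns - t_ns % granularity_ns


def compare_mtimes(a_ns, b_ns, granularity_ns):
    """Return -1, 0 or 1 after truncating both times to the granularity."""
    a = truncate(a_ns, granularity_ns)
    b = truncate(b_ns, granularity_ns)
    return (a > b) - (a < b)


c_ns = 100 * NS_PER_SEC + 250_000_000   # y.tab.c happened to be closed first
h_ns = 100 * NS_PER_SEC + 250_400_000   # y.tab.h closed 400 us later

by_seconds = compare_mtimes(c_ns, h_ns, NS_PER_SEC)   # same second: equal
by_nanos = compare_mtimes(c_ns, h_ns, 1)              # y.tab.c looks older
```

Swapping the order in which the two files are closed flips the sign of
the second result and leaves the first unchanged.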

My rule depends on the .ORDER statement actually working, and this
involves delicacies with timestamps and with not starting jobs before
old ones have finished.  For it to work, it is necessary for yacc to
create both files and to not be run again, and for make to not look
at the files' timestamps until the yacc job has finished (this is
generally necessary, since writes to the files before the job has
finished will make the files up to date according to a simple timestamp
check, but of course they are not up to date until the rule that builds
them has finished).  The above shows that there is a problem even when
the job has finished.  Writes normally mark mtimes for update, but
don't normally update them.  The update normally occurs on close() or
stat().  close() usually applies here, so make usually can't decide
the order accidentally by looking at the times using stat().  But we
get random times for y.tab.c and y.tab.h depending on which order yacc
closes them.  The rule that creates parse.c by copying y.tab.c to it
is a bit more deterministic.  After yacc finishes, y.tab.c and y.tab.h
have times in indeterminate order.  Then cp'ing y.tab.c gives it an
mtime later than or equal to that of both, modulo fs bugs (see below).

Touching the C file doesn't actually make it newer:
(1) With seconds granularity, it will usually make the C file have the
     same time.
(2) In exotic situations, the C file may be on a different file system
     with different timestamp granularity.  Since y.tab.* are in the same
     directory, this probably doesn't affect them.  It is less clear that
     it doesn't affect parse.c cp'ed from y.tab.c.  Times are usually
     rounded down.  With limited granularity, this normally results in
     a time that is older than the current time.  It may also be older
     than the time of the original copy if the original copy is on a file
     system with less limited granularity.
(3) In non-exotic situations, touch(1) and utimes(2) can easily have a
     different timestamp granularity than the file system.  It is now easy
     to arrange a setup in which touch(1)ing a file sets its timestamp
     backwards:
     - timestamps set by touch(1) are the current time rounded down to
       microseconds
     - use sysctl to set vfs.timestamp_precision to nanoseconds.  Fortunately,
       CLOCK_REALTIME_FAST_N_BROKEN is not an option for this, so we
       don't have to worry about the additional problems caused by it.
       It uses CLOCK_REALTIME and gives timestamps in nanoseconds.
     - write and close a file, setting its timestamp in nanoseconds,
       with the last 3 digits in tv_nsec fairly large (say 999)
     - use a fast system that can touch(1) the file in less than 999
       nanoseconds.  Since gettimeofday() uses the coherent clock
       CLOCK_REALTIME, it gives the current time rounded down to microseconds.
       It is only necessary to execute the kernel part of this in less
       than 999 nanoseconds (but I guess there are no systems fast enough to
       cause problems for make yet, since none are fast enough to start up
       the touch binary in 999 nanoseconds).  utimes() is limited to
       microseconds too, but this is not an additional restriction.
     After rounding down, the time goes backwards relative to the current
     time, and with sufficient speed, also relative to the file's old
     timestamp.
(4) If the .h file hasn't been closed or stat'ed after the final write to
     it (not a problem here), then its time may be in advance of the time
     set by touch, since it hasn't been updated yet.
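
The arithmetic behind (3) can be sketched in Python.  This is only a
model of rounding down to microseconds, with invented timestamps and a
hypothetical touch_time() helper; it does not call touch(1) or
utimes(2).

```python
# Model of microsecond truncation only; no real files are touched.
# touch_time() and the timestamps are hypothetical.
NS_PER_US = 1000


def touch_time(now_ns):
    """Time a timeval-based touch would store: now, rounded down to microseconds."""
    return now_ns - now_ns % NS_PER_US


# mtime set by close() with nanoseconds precision, 400 ns into a microsecond
file_mtime_ns = 1_350_000_000_000_000_400

# touch runs 300 ns later, still inside the same microsecond
fast_touch = touch_time(file_mtime_ns + 300)
fast_goes_backwards = fast_touch < file_mtime_ns

# touch runs 700 ns later, in the next microsecond
slow_touch = touch_time(file_mtime_ns + 700)
slow_goes_backwards = slow_touch < file_mtime_ns
```

The timestamp only goes backwards when the touch lands in the same
microsecond as the old mtime, which is why this needs a very fast
system.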

About file system coherency bugs.  touch(1) still uses gettimeofday(),
so it is as non-broken as above.  Very broken file systems hard-code
their timestamp function as something like getnanotime(), giving the
coherency bugs described below.  Most use vfs_timestamp().  This uses
time_second for seconds granularity and getnanotime() for 1/HZ
granularity.  These are based on CLOCK_REALTIME_FAST_N_BROKEN, so
they don't even round down.  They give times that are incoherent relative
to CLOCK_REALTIME, at least if both are rounded down.
CLOCK_REALTIME_FAST_N_BROKEN returns a time in the past relative to
CLOCK_REALTIME.  Thus if we use the more accurate times that it is possible
to see using clock_gettime(CLOCK_REALTIME, ...) or even gettimeofday(),
we can read and write incoherent timestamps.  E.g.,
- watch the time using clock_gettime(CLOCK_REALTIME, ...) until tv_nsec
   rolls over
- touch the file using this time (doesn't matter that we have to go back
   to microseconds resolution provided timestamp precision is lower)
- write and close or stat the file.  Its mtime will often go backwards
   because it is set from time_second or getnanotime() which are in the
   past.  We have up to about 1/HZ seconds to lose this race, so even slow
   1990's machines can exec a few binaries in time to see it.
This is a general problem with rounding down times, but if file
timestamps are rounded down consistently then it causes fewer problems.
Touching a file might leave it in the past relative to the current
time but not relative to another file, and make(1) mainly cares about
the latter.  For this, utimes() must round down the current time in
the same way that normal file timestamps do, and it doesn't.  It would
first have to round at all.  It does this accidentally to
microseconds as part of its API, so everything is accidentally
consistent if vfs.timestamp_precision gives microseconds for the file
times.  If vfs.timestamp_precision gave nanoseconds, then
clock_gettime() can be used to get coherent times, but utimes() just
can't handle the nanoseconds.  At lower precisions, the kernel could
round to the current vfs.timestamp_precision, but it is hard for it
to know if utimes() is asking for the correct timestamp, since there
are so many incoherent clocks to choose from.
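
The incoherency between the two kinds of clock can be modelled roughly
as follows.  This is a Python toy, not kernel code: HZ = 1000 is an
assumption, tick_clock() stands in for a clock that is only updated
every 1/HZ seconds, and utimes_time() stands in for the accurate clock
rounded down to microseconds.  All the times are invented.

```python
# Toy model of a tick-based clock lagging an accurate one.
# HZ, tick_clock() and utimes_time() are assumptions for illustration.
NS_PER_SEC = 1_000_000_000
NS_PER_US = 1000
HZ = 1000
TICK_NS = NS_PER_SEC // HZ


def tick_clock(t_ns):
    """Value of a clock last updated at the most recent 1/HZ tick."""
    return t_ns - t_ns % TICK_NS


def utimes_time(t_ns):
    """Accurate time rounded down to microseconds, as a timeval would hold it."""
    return t_ns - t_ns % NS_PER_US


t_touch = 50 * NS_PER_SEC + 600_000     # 0.6 ms into the current tick
t_write = t_touch + 200_000             # write and close 0.2 ms later

touched_mtime = utimes_time(t_touch)    # set from the accurate clock
written_mtime = tick_clock(t_write)     # set from the lagging clock

# The later operation stored the earlier timestamp.
went_backwards = written_mtime < touched_mtime
```

In this model the race is lost whenever the write lands in the same
tick as the touch, which is the "up to about 1/HZ seconds" window.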

>  2.  Apply the same fix to the other instance of .ORDER,
>      missed in the previous commit.

I didn't point out this bug since I don't agree with removing it.

With the dubious touch, it is needed to avoid endless dependency,

> Modified: head/share/mk/bsd.dep.mk
> ==============================================================================
> --- head/share/mk/bsd.dep.mk	Fri Oct 19 19:56:17 2012	(r241751)
> +++ head/share/mk/bsd.dep.mk	Fri Oct 19 20:13:08 2012	(r241752)
> @@ -95,16 +95,17 @@ CLEANFILES+= ${_LC}
> SRCS:=	${SRCS:S/${_YSRC}/${_YC}/}
> CLEANFILES+= ${_YC}
> .if !empty(YFLAGS:M-d) && !empty(SRCS:My.tab.h)
> -.ORDER: ${_YC} y.tab.h
> -${_YC} y.tab.h: ${_YSRC}
> +y.tab.h: ${_YSRC}
> 	${YACC} ${YFLAGS} ${.ALLSRC}
> +${_YC}: y.tab.h
> 	cp y.tab.c ${_YC}

There is no touch in this case.  It depends on cp updating the timestamp.
Now there is more chance of getting coherent timestamps and ones that are
actually different, because the touching is within the file system.  Hmm,
old versions of touch that didn't depend on utimes() but had to write the
file would have actually worked with nanoseconds timestamps.  touch(1)
also has a fallback to using utimes(pathname, NULL) to set the current
time.  With that, the kernel decides the time and nanoseconds might
actually work.  But normally utimes(pathname, tv) succeeds and doesn't
report loss of nanoseconds.

> CLEANFILES+= y.tab.c y.tab.h
> .elif !empty(YFLAGS:M-d)
> .for _YH in ${_YC:R}.h
> -${_YH}: ${_YC}
> -${_YC}: ${_YSRC}
> +${_YH}: ${_YSRC}
> 	${YACC} ${YFLAGS} -o ${_YC} ${.ALLSRC}
> +${_YC}: ${_YH}
> +	@touch ${.TARGET}

The important touch step is obfuscated with an @, unlike the cp step.

> SRCS+=	${_YH}
> CLEANFILES+= ${_YH}
> .endfor
>

Summary: this doesn't really fix the problem, and I don't like it.

Bruce


