Date: Sat, 20 Oct 2012 20:49:24 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Marcel Moolenaar <marcel@freebsd.org>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r241752 - head/share/mk
Message-ID: <20121020190530.P1781@besplex.bde.org>
In-Reply-To: <201210192013.q9JKD8si069344@svn.freebsd.org>
References: <201210192013.q9JKD8si069344@svn.freebsd.org>

On Fri, 19 Oct 2012, Marcel Moolenaar wrote:

> Log:
>   Improve upon the previous commit to fix the yacc rule.

Sorry I didn't complete replying to the previous thread on this. You
never said what behaviour in bmake causes more problems than before.
There are certainly race cases, but the only ones that I'm sure are
not just bugs in make(1) are for exotic cases which were just
completely broken without my rules.

One of these cases is when someone deletes y.tab.h (where it is used
without renaming it). Then make sees y.tab.c (or parse.c etc.) and
y.tab.h both depending on parse.y. y.tab.c is up to date, so make
doesn't rebuild it, and may run a command that uses it. y.tab.h is
out of date since it doesn't exist, so make rebuilds it. This
rebuilding clobbers y.tab.c, unknown to make, and any concurrent use
of y.tab.c sees the clobbered copy. The ordering requirement is
'.ORDER: y.tab.c y.tab.h', so it is satisfied by y.tab.c being up to
date before y.tab.h is built. The reverse order might work
accidentally, but would cause the same problem if someone deletes
y.tab.h instead of y.tab.c.

y.tab.o should depend on y.tab.h via .depend, and this should prevent
the 'cc -c' concurrent use of y.tab.c in some cases, so even this
exotic case will usually not race.

>  1. Have the resulting C file depend on the resulting H
>     file as it should be. Touch the C file to make sure

As it shouldn't be. The C file doesn't depend on the H file.

>     the C file is newer than the H file to keep make
>     happy.

The .ORDER sort of asks for the reverse. It says that y.tab.c must be
built before y.tab.h. If they were built separately then it would
always be older and timestamps might be fine-grained enough to see the
difference. When they are built together, their timestamps are
determined by the order in which yacc writes them and the order in
which make or something else stat()s them. The order could be
anything.

Perhaps this is the problem. Old make seems to still only use mtimes
in seconds.
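
To make the granularity point concrete, here is a minimal sketch of a
make-style staleness check. It is an illustration only: out_of_date()
and the mtime values are hypothetical, and the "strictly older" test is
a simplified assumption about how make compares times.

```python
NS_PER_SEC = 1_000_000_000

def out_of_date(target_ns, source_ns, granularity_ns=1):
    """Simplified make-style check (hypothetical helper, not make's code):
    the target is stale if it is strictly older than its source once both
    mtimes are reduced to the given granularity."""
    return target_ns // granularity_ns < source_ns // granularity_ns

# Assumed mtimes: yacc closes y.tab.c 0.5 ms before y.tab.h, same second.
c_mtime = 1_350_726_564 * NS_PER_SEC + 100_000_000
h_mtime = c_mtime + 500_000

# With a rule like "${_YC}: y.tab.h", the C file is the target and the
# H file is the source.
print(out_of_date(c_mtime, h_mtime, NS_PER_SEC))  # seconds granularity
print(out_of_date(c_mtime, h_mtime, 1))           # nanosecond granularity
```

With seconds granularity the two files compare equal and nothing is
rebuilt; with nanosecond granularity the same pair of files looks
stale, purely because of the order in which they were closed.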
Any make that uses mtimes with finer granularity would find the times
different more often and then do bad things if they are different in a
certain order. This depends on the filesystem actually supporting
mtimes in finer granularity than seconds. Some versions of this
support are broken and the breakage may matter here (see below).

My rule depends on the .ORDER statement actually working, and this
involves delicacies with timestamps and with not starting jobs before
old ones have finished. For it to work, it is necessary for yacc to
create both files and to not be run again, and for make to not look at
the files' timestamps until the yacc job has finished (this is
generally necessary, since writes to the files before the job has
finished will make the files up to date according to a simple
timestamp check, but of course they are not up to date until the rule
that builds them has finished).

The above shows that there is a problem even when the job has
finished. Writes normally mark mtimes for update, but don't normally
update them. The update normally occurs on close() or stat(). close()
usually applies here, so make usually can't decide the order
accidentally by looking at the times using stat(). But we get random
times for y.tab.c and y.tab.h depending on which order yacc closes
them.

The rule that creates parse.c by copying y.tab.c to it is a bit more
deterministic. After yacc finishes, y.tab.c and y.tab.h have times in
indeterminate order. Then cp'ing y.tab.c gives the copy an mtime later
than or equal to that of both, modulo fs bugs (see below).

Touching the C file doesn't actually make it newer:

(1) With seconds granularity, it will usually make the C file have the
same time.

(2) In exotic situations, the C file may be on a different file system
with different timestamp granularity. Since y.tab.* are in the same
directory, this probably doesn't affect them. It is less clear that
it doesn't affect parse.c cp'ed from y.tab.c. Times are usually
rounded down.
With limited granularity, this normally results in a time that is
older than the current time. It may also be older than the time of
the original copy if the original copy is on a file system with less
limited granularity.

(3) In non-exotic situations, touch(1) and utimes(2) can easily have a
different timestamp granularity than the file system. It is now easy
to arrange a setup in which touch(1)ing a file sets its timestamp
backwards:
- timestamps set by touch(1) are the current time rounded down to
  microseconds
- set sysctl vfs.timestamp_precision to nanoseconds. Fortunately,
  CLOCK_REALTIME_FAST_N_BROKEN is not an option for this, so we don't
  have to worry about the additional problems caused by it. It uses
  CLOCK_REALTIME and gives timestamps in nanoseconds.
- write and close a file, setting its timestamp in nanoseconds, with
  the last 3 digits in tv_nsec fairly large (say 999)
- use a fast system that can touch(1) the file in less than 999
  nanoseconds. Since gettimeofday() uses the coherent clock
  CLOCK_REALTIME, it gives the current time rounded down to
  microseconds. It is only necessary to execute the kernel part of
  this in less than 999 nanoseconds (but I guess there are no systems
  fast enough to cause problems for make yet, since none are fast
  enough to start up the touch binary in 999 nanoseconds). utimes() is
  limited to microseconds too, but this is not an additional
  restriction.
After rounding down, the time goes backwards relative to the current
time, and with sufficient speed, also relative to the file's old
timestamp.

(4) If the .h file hasn't been closed or stat'ed after the final write
to it (not a problem here), then its time may be in advance of the
time set by touch, since it hasn't been updated yet.

About file system coherency bugs. touch(1) still uses gettimeofday(),
so it is as non-broken as above. Very broken file systems hard-code
their timestamp function as something like getnanotime(), giving the
coherency bugs described below.
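
As an aside on case (3) above, the arithmetic can be sketched in a few
lines. This is only an illustration: utimes_truncate() is a
hypothetical stand-in for the loss of precision in a struct timeval
interface, and the timestamp values are made up. It assumes the touch
samples the clock within the same microsecond in which the file was
last stamped.

```python
def utimes_truncate(now_ns):
    """Drop the sub-microsecond digits, as an interface limited to
    microseconds would (hypothetical helper, not a real API)."""
    return now_ns - now_ns % 1000

# Assumed values: the file was closed with a nanosecond mtime ending in
# ...400, and touch samples the clock 300 ns later, still inside the
# same microsecond.
file_mtime_ns = 1_350_726_564_123_456_400
touch_now_ns = file_mtime_ns + 300
new_mtime_ns = utimes_truncate(touch_now_ns)

# A negative difference means the touch moved the mtime backwards.
print(new_mtime_ns - file_mtime_ns)
```

If the touch lands in a later microsecond instead, the truncated time
is still ahead of the old mtime, so the effect needs both a
nanosecond-precision file system and a very fast touch.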
Most use vfs_timestamp(). This uses time_second for seconds
granularity and getnanotime() for 1/HZ granularity. These are based
on CLOCK_REALTIME_FAST_N_BROKEN, so they don't even round down. They
give times that are incoherent relative to CLOCK_REALTIME, at least if
both are rounded down. CLOCK_REALTIME_FAST_N_BROKEN returns a time in
the past relative to CLOCK_REALTIME. Thus if we use more accurate
times that are possible to see using clock_gettime(CLOCK_REALTIME,
...) or even gettimeofday(), we can read and write incoherent
timestamps. E.g.,
- watch the time using clock_gettime(CLOCK_REALTIME, ...) until
  tv_nsec rolls over
- touch the file using this time (doesn't matter that we have to go
  back to microseconds resolution provided timestamp precision is
  lower)
- write and close or stat the file. Its mtime will often go backwards
  because it is set from time_second or getnanotime() which are in the
  past.
We have up to about 1/HZ seconds to lose this race, so even slow
1990's machines can exec a few binaries in time to see it.

This is a general problem with rounding down times, but if file
timestamps are rounded down consistently then it causes fewer
problems. Touching a file might leave it in the past relative to the
current time but not relative to another file, and make(1) mainly
cares about the latter. For this, utimes() must round down the
current time in the same way that normal file timestamps do, and it
doesn't. It would first have to round at all. It does this
accidentally to microseconds as part of its API, so everything is
accidentally consistent if vfs.timestamp_precision gives microseconds
for the file times. If vfs.timestamp_precision gave nanoseconds, then
clock_gettime() can be used to get coherent times, but utimes() just
can't handle the nanoseconds.
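
The race in the list above, and the reason consistent rounding avoids
it, can be sketched as follows. This is a simplified model, not kernel
code: lagging_clock() treats a getnanotime()-style clock as "the time
of the most recent tick", and HZ and both event times are assumed
values.

```python
HZ = 1000                        # assumed tick rate
TICK_NS = 1_000_000_000 // HZ    # 1 ms per tick

def precise_clock(t_ns):
    """Stand-in for clock_gettime(CLOCK_REALTIME): the time as is."""
    return t_ns

def lagging_clock(t_ns):
    """Stand-in for a getnanotime()-style clock, modelled as the time of
    the most recent tick (a simplification of the real behaviour)."""
    return t_ns - t_ns % TICK_NS

t_touch = 5 * TICK_NS + 200_000  # touch happens 0.2 ms into a tick
t_write = t_touch + 300_000      # write and close 0.3 ms later, same tick

# Mixed clocks: touch stamps with the precise clock, the later write is
# stamped with the lagging one, so the mtime can go backwards.
mixed_backwards = lagging_clock(t_write) < precise_clock(t_touch)

# Consistent rounding: both stamps come from the lagging clock, so the
# later event can never be stamped earlier than the first.
consistent_backwards = lagging_clock(t_write) < lagging_clock(t_touch)

print(mixed_backwards, consistent_backwards)
```

Rounding down is monotonic, so two stamps taken from the same rounded
clock never swap order; the inversion only appears when the two stamps
come from clocks that round differently.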
At lower precisions, the kernel could round to the current
vfs.timestamp_precision, but it is hard for it to know if utimes() is
asking for the correct timestamp, since there are so many incoherent
clocks to choose from.

>  2. Apply the same fix to the other instance of .ORDER,
>     missed in the previous commit.

I didn't point out this bug since I don't agree with removing it.
With the dubious touch, it is needed to avoid endless dependency.

> Modified: head/share/mk/bsd.dep.mk
> ==============================================================================
> --- head/share/mk/bsd.dep.mk	Fri Oct 19 19:56:17 2012	(r241751)
> +++ head/share/mk/bsd.dep.mk	Fri Oct 19 20:13:08 2012	(r241752)
> @@ -95,16 +95,17 @@ CLEANFILES+= ${_LC}
>  SRCS:=	${SRCS:S/${_YSRC}/${_YC}/}
>  CLEANFILES+= ${_YC}
>  .if !empty(YFLAGS:M-d) && !empty(SRCS:My.tab.h)
> -.ORDER: ${_YC} y.tab.h
> -${_YC} y.tab.h: ${_YSRC}
> +y.tab.h: ${_YSRC}
>  	${YACC} ${YFLAGS} ${.ALLSRC}
> +${_YC}: y.tab.h
>  	cp y.tab.c ${_YC}

There is no touch in this case. It depends on cp updating the
timestamp. Now there is more chance of getting coherent timestamps
and ones that are actually different, because the touching is within
the file system.

Hmm, old versions of touch that didn't depend on utimes() but had to
write the file would have actually worked with nanoseconds timestamps.
touch(1) also has a fallback to using utimes(pathname, NULL) to set
the current time. With that, the kernel decides the time and
nanoseconds might actually work. But normally utimes(pathname, tv)
succeeds and doesn't report loss of nanoseconds.

>  CLEANFILES+= y.tab.c y.tab.h
>  .elif !empty(YFLAGS:M-d)
>  .for _YH in ${_YC:R}.h
> -${_YH}: ${_YC}
> -${_YC}: ${_YSRC}
> +${_YH}: ${_YSRC}
>  	${YACC} ${YFLAGS} -o ${_YC} ${.ALLSRC}
> +${_YC}: ${_YH}
> +	@touch ${.TARGET}

The important touch step is obfuscated with an @, unlike the cp step.

>  SRCS+=	${_YH}
>  CLEANFILES+= ${_YH}
>  .endfor
>

Summary: this doesn't really fix the problem, and I don't like it.

Bruce