Date: Sat, 2 Jun 2007 00:01:00 GMT
From: Dieter <freebsd@sopwith.solgatos.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: bin/113239: atrun(8) loses jobs due to race condition
Message-ID: <200706020001.l52010tn019674@www.freebsd.org>
Resent-Message-ID: <200706020010.l520A4N0097757@freefall.freebsd.org>

>Number:         113239
>Category:       bin
>Synopsis:       atrun(8) loses jobs due to race condition
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jun 02 00:10:03 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Dieter
>Release:        6.2
>Organization:
>Environment:
6.2-RELEASE amd64
>Description:
Due to a race condition, atrun(8) can unlink a job before it is executed.
This can result in lost data.
>How-To-Repeat:
Put a sleep in to emulate something (fork() perhaps) taking a long time.
Set up an at job and execute atrun.  Execute atrun a second time before
the sleep returns.  Observe that your at job did not get executed, see
error message in syslog.

The patch file has code to demo the problem.
>Fix:
I have a workaround.  Only unlink the file if it is more than 6 hours old.
Strictly speaking this is not a true fix, the race condition is still
present, but if fork is taking 6 hours you have other problems.

The patch file implements this workaround.

Patch attached with submission follows:

===================================================================
RCS file: RCS/atrun.c,v
retrieving revision 1.1
diff -r1.1 atrun.c
83a84,88
> /* Workaround for race condition: only unlink file if it is
>  * older than 6 hours.
>  */
> #define MIN_UNLINK_TIME 60*60*6 /* Number of seconds in 6 hours */
>
143a149,161
> #if 0
> /* If something takes too long and another instance of
>  * atrun starts up, it will unlink our file out from
>  * under us.  To demonstrate this race condition,
>  * enable the sleep, set MIN_UNLINK_TIME to 0, create
>  * an at job ("echo hello" is sufficient) and have atrun
>  * run more frequently than the sleep time.  The 70 second
>  * sleep assumes atrun is run from cron once a minute.
>  */
> syslog(LOG_DEBUG, "Sleeping to trigger race condition, file=%s\n", filename);
> sleep(70);
> #endif
>
179c197
< perr("cannot open input file");
---
> syslog(LOG_ERR, "Cannot open input file %s : %m\n", filename);
479a498,500
>  *
>  * Workaround for race condition: only unlink file if it is
>  * older than MIN_UNLINK_TIME seconds.
481c502,504
< if ((run_time < now) && !(S_IXUSR & buf.st_mode) && (S_IRUSR & buf.st_mode))
---
> if (( (run_time + MIN_UNLINK_TIME) < now) && !(S_IXUSR & buf.st_mode) && (S_IRUSR & buf.st_mode))
> {
> syslog(LOG_DEBUG, "Unlinking %s run_time=%ld now=%ld\n", dirent->d_name, run_time, now);
482a506
> }

>Release-Note:
>Audit-Trail:
>Unformatted:
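
For readers who want to see the patched unlink condition in isolation, here is a minimal sketch in C. It is an illustration only, not code from atrun.c: the helper name may_unlink_job() is hypothetical, and the macro is parenthesized here (the patch defines it without parentheses) so that it is safe in any expression. The test itself mirrors the patched `if`: the file is removed only when its run time is more than MIN_UNLINK_TIME seconds in the past, the owner-execute bit is clear, and the owner-read bit is set.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Six hours in seconds; parenthesized, unlike the macro in the patch. */
#define MIN_UNLINK_TIME (60 * 60 * 6)

/*
 * Hypothetical helper, not present in atrun.c.
 *
 * Returns true when a job file may be unlinked under the workaround:
 *   - its scheduled run time is more than MIN_UNLINK_TIME seconds ago,
 *   - the owner-execute bit is NOT set, and
 *   - the owner-read bit IS set.
 * The mode tests are copied from the condition shown in the patch.
 */
static bool
may_unlink_job(time_t run_time, time_t now, mode_t mode)
{
	return (run_time + MIN_UNLINK_TIME < now) &&
	    !(mode & S_IXUSR) && (mode & S_IRUSR);
}
```

With this predicate, a second atrun instance that starts 70 seconds after the first (the scenario in the demo code) would leave the job file alone, because 70 seconds is far below the six-hour threshold. As the report says, this only narrows the window; it does not remove the race.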