Date: Sat, 2 Jun 2007 00:01:00 GMT
From: Dieter <freebsd@sopwith.solgatos.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: bin/113239: atrun(8) loses jobs due to race condition
Message-ID: <200706020001.l52010tn019674@www.freebsd.org>
Resent-Message-ID: <200706020010.l520A4N0097757@freefall.freebsd.org>
>Number: 113239
>Category: bin
>Synopsis: atrun(8) loses jobs due to race condition
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Jun 02 00:10:03 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator: Dieter
>Release: 6.2
>Organization:
>Environment:
6.2-RELEASE amd64
>Description:
Due to a race condition, one instance of atrun(8) can unlink a job file
that another instance has already picked up but not yet executed.
The job is then never run, which can result in lost data.
>How-To-Repeat:
Insert a sleep() into atrun to emulate something (fork() perhaps)
taking a long time. Set up an at job and run atrun. Run atrun
a second time before the sleep returns. Observe that the at
job was not executed and that an error message appears in syslog.
The patch file contains code to demonstrate the problem.
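
For readers without an atrun source tree at hand, the sequence above can
be imitated in isolation. The following is only a minimal sketch, not
code from atrun.c: the helper name simulate_lost_job() and the path passed
to it are hypothetical, and a forked child stands in for the second atrun
instance. The parent plays the first instance, which has selected the job
but has not yet opened the file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of atrun.c).
 * Returns 0 if the job file could still be opened, the errno from
 * open() if it could not, or -1 if the demonstration itself failed.
 */
int simulate_lost_job(const char *path)
{
    int fd, status;
    pid_t pid;

    /* Create an empty stand-in for a spooled at job. */
    fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    close(fd);

    pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* Child: acts as the second atrun and removes the job file. */
        _exit(unlink(path) == 0 ? 0 : 1);
    }

    /* Parent: the "slow" first atrun; wait until the child is done. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* Only now does the first instance try to open its input file. */
    fd = open(path, O_RDONLY);
    if (fd >= 0) {
        close(fd);
        unlink(path);
        return 0;
    }
    return errno;
}
```

Calling simulate_lost_job() with a scratch path such as
"/tmp/atrun_race_demo_job" should return ENOENT, which corresponds to the
"cannot open input file" error that atrun reports when it loses the race.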
>Fix:
I have a workaround: only unlink the file if its scheduled run time
is more than 6 hours in the past. Strictly speaking this is not a true
fix, since the race condition is still present, but if fork() is taking
6 hours you have other problems.
The patch file implements this workaround.
Patch attached with submission follows:
===================================================================
RCS file: RCS/atrun.c,v
retrieving revision 1.1
diff -r1.1 atrun.c
83a84,88
> /* Workaround for race condition: only unlink file if it is
> * older than 6 hours.
> */
> #define MIN_UNLINK_TIME (60*60*6) /* Number of seconds in 6 hours */
>
143a149,161
> #if 0
> /* If something takes too long and another instance of
> * atrun starts up, it will unlink our file out from
> * under us. To demonstrate this race condition,
> * enable the sleep, set MIN_UNLINK_TIME to 0, create
> * an at job ("echo hello" is sufficient) and have atrun
> * run more frequently than the sleep time. The 70 second
> * sleep assumes atrun is run from cron once a minute.
> */
> syslog(LOG_DEBUG, "Sleeping to trigger race condition, file=%s\n", filename);
> sleep(70);
> #endif
>
179c197
< perr("cannot open input file");
---
> syslog(LOG_ERR, "Cannot open input file %s : %m\n", filename);
479a498,500
> *
> * Workaround for race condition: only unlink file if it is
> * older than MIN_UNLINK_TIME seconds.
481c502,504
< if ((run_time < now) && !(S_IXUSR & buf.st_mode) && (S_IRUSR & buf.st_mode))
---
> if (( (run_time + MIN_UNLINK_TIME) < now) && !(S_IXUSR & buf.st_mode) && (S_IRUSR & buf.st_mode))
> {
> syslog(LOG_DEBUG, "Unlinking %s run_time=%ld now=%ld\n", dirent->d_name, run_time, now);
482a506
> }
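
The changed condition in the patch above can also be read as a standalone
predicate. The sketch below is an illustration only: the function name
should_unlink() is hypothetical and does not exist in atrun.c, and the
parentheses around the macro body are a defensive addition. The test
itself mirrors the patched if statement: the job's run time must lie more
than MIN_UNLINK_TIME seconds in the past, the owner-execute bit must be
clear, and the owner-read bit must be set.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Number of seconds in 6 hours (parenthesized here as a precaution). */
#define MIN_UNLINK_TIME (60 * 60 * 6)

/*
 * Hypothetical helper (not part of atrun.c): decide whether a job
 * file is old enough to be removed under the workaround.
 */
bool should_unlink(time_t run_time, time_t now, mode_t mode)
{
    return (run_time + MIN_UNLINK_TIME) < now   /* grace period over */
        && !(mode & S_IXUSR)                    /* not marked runnable */
        && (mode & S_IRUSR);                    /* readable by owner */
}
```

With run_time = 1000, a call at now = 1060 returns false, so a job that
became due a minute ago is left alone. A call at now = 1000 +
MIN_UNLINK_TIME + 1 with mode S_IRUSR returns true. Because the
comparison is strict, now = 1000 + MIN_UNLINK_TIME still returns false.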
>Release-Note:
>Audit-Trail:
>Unformatted:
