From owner-freebsd-bugs@FreeBSD.ORG Sat Jun 2 00:10:04 2007
Return-Path:
X-Original-To: freebsd-bugs@hub.freebsd.org
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 586FC16A421
	for ; Sat, 2 Jun 2007 00:10:04 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [69.147.83.40])
	by mx1.freebsd.org (Postfix) with ESMTP id 394DA13C45D
	for ; Sat, 2 Jun 2007 00:10:04 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id l520A4jM097763
	for ; Sat, 2 Jun 2007 00:10:04 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.13.4/8.13.4/Submit) id l520A4N0097757;
	Sat, 2 Jun 2007 00:10:04 GMT
	(envelope-from gnats)
Resent-Date: Sat, 2 Jun 2007 00:10:04 GMT
Resent-Message-Id: <200706020010.l520A4N0097757@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Dieter
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id C6E6D16A400
	for ; Sat, 2 Jun 2007 00:01:02 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [69.147.83.33])
	by mx1.freebsd.org (Postfix) with ESMTP id 9B95313C468
	for ; Sat, 2 Jun 2007 00:01:02 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.13.1/8.13.1) with ESMTP id l52010ol019675
	for ; Sat, 2 Jun 2007 00:01:00 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.13.1/8.13.1/Submit) id l52010tn019674;
	Sat, 2 Jun 2007 00:01:00 GMT
	(envelope-from nobody)
Message-Id: <200706020001.l52010tn019674@www.freebsd.org>
Date: Sat, 2 Jun 2007 00:01:00 GMT
From: Dieter
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.0
Cc:
Subject: bin/113239: atrun(8) loses jobs due to race condition
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 02 Jun 2007 00:10:04 -0000

>Number:         113239
>Category:       bin
>Synopsis:       atrun(8) loses jobs due to race condition
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jun 02 00:10:03 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Dieter
>Release:        6.2
>Organization:
>Environment:
6.2-RELEASE amd64
>Description:
Due to a race condition, atrun(8) can unlink a job before it is executed.
This can result in lost data.
>How-To-Repeat:
Put a sleep in to emulate something (fork() perhaps) taking a long time.
Set up an at job and execute atrun.
Execute atrun a second time before the sleep returns.
Observe that your at job did not get executed, see error message in syslog.

The patch file has code to demo the problem.
>Fix:
I have a workaround. Only unlink the file if it is more than 6 hours old.
Strictly speaking this is not a true fix, the race condition is still
present, but if fork is taking 6 hours you have other problems.

The patch file implements this workaround.

Patch attached with submission follows:

===================================================================
RCS file: RCS/atrun.c,v
retrieving revision 1.1
diff -r1.1 atrun.c
83a84,88
> /* Workaround for race condition: only unlink file if it is
>  * older than 6 hours.
>  */
> #define MIN_UNLINK_TIME 60*60*6 /* Number of seconds in 6 hours */
>
143a149,161
> #if 0
> /* If something takes too long and another instance of
>  * atrun starts up, it will unlink our file out from
>  * under us. To demonstrate this race condition,
>  * enable the sleep, set MIN_UNLINK_TIME to 0, create
>  * an at job ("echo hello" is sufficient) and have atrun
>  * run more frequently than the sleep time. The 70 second
>  * sleep assumes atrun is run from cron once a minute.
>  */
> syslog(LOG_DEBUG, "Sleeping to trigger race condition, file=%s\n", filename);
> sleep(70);
> #endif
>
179c197
< perr("cannot open input file");
---
> syslog(LOG_ERR, "Cannot open input file %s : %m\n", filename);
479a498,500
>  *
>  * Workaround for race condition: only unlink file if it is
>  * older than MIN_UNLINK_TIME seconds.
481c502,504
< if ((run_time < now) && !(S_IXUSR & buf.st_mode) && (S_IRUSR & buf.st_mode))
---
> if (( (run_time + MIN_UNLINK_TIME) < now) && !(S_IXUSR & buf.st_mode) && (S_IRUSR & buf.st_mode))
> {
> syslog(LOG_DEBUG, "Unlinking %s run_time=%ld now=%ld\n", dirent->d_name, run_time, now);
482a506
> }
>Release-Note:
>Audit-Trail:
>Unformatted:
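
For readers who want to check the workaround's condition in isolation, here is a minimal sketch of the unlink test from the patch, pulled out into a standalone function. The helper name job_is_stale() is hypothetical and does not exist in atrun.c; the macro is also parenthesized here, which the patch does not do, so that it stays correct if used in other expressions. This only illustrates the grace-period check. It does not remove the race, as the report itself notes.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Grace period from the patch: six hours, in seconds.
 * Parenthesized here (an addition, not in the patch) for macro safety. */
#define MIN_UNLINK_TIME (60 * 60 * 6)

/* Hypothetical helper, not part of atrun.c: returns true when a job file
 * may be unlinked under the workaround, i.e. its scheduled run time is
 * more than MIN_UNLINK_TIME seconds in the past and its mode is
 * owner-readable but not owner-executable (the same mode test the
 * original condition uses). */
static bool
job_is_stale(time_t run_time, time_t now, mode_t mode)
{
	return (run_time + MIN_UNLINK_TIME) < now &&
	    !(S_IXUSR & mode) && (S_IRUSR & mode);
}
```

With the unpatched condition (run_time < now), a job file that a slow first atrun instance has already claimed could be unlinked by a second instance started 60 seconds later; with the six-hour grace period, job_is_stale() returns false for that file until long after the first instance should have opened it.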