From owner-freebsd-bugs@FreeBSD.ORG Sun Oct 28 15:10:01 2007
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D427316A41B
	for ; Sun, 28 Oct 2007 15:10:01 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 8B5A113C4BC
	for ; Sun, 28 Oct 2007 15:10:01 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.1/8.14.1) with ESMTP id l9SFA1sp001118
	for ; Sun, 28 Oct 2007 15:10:01 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.1/8.14.1/Submit) id l9SFA1pX001117;
	Sun, 28 Oct 2007 15:10:01 GMT
	(envelope-from gnats)
Resent-Date: Sun, 28 Oct 2007 15:10:01 GMT
Resent-Message-Id: <200710281510.l9SFA1pX001117@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Danny Braniss
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C1D8116A41A
	for ; Sun, 28 Oct 2007 15:06:45 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from cs1.cs.huji.ac.il (cs1.cs.huji.ac.il [132.65.16.10])
	by mx1.freebsd.org (Postfix) with ESMTP id 7A09F13C4B2
	for ; Sun, 28 Oct 2007 15:06:45 +0000 (UTC)
	(envelope-from danny@cs.huji.ac.il)
Received: from sunfire.cs.huji.ac.il ([132.65.16.80])
	by cs1.cs.huji.ac.il with esmtp id 1Im9it-000NbM-S9
	for FreeBSD-gnats-submit@freebsd.org; Sun, 28 Oct 2007 17:06:43 +0200
Received: from danny by sunfire.cs.huji.ac.il with local (Exim 4.68 (FreeBSD))
	(envelope-from ) id 1Im9it-0000YS-ON
	for FreeBSD-gnats-submit@freebsd.org; Sun, 28 Oct 2007 17:06:43 +0200
Message-Id:
Date: Sun, 28 Oct 2007 17:06:43
 +0200
From: Danny Braniss
To: FreeBSD-gnats-submit@FreeBSD.org
X-Send-Pr-Version: 3.113
Cc:
Subject: misc/117603: dump(8) hangs on SMP - 4way and higher.
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Danny Braniss
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 28 Oct 2007 15:10:01 -0000

>Number:         117603
>Category:       misc
>Synopsis:       dump(8) hangs on SMP - 4way and higher.
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Oct 28 15:10:01 UTC 2007
>Closed-Date:
>Last-Modified:
>Originator:     Danny Braniss
>Release:        FreeBSD 7.0-BETA1 amd64
>Organization:
>Environment:
System: FreeBSD sunfire 7.0-BETA1 FreeBSD 7.0-BETA1 #1: Sat Oct 20 16:30:43 IST 2007 danny@sunfire:/r+d/obj/sunfire/r+d/7.0/src/sys/HUJI amd64

>Description:
	dump creates 4 processes, 3 of which read from disk and, via some
	synchronization, write sequentially to the tape/file.
	The method used to synchronize these 'slaves' worked fine on older,
	slower, non-SMP hosts. On a dual-CPU, dual-core host it hangs very
	frequently.

>How-To-Repeat:
	dump 0aLf /some/file /

>Fix:
	Patch follows.

--- tape.c.orig	2005-03-02 04:30:08.000000000 +0200
+++ tape.c	2007-10-28 16:17:46.728015000 +0200
@@ -109,11 +109,8 @@
 
 int master;		/* pid of master, for sending error signals */
 int tenths;		/* length of tape used per block written */
+
 static volatile sig_atomic_t caught; /* have we caught the signal to proceed? */
-static volatile sig_atomic_t ready; /* reached the lock point without having */
-			/* received the SIGUSR2 signal from the prev slave? */
-static jmp_buf jmpbuf;	/* where to jump to if we are ready when the */
-			/* SIGUSR2 arrives from the previous slave */
 
 int
 alloctape(void)
@@ -685,15 +682,13 @@
 void
 proceed(int signo __unused)
 {
-
-	if (ready)
-		longjmp(jmpbuf, 1);
 	caught++;
 }
 
 void
 enslave(void)
 {
+	sigset_t s_mask;
 	int cmd[2];
 	int i, j;
 
@@ -704,6 +699,10 @@
 	signal(SIGUSR1, tperror);    /* Slave sends SIGUSR1 on tape errors */
 	signal(SIGUSR2, proceed);    /* Slave sends SIGUSR2 to next slave */
 
+	sigemptyset(&s_mask);
+	sigaddset(&s_mask, SIGUSR2);
+	sigprocmask(SIG_BLOCK, &s_mask, NULL);
+
 	for (i = 0; i < SLAVES; i++) {
 		if (i == slp - &slaves[0]) {
 			caught = 1;
@@ -793,12 +792,8 @@
 				quit("master/slave protocol botched.\n");
 			}
 		}
-		if (setjmp(jmpbuf) == 0) {
-			ready = 1;
-			if (!caught)
-				(void) pause();
-		}
-		ready = 0;
+		if(!caught)
+			sigsuspend(0);
 		caught = 0;
 
 		/* Try to write the data... */

>Release-Note:
>Audit-Trail:
>Unformatted: