Date: Sun, 28 Oct 2007 17:06:43 +0200
From: Danny Braniss <danny@cs.huji.ac.il>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: misc/117603: dump(8) hangs on SMP - 4way and higher.
Message-ID: <E1Im9it-0000YS-ON@sunfire.cs.huji.ac.il>
Resent-Message-ID: <200710281510.l9SFA1pX001117@freefall.freebsd.org>

>Number:         117603
>Category:       misc
>Synopsis:       dump(8) hangs on SMP - 4way and higher.
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Oct 28 15:10:01 UTC 2007
>Closed-Date:
>Last-Modified:
>Originator:     Danny Braniss
>Release:        FreeBSD 7.0-BETA1 amd64
>Organization:
>Environment:
System: FreeBSD sunfire 7.0-BETA1 FreeBSD 7.0-BETA1 #1: Sat Oct 20 16:30:43 IST 2007 danny@sunfire:/r+d/obj/sunfire/r+d/7.0/src/sys/HUJI amd64

>Description:
	dump creates 4 processes, 3 of which read from disk and, via
	signal-based synchronization, take turns writing sequentially
	to the tape/file. The method used to synchronize these 'slaves'
	worked fine on older, slower, non-SMP hosts. On a dual-CPU,
	dual-core host it hangs very frequently.
>How-To-Repeat:
	dump 0aLf /some/file /
>Fix:
	patch follows.

--- tape.c.orig	2005-03-02 04:30:08.000000000 +0200
+++ tape.c	2007-10-28 16:17:46.728015000 +0200
@@ -109,11 +109,8 @@
 
 int master;		/* pid of master, for sending error signals */
 int tenths;		/* length of tape used per block written */
+
 static volatile sig_atomic_t caught; /* have we caught the signal to proceed? */
-static volatile sig_atomic_t ready; /* reached the lock point without having */
-			/* received the SIGUSR2 signal from the prev slave? */
-static jmp_buf jmpbuf;	/* where to jump to if we are ready when the */
-			/* SIGUSR2 arrives from the previous slave */
 
 int
 alloctape(void)
@@ -685,15 +682,13 @@
 void
 proceed(int signo __unused)
 {
-
-	if (ready)
-		longjmp(jmpbuf, 1);
 	caught++;
 }
 
 void
 enslave(void)
 {
+	sigset_t s_mask;
 	int cmd[2];
 	int i, j;
 
@@ -704,6 +699,10 @@
 	signal(SIGUSR1, tperror);    /* Slave sends SIGUSR1 on tape errors */
 	signal(SIGUSR2, proceed);    /* Slave sends SIGUSR2 to next slave */
 
+	sigemptyset(&s_mask);
+	sigaddset(&s_mask, SIGUSR2);
+	sigprocmask(SIG_BLOCK, &s_mask, NULL);
+
 	for (i = 0; i < SLAVES; i++) {
 		if (i == slp - &slaves[0]) {
 			caught = 1;
@@ -793,12 +792,8 @@
 					quit("master/slave protocol botched.\n");
 			}
 		}
-		if (setjmp(jmpbuf) == 0) {
-			ready = 1;
-			if (!caught)
-				(void) pause();
-		}
-		ready = 0;
+		if(!caught)
+			sigsuspend(0);
 		caught = 0;
 
 		/* Try to write the data... */
>Release-Note:
>Audit-Trail:
>Unformatted: