FreeBSD Mail Archives

Date:      Mon, 01 Sep 2008 23:28:48 +0200
From:      Michael <freebsdports@bindone.de>
To:        Mike Tancsa <mike@sentex.net>
Cc:        Derek Kuliński <takeda@takeda.tk>, freebsd-stable@freebsd.org
Subject:   Re: bin/121684: : dump(8) frequently hangs
Message-ID:  <48BC5E90.1060900@bindone.de>
In-Reply-To: <200809011336.m81Da5BT046532@lava.sentex.ca>
References:  <48BB4FA6.2090708@bindone.de> <1188558750.20080901020711@takeda.tk> <200809011336.m81Da5BT046532@lava.sentex.ca>

This is a multi-part message in MIME format.
--------------080603060905030707020102
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit

Sorry to scare you; I was a little unhappy about dump hanging.
Based on the CVS repository, I wrote a little patch that combines two
changes made by scott and jeff and that works against 7.0-RELEASE (looking
at the commit logs scared me away from trying STABLE on production right
now). After applying this patch I could run dumps successfully on seven
machines where it had hung before on every single attempt (use is at your
own risk, of course).

cd /usr/src/sys/kern
patch < /tmp/mysleepqueue.patch
recompile and install kernel
reboot
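
For anyone unsure what "recompile and install kernel" expands to, the steps
above could be sketched roughly as follows. This is only an illustration:
KERNCONF=GENERIC, the buildkernel/installkernel targets run from /usr/src,
and the run/kernel_patch_steps helper names are assumptions, not something
stated in this mail. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only. Prints each step by default; set DRY_RUN=0 to really run them.
# Assumptions (not from the original mail): KERNCONF=GENERIC and the
# buildkernel/installkernel targets being run from /usr/src.
set -e

KERNCONF=${KERNCONF:-GENERIC}
PATCHFILE=${PATCHFILE:-/tmp/mysleepqueue.patch}

# Hypothetical helper: echo the command in dry-run mode, otherwise run it
# in the current shell so that "cd" takes effect for the following steps.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $1"
    else
        eval "$1"
    fi
}

kernel_patch_steps() {
    run "cd /usr/src/sys/kern"
    run "patch < $PATCHFILE"
    run "cd /usr/src"
    run "make buildkernel KERNCONF=$KERNCONF"
    run "make installkernel KERNCONF=$KERNCONF"
    run "shutdown -r now"
}

kernel_patch_steps
```

Run it once as-is to review the printed commands; with DRY_RUN=0 it will
patch, rebuild, install and reboot, so only do that on a box you are
prepared to reboot.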

Since I also found a fatal bug in IPv6 (panic on ping6), it might be
better for you to wait for 7.1; for us there is no way back now.

cheers
michael

Mike Tancsa wrote:
> At 05:07 AM 9/1/2008, Derek Kuliński wrote:
> 
>> Now I'm honestly a bit scared about it (even if it will be fixed
>> before 7.1, I'm not sure I'll hurry with the update).
> 
> There have been a number of commits to releng_7 that fixed dump issues 
> for me.  A box that used to regularly exhibit hung dump processes has 
> been working fine since April.  e.g. a kernel from
> 7.0-STABLE FreeBSD 7.0-STABLE #4: Wed Apr 30
> 
> does weekly level 0 dumps and daily differential dumps on the file 
> systems below without issue
> % df -i
> Filesystem    1K-blocks      Used     Avail Capacity   iused    ifree %iused  Mounted on
> /dev/twed0s1a   2026030    284346   1579602      15%    2937   279685     1%  /
> devfs                 1         1         0     100%       0        0   100%  /dev
> /dev/twed0s1d   5077038    575828   4095048      12%    1197   658257     0%  /tmp
> /dev/twed0s1e  20308398  11072840   7610888      59% 1065406  1572416    40%  /usr
> /dev/twed0s1f  20308398  13275050   5408678      71%   13750  2624072     1%  /var
> /dev/twed0s1g 246875258 186393906  40731332      82% 9118036 22794922    29%  /zoo
> 
> However, you should test and make sure it works for you.
> 
>         ---Mike


--------------080603060905030707020102
Content-Type: text/plain;
 name="mysleepqueue.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="mysleepqueue.patch"

--- subr_sleepqueue.c~	2008-09-01 05:14:28.000000000 +0200
+++ subr_sleepqueue.c	2008-09-01 05:14:28.000000000 +0200
@@ -177,7 +177,7 @@
 	for (i = 0; i < SC_TABLESIZE; i++) {
 		LIST_INIT(&sleepq_chains[i].sc_queues);
 		mtx_init(&sleepq_chains[i].sc_lock, "sleepq chain", NULL,
-		    MTX_SPIN);
+		    MTX_SPIN | MTX_RECURSE);
 #ifdef SLEEPQUEUE_PROFILING
 		snprintf(chain_name, sizeof(chain_name), "%d", i);
 		chain_oid = SYSCTL_ADD_NODE(NULL, 
@@ -403,12 +403,15 @@
 		mtx_unlock(&ps->ps_mtx);
 	}
 	/*
-	 * Lock sleepq chain before unlocking proc
-	 * without this, we could lose a race.
-	 */
+	 * Lock the per-process spinlock prior to dropping the PROC_LOCK
+	 * to avoid a signal delivery race.  PROC_LOCK, PROC_SLOCK, and
+	 * thread_lock() are currently held in tdsignal().
+ 	 */
+	PROC_SLOCK(p);
 	mtx_lock_spin(&sc->sc_lock);
 	PROC_UNLOCK(p);
 	thread_lock(td);
+	PROC_SUNLOCK(p);
 	if (ret == 0) {
 		if (!(td->td_flags & TDF_INTERRUPT)) {
 			sleepq_switch(wchan);

--------------080603060905030707020102--

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?48BC5E90.1060900>
