From owner-freebsd-stable@FreeBSD.ORG Mon Sep 1 21:28:56 2008
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 94B58106566C
	for ; Mon, 1 Sep 2008 21:28:56 +0000 (UTC)
	(envelope-from freebsdports@bindone.de)
Received: from mail.bindone.de (mail.bindone.de [80.190.134.51])
	by mx1.freebsd.org (Postfix) with SMTP id 082148FC22
	for ; Mon, 1 Sep 2008 21:28:55 +0000 (UTC)
	(envelope-from freebsdports@bindone.de)
Received: (qmail 53223 invoked by uid 89); 1 Sep 2008 21:28:53 -0000
Received: from unknown (HELO bombat.bindone.de) (mg@bindone.de@84.151.246.143)
	by mail.bindone.de with ESMTPA; 1 Sep 2008 21:28:53 -0000
Message-ID: <48BC5E90.1060900@bindone.de>
Date: Mon, 01 Sep 2008 23:28:48 +0200
From: Michael
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.8.1.16)
	Gecko/20080818 SeaMonkey/1.1.11
MIME-Version: 1.0
To: Mike Tancsa
References: <48BB4FA6.2090708@bindone.de>
	<1188558750.20080901020711@takeda.tk>
	<200809011336.m81Da5BT046532@lava.sentex.ca>
In-Reply-To: <200809011336.m81Da5BT046532@lava.sentex.ca>
Content-Type: multipart/mixed;
	boundary="------------080603060905030707020102"
Cc: =?windows-1252?Q?Derek_Kulin=27ski?= , freebsd-stable@freebsd.org
Subject: Re: bin/121684: : dump(8) frequently hangs
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 01 Sep 2008 21:28:56 -0000

This is a multi-part message in MIME format.
--------------080603060905030707020102
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit

Sorry to scare you, I was a little unhappy about dump hanging.
Based on the CVS repository, I wrote a little patch that combines two
changes made by Scott and Jeff and works against 7.0-RELEASE (looking at
the commit logs scared me away from trying STABLE on production right
now). After applying this patch I could run dumps successfully on seven
machines where dump had hung on every single attempt before (use at your
own risk, of course).

cd /usr/src/sys/kern
patch < /tmp/mysleepqueue.patch
recompile and install kernel
reboot

Since I also found a fatal bug in IPv6 (panic on ping6), it might be
better for you to wait for 7.1; for us there is no way back now.

cheers
michael

Mike Tancsa wrote:
> At 05:07 AM 9/1/2008, Derek Kuliński wrote:
>
>> Now I'm honestly a bit scared about it (even if it will be fixed
>> before 7.1, I'm not sure I'll hurry with the update).
>
> There have been a number of commits to releng_7 that fixed dump issues
> for me. A box that used to regularly exhibit hung dump processes have
> been working fine since April. e.g. a kernel from
> 7.0-STABLE FreeBSD 7.0-STABLE #4: Wed Apr 30
>
> does weekly level 0 dumps and daily differential dumps on the file
> systems below without issue
> % df -i
> Filesystem     1K-blocks      Used    Avail Capacity   iused    ifree %iused  Mounted on
> /dev/twed0s1a    2026030    284346  1579602    15%      2937   279685    1%   /
> devfs                  1         1        0   100%         0        0  100%   /dev
> /dev/twed0s1d    5077038    575828  4095048    12%      1197   658257    0%   /tmp
> /dev/twed0s1e   20308398  11072840  7610888    59%   1065406  1572416   40%   /usr
> /dev/twed0s1f   20308398  13275050  5408678    71%     13750  2624072    1%   /var
> /dev/twed0s1g  246875258 186393906 40731332    82%   9118036 22794922   29%   /zoo
>
> However, you should test and make sure it works for you.
>
>         ---Mike
>
>         ---Mike

--------------080603060905030707020102
Content-Type: text/plain;
 name="mysleepqueue.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="mysleepqueue.patch"

--- subr_sleepqueue.c~	2008-09-01 05:14:28.000000000 +0200
+++ subr_sleepqueue.c	2008-09-01 05:14:28.000000000 +0200
@@ -177,7 +177,7 @@
 	for (i = 0; i < SC_TABLESIZE; i++) {
 		LIST_INIT(&sleepq_chains[i].sc_queues);
 		mtx_init(&sleepq_chains[i].sc_lock, "sleepq chain", NULL,
-		    MTX_SPIN);
+		    MTX_SPIN | MTX_RECURSE);
 #ifdef SLEEPQUEUE_PROFILING
 		snprintf(chain_name, sizeof(chain_name), "%d", i);
 		chain_oid = SYSCTL_ADD_NODE(NULL,
@@ -403,12 +403,15 @@
 		mtx_unlock(&ps->ps_mtx);
 	}
 	/*
-	 * Lock sleepq chain before unlocking proc
-	 * without this, we could lose a race.
-	 */
+	 * Lock the per-process spinlock prior to dropping the PROC_LOCK
+	 * to avoid a signal delivery race. PROC_LOCK, PROC_SLOCK, and
+	 * thread_lock() are currently held in tdsignal().
+	 */
+	PROC_SLOCK(p);
 	mtx_lock_spin(&sc->sc_lock);
 	PROC_UNLOCK(p);
 	thread_lock(td);
+	PROC_SUNLOCK(p);
 	if (ret == 0) {
 		if (!(td->td_flags & TDF_INTERRUPT)) {
 			sleepq_switch(wchan);

--------------080603060905030707020102--