Date: Fri, 17 May 2013 10:31:01 -0700
From: Jeremy Chadwick
To: dennis berger
Subject: Re: still mbuf leak in 9.0 / 9.1?
Message-ID: <20130517173101.GB87223@icarus.home.lan>
In-Reply-To: <1186B7CE-EC84-42F6-8904-EDD0C4A5FFBD@bsdsystems.de>
Cc: FreeBSD stable, Steven Hartland
List-Id: Production branch of FreeBSD source code

On Fri, May 17, 2013 at 11:37:23AM +0200, dennis berger wrote:
> Hi List,
> I can confirm that it is the bug you mentioned, Steven.
> Here is how I found it.
>
> I recorded hourly zfskern and nfsd stats, like this:
>
> echo "PROCSTAT" >> $reportname
> pgrep -S "(zfskern|nfsd)" | xargs procstat -kk >> $reportname
>
> Luckily it crashed last night and logged this:
>
> 1910 101508 nfsd nfsd: service mi_switch+0x186 sleepq_wait+0x42 _sleep+0x376 arc_lowmem+0x77 kmem_malloc+0xc1 uma_large_malloc+0x4a malloc+0xd9 arc_get_data_buf+0xb5 arc_read_nolock+0x1ec arc_read+0x93 dbuf_prefetch+0x12c dmu_zfetch_dofetch+0x10b dmu_zfetch+0xaf8 dbuf_read+0x4a7 dmu_buf_hold_array_by_dnode+0x16b dmu_buf_hold_array+0x67 dmu_read_uio+0x3f zfs_freebsd_read+0x3e3
>
> Maybe it would be good to merge this fix into RELENG_9_1 and
> distribute a fix via freebsd-update. What do you think?
>
> best,
> -dennis
>
>
> On 16.05.2013 at 11:42, dennis berger wrote:
>
> > This is indeed a ZFS+NFS system and I can see that istgt and nfs are
> > stuck in some ZIO state. Maybe it's this.
> > Thanks for pointing it out.
> >
> > Is it this ZFS+NFS deadlock?
> >
> > --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
> > +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
> > @@ -3720,8 +3720,16 @@ arc_lowmem(void *arg __unused, int howto __unused)
> >  	mutex_enter(&arc_reclaim_thr_lock);
> >  	needfree = 1;
> >  	cv_signal(&arc_reclaim_thr_cv);
> > -	while (needfree)
> > -		msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
> > +
> > +	/*
> > +	 * It is unsafe to block here in arbitrary threads, because we can come
> > +	 * here from ARC itself and may hold ARC locks and thus risk a deadlock
> > +	 * with ARC reclaim thread.
> > +	 */
> > +	if (curproc == pageproc) {
> > +		while (needfree)
> > +			msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
> > +	}
> >  	mutex_exit(&arc_reclaim_thr_lock);
> >  	mutex_exit(&arc_lowmem_lock);
> >  }
> >
> > I'll try to crash our test system. I assume that heavily stressing
> > NFS backed by ZFS might trigger this bug?
> >
> > -dennis
> >
> >
> > On 16.05.2013 at 00:03, Steven Hartland wrote:
> >
> >> ----- Original Message ----- From: "dennis berger"
> >>> FreeBSD 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec 4 09:23:10 UTC 2012
> >>>
> >>>> 3. Regarding this:
> >>>>>> A clean shutdown isn't possible though. It hangs after vnode
> >>>>>> cleaning, normally you would see detaching of usb devices here, or
> >>>>>> other devices maybe?
> >>>> Please don't conflate this with your above issue. This is almost
> >>>> certainly unrelated. Please start a new thread about that if desired.
> >>>
> >>> Maybe this is a misunderstanding. Normally this system will shut down cleanly, of course.
> >>> This hang only appears after the network problem above.
> >>
> >> If this is a ZFS system, it's a known issue which is fixed in current,
> >> stable-9, stable-8 and the upcoming 8.4 release.
> >>
> >> If not and you have USB devices see if the following sysctl helps:
> >> hw.usb.no_shutdown_wait=1

I'm sorry to say that a merge into RELENG_9_1 won't happen. The only
updates that the -RELEASE branches get are for security. If you want
fixes for other things, you need to follow/run a stable branch (i.e.
stable/9); otherwise you will need to wait until 9.2-RELEASE comes out.

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |