From owner-freebsd-current@FreeBSD.ORG Sat Sep 15 08:31:49 2007
From: Kenneth Vestergaard Schmidt
To: Pawel Jakub Dawidek
Cc: freebsd-current@freebsd.org
Date: Sat, 15 Sep 2007 10:31:32 +0200
Subject: Re: Unkillable and runaway processes
References: <20070905141759.GJ12013@garage.freebsd.pl> <20070905171741.GA15709@garage.freebsd.pl> <20070913083635.GB1155@garage.freebsd.pl>
In-Reply-To: <20070913083635.GB1155@garage.freebsd.pl> (Pawel Jakub Dawidek's message of "Thu, 13 Sep 2007 10:36:35 +0200")
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (darwin)
List-Id: Discussions about the use of FreeBSD-current

Pawel Jakub Dawidek writes:

>> >> The full state of the process hanging is 'zfs:(&tx->tx_quiesce_done_cv)'
>> >> - it cycles between RUN, CPUx and this one.
>> >
>> > Hmm, this means it didn't deadlock...
>>
>> Here's some debugging output. PIDs 14870, 14933 and 12458 are just
>> spinning and being useless.
>> Funnily, all three have rename() in their
>> backtrace.
>>
>> db> ps
> [...]
>>   360     0     0     0  SL      zfs:(&tq 0xd151877c [zil_clean]
>>   359     0     0     0  SL      zfs:(&tq 0xd1518848 [zil_clean]
> [...]
>
> Are you sure you disabled ZIL?

Not in that output, but after that I did, and the problem remained. I'll
try to get a new dump, but right now I'm fighting 'kmem_map too small'
panics again :(

panic: kmem_malloc(16384): kmem_map too small: 628477952 total allocated
panic(c0772a72,4000,2575d000,fbfee530,c070a4b0,...) at panic+0x124
kmem_malloc(c107108c,4000,2,fbfee580,c06bf020,...) at kmem_malloc+0x22b
page_alloc(0,4000,fbfee573,2,2000000,...) at page_alloc+0x27
uma_large_malloc(4000,2,c7445200,c7591460,cc6a7d80,...) at uma_large_malloc+0x50
malloc(4000,c758e6a0,2,fbfee5c4,c7569c29,...) at malloc+0x88
zfs_kmem_alloc(4000,2,fbfee60c,c751d9b8,4000,...) at zfs_kmem_alloc+0x20
zio_buf_alloc(4000,2,2,10,c8576ce4,...) at zio_buf_alloc+0x19
arc_get_data_buf(c73e4e80,2,5,0,0,...) at arc_get_data_buf+0x5a8
arc_buf_alloc(c73cd800,4000,d8e442bc,2,479,...) at arc_buf_alloc+0xb0
arc_read(cb81ecf0,c73cd800,d3f8b080,c7561480,c7521be0,...) at arc_read+0x125
dbuf_read(d8e442bc,0,2,c7585d6f,fbfee704,...) at dbuf_read+0x4b7
dmu_buf_hold(d0dd3a90,4f6a,0,1004000,0,...) at dmu_buf_hold+0xea
zap_idx_to_blk(fbfee770,6e1,c8f29638,0,5d875fa0,...) at zap_idx_to_blk+0xcc
zap_deref_leaf(0,1,fbfee7c0,0,c758ba5f,...) at zap_deref_leaf+0x5a
fzap_lookup(c8f29600,fbfee8bc,8,0,1,...) at fzap_lookup+0x78
zap_lookup(d0dd3a90,4f6a,0,fbfee8bc,8,...) at zap_lookup+0x81
zfs_dirent_lock(fbfee874,d0274bfc,fbfee8bc,fbfee870,6,...) at zfs_dirent_lock+0x30a
zfs_dirlook(d0274bfc,fbfee8bc,fbfeeb5c,c05740f1,0,...) at zfs_dirlook+0x5e

/Kenneth
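
Reader's note on the trace above: the hexadecimal arguments in the panic() frame appear to line up with the decimal figures in the panic message. The sketch below only does that arithmetic. Reading the second and third arguments as "requested size" and "total allocated" is an inference from the matching values, not something the message states, and the helper name hex_to_int is made up for illustration.

```python
# Hex arguments copied verbatim from the panic() frame in the trace.
# Their meaning is inferred from the matching decimal values in the
# panic message; the trace itself does not label them.
request_hex = "4000"      # second argument to panic()
total_hex = "2575d000"    # third argument to panic()


def hex_to_int(value: str) -> int:
    """Convert a DDB-style hex argument (no 0x prefix) to an integer."""
    return int(value, 16)


request_bytes = hex_to_int(request_hex)
total_bytes = hex_to_int(total_hex)
total_mib = total_bytes / (1024 * 1024)

print(request_bytes)        # 16384, as in "kmem_malloc(16384)"
print(total_bytes)          # 628477952, as in "628477952 total allocated"
print(round(total_mib, 2))  # 599.36
```

So the failing request was a 16 KiB buffer, made when roughly 599 MiB had already been allocated from kmem_map.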