From: Dan Nelson
To: Kenneth Vestergaard Schmidt
Cc: freebsd-current@freebsd.org
Date: Tue, 4 Sep 2007 09:48:16 -0500
Subject: Re: Unkillable and runaway processes
Message-ID: <20070904144815.GB3547@dan.emsphone.com>
X-OS: FreeBSD 7.0-CURRENT

In the last episode (Sep 04), Kenneth Vestergaard Schmidt said:
> Our ZFS testbed is experiencing some weird problems with rsync. We
> run a nightly backup of about 1.6 TB data (that's how much is stored,
> not how much is transferred), but after the initial sync I haven't
> been able to get the machine through one full cycle.
>
> After many hours of rsyncing data from 50+ machines, suddenly one
> rsync-process will hang, spinning on the CPU.
>
> It switches state between CPU0, CPU1, RUN and 'zfs:(&', but doesn't
> really do anything. It can't be killed, and you can't reboot the
> machine - it'll get past syncing disks, but won't shutdown or reboot.

The ZFS wchan strings are way too long for ps or top to print, but if
the rsync is running from a tty somewhere, hit ^T and you'll get the
full wait string.

-- 
Dan Nelson
dnelson@allantgroup.com