Date: Tue, 12 Oct 2010 06:02:45 -0700
From: Jeremy Chadwick
To: Andriy Gapon
Cc: freebsd-fs@freebsd.org
Subject: Re: Locked up processes after upgrade to ZFS v15
Message-ID: <20101012130245.GA32584@icarus.home.lan>
In-Reply-To: <4CB4429C.9040109@icyb.net.ua>

On Tue, Oct 12, 2010 at 02:12:28PM +0300, Andriy Gapon
wrote:
> on 12/10/2010 13:07 Jeremy Chadwick said the following:
> > I've been trying to reproduce this problem on my testbed box without
> > much luck so far.  The box differs severely -- the biggest differences
> > being the testbed runs i386 (due to CPU), only has 1GB RAM, and is
> > single-core.  I don't have an amd64 testbed system on hand right now.
> >
> > I've been trying to reproduce it by enabling Sendfile and MMAP in Apache
> > on the system, putting up some very large files on an Apache-accessible
> > ZFS filesystem, and using something like "wget -r" to download
> > everything.  I've been watching "netstat -m" to monitor the number of
> > sendfile requests.
> >
> > There have been a couple cases where I've seen processes go into "zfs"
> > state, but I have yet to see any lock up.
> >
> > Is there something amd64-specific to the problem at hand, or maybe some
> > VM feature which isn't getting triggered on i386?  Or do you know of a
> > reliable way to reproduce the issue at this point?
>
> I don't have an easy way to reproduce it.
> The theory is that you should sendfile a file with size which is not multiple of
> page size (4K) and then you should mmap and read the same file; the last step
> should lock up.
> Perhaps, tools/regression/sockets/sendfile/sendfile.c with the following patch
> would reproduce it?
> http://people.freebsd.org/~avg/sendfile.diff

This patch only works on HEAD.  I downloaded the HEAD version of
sendfile.c from here:

http://www.freebsd.org/cgi/cvsweb.cgi/src/tools/regression/sockets/sendfile/sendfile.c?rev=1.7;content-type=text%2Fplain

And the HEAD Makefile as well (since libmd linking is needed):

http://www.freebsd.org/cgi/cvsweb.cgi/src/tools/regression/sockets/sendfile/Makefile?rev=1.6;content-type=text%2Fplain

And then applied your patch.  However, the result doesn't induce a
lock-up.  Bummer.

testbox# ./sendfile
1..11
ok 1
ok 2
ok 3
ok 4
ok 5
ok 6
ok 7
ok 8
ok 9
ok 10
ok 11
mmap test
testbox#

Other stuff I tried:

- Verified getpagesize() returns 4096 (PAE isn't enabled on this box;
  I mention it because I'm assuming PAE results in 2MByte pages)
- Enabling the #if 0'd code
- Adjusting TEST_EXTRA a bit (200, 1000, and 3819; just numbers I
  pulled out of thin air)

Alternately, I can try building an amd64 testbed box, but it'll be a
virtual machine under VMware, which I try to avoid using as a testbed
for low-level changes (VM, etc.).

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
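
For reference, the reproduction theory quoted above (sendfile a file
whose size is not a multiple of the page size, then mmap and read the
same file) can be sketched in a few lines.  This is a minimal Python
sketch and not the patched sendfile.c; the function names, the EXTRA
value, and the loopback TCP socket pair are assumptions made for
illustration.  It only exercises the sequence of calls; it is not known
to trigger the lock-up.

```python
import mmap
import os
import socket
import tempfile
import threading

PAGE_SIZE = mmap.PAGESIZE
EXTRA = 200  # hypothetical tail length; any value that is not a multiple of PAGE_SIZE


def _drain(sock, size, out):
    """Receive until `size` bytes have arrived or the peer closes."""
    while len(out) < size:
        chunk = sock.recv(65536)
        if not chunk:
            break
        out.extend(chunk)


def sendfile_then_mmap(path):
    """sendfile() a non-empty file over loopback TCP, then mmap and read it.

    Returns (bytes received over the socket, bytes read via mmap).
    """
    size = os.path.getsize(path)
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    received = bytearray()
    # Drain in a separate thread so a full socket buffer cannot stall sendfile.
    reader = threading.Thread(target=_drain, args=(client, size, received),
                              daemon=True)
    reader.start()
    try:
        with open(path, "rb") as f:
            sent = 0
            while sent < size:
                n = os.sendfile(server.fileno(), f.fileno(), sent, size - sent)
                if n == 0:  # unexpected end of file
                    break
                sent += n
            server.shutdown(socket.SHUT_WR)
            reader.join()
            # Step two of the theory: map the same file and read every byte.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                mapped = m[:]
    finally:
        for s in (server, client, listener):
            s.close()
    return bytes(received), mapped


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(PAGE_SIZE + EXTRA))
    try:
        got, mapped = sendfile_then_mmap(tmp.name)
        print("sendfile bytes:", len(got), "mmap bytes:", len(mapped))
    finally:
        os.unlink(tmp.name)
```

To test against ZFS, the temporary file would have to be created on a
ZFS filesystem (for example by passing dir= to NamedTemporaryFile).  If
the theory holds on an affected system, the mmap read is the step
expected to hang; elsewhere the script simply prints the two byte
counts.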