From: Panagiotis Christias
Date: Wed, 12 Aug 2009 15:47:21 +0300
To: "Hearn, Trevor"
Cc: "freebsd-fs@freebsd.org", "Kenneth D. Merry"
Subject: Re: UFS Filesystem issues, and the loss of my hair...
Message-ID: <20090812124721.GA71441@noc.ntua.gr>
In-Reply-To: <8E9591D8BCB72D4C8DE0884D9A2932DC6D2EDF21@ITS-HCWNEM03.ds.Vanderbilt.edu>

On Mon, Aug 10, 2009 at 05:20:44PM -0500, Hearn, Trevor wrote:
> Yes, it does seem like it was part of one of the other messages. The isp(4)
> driver was just recently updated in HEAD by mjacob@ who has maintained that
> driver in the past. He may have some insight if there is an isp(4)-specific
> problem.
>
> --
> John Baldwin
>
> Heh. Ok, I just watched the same error message scroll across the screen
> for about 5 minutes now, with a different offset, same length. The fun
> part is that it is not touching the device, /dev/da1p7 at all. From the
> systat -vmstat display, I see all of the traffic coming from the
> /dev/mfid0 drives. It ran for a while, then stopped. So, no access to
> the drive in question, da1p7, but on the root drive, mfid0. Odd. The
> partition is mapped to the root drive. I wonder if the driver lost
> itself, and it tried to access the file on the empty folder on the root
> drive. Sigh. Anyone?

Hello,

we faced a similar problem here (a major Greek university) about a year
ago [1]. Our setup consists of Dell 2950 servers, QLogic 2462 HBAs (PCI-E)
and an EMC CLARiiON CX3-40. As soon as we tried to do a simple
"tar zxf ports.tgz" on a SAN volume, the system would freeze and/or panic
(same error messages as yours).

Oleg Sharoiko suggested that we decrease the number of tag openings
(the tag queue depth). Decreasing it made the system a bit more stable
but did not eliminate the problem.
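
For reference, here is a minimal sketch of how such a cap might be applied on FreeBSD. It assumes the camcontrol(8) "tags" subcommand with its -N option, and it borrows da1 from the quoted message purely as an example device name. The helper names are hypothetical and are not taken from this thread or from the isp(4) driver.

```python
import shlex
import subprocess
from typing import List


def camcontrol_tags_cmd(device: str, openings: int) -> List[str]:
    """Build the argv for `camcontrol tags <device> -N <openings>`.

    Assumption: camcontrol(8) is available and the device name (for
    example "da1") refers to a CAM-attached disk on the local system.
    """
    # Reject names that could be mistaken for options or split into words.
    if not device or device.startswith("-") or any(c.isspace() for c in device):
        raise ValueError(f"suspicious device name: {device!r}")
    if openings < 1:
        raise ValueError("the number of tag openings must be at least 1")
    return ["camcontrol", "tags", device, "-N", str(openings)]


def set_tag_openings(device: str, openings: int, dry_run: bool = True) -> str:
    """Return the command line; execute it only when dry_run is False."""
    cmd = camcontrol_tags_cmd(device, openings)
    if not dry_run:
        # Needs root privileges on a FreeBSD host; raises on a non-zero exit.
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)


if __name__ == "__main__":
    # Dry run: only print the command that would cap da1 at 32 openings.
    print(set_tag_openings("da1", 32))
```

With dry_run left at its default the script only prints the command, so it can be checked on any machine before being run as root on the affected host.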

Then, I contacted Matthew Jacob and tested his latest isp code [2] along
with alternative solutions like ZFS and gjournal. Matthew was kind enough
to offer his support, but eventually I ran out of time and patience, so I
moved a couple of servers to CentOS in order to put the storage into
production. That was around December last year.

About a month ago Kenneth Merry announced that a new version of isp was
available [3], which corrected bugs and added new functionality. I thought
it was worth trying, so I set up FreeBSD 7-stable on two Dell boxes, added
the isp patches, recompiled the kernel and started the stress tests.

I also looked around for more info and hints regarding QLogic HBAs. The
Linux driver (qla2xxx) has a default maximum queue depth of 32 (see
ql2xmaxqdepth), which is also the value recommended by EMC. There are
similar references for Solaris (see sd:sd_max_throttle). Some mention
even smaller values, depending on the storage.

Currently, I am running stress tests using fsx, ffsb, postmark, iozone,
bonnie++, blogbench and other home-made scripts (any other suggestions?)
on two 7-stable-amd64 + isp_diffs.releng7.20090629 boxes. So far, at 32
maximum tag openings, everything looks good: I have not seen any panics
and the subsequent fsck runs have been clean.

I will keep running more tests for a week or two, hoping that they will
help draw a conclusion.

Regards,
Panagiotis

PS: Cc'ed to Kenneth Merry, I think he would be interested.

[1] http://lists.freebsd.org/pipermail/freebsd-scsi/2008-October/003686.html
[2] http://feral.com/isp.html
[3] http://lists.freebsd.org/pipermail/freebsd-scsi/2009-June/003916.html

-- 
Panagiotis J. Christias    Network Management Center
P.Christias@noc.ntua.gr    National Technical Univ. of Athens, GREECE