From owner-freebsd-fs@FreeBSD.ORG Thu Dec 8 20:24:22 2011
Message-ID: <4EE118C7.8030803@internet2.edu>
Date: Thu, 08 Dec 2011 15:06:31 -0500
From: Dan Pritts
To: freebsd-fs@freebsd.org
Subject: ZFS hangs with 8.2-release

Hi all,

I've got a data archive on several ZFS filesystems on a single 8.2-release system. When scrubbing, or sometimes even when copying data between ZFS filesystems, the system frequently hangs.

System info:

Sun x4200, 16GB RAM.
FreeBSD 8.2-RELEASE #0: Thu Feb 17 02:41:51 UTC 2011
    root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
CPU: Dual Core AMD Opteron(tm) Processor 285 SE (2592.62-MHz K8-class CPU)
real memory  = 17179869184 (16384 MB)
avail memory = 16439779328 (15678 MB)

Internal LSI mpt-driver hardware RAID for boot.
3x LSI parallel-SCSI cards for primary storage.
48 SATA disks attached, using Infortrend RAIDs as JBODs.
5 9-disk RAIDz2 zpools, each independent of the others. 3 of them now have 2TB disks; the others still have 500GB disks.

The pools were originally created under Solaris/amd64. The system ran on that for several years with no apparent issues, including doing weekly scrubs. I switched the system to FreeBSD when my contract came up for renewal at the new Oracle rates.

I've posted some system information (zpool status output, loader.conf, some screenshots from system hangs) at:

http://people.internet2.edu/~danno/zfs/

I've adjusted some values in loader.conf based on what I found in the ZFS tuning wiki (see site above). With the defaults, a single zpool scrub of one of the pools would reliably crash the system within a couple of minutes. Now it's less crashy; it will stay up for many hours, but it still hangs every 6-24 hours when the filesystems are being actively used (a copy from one to another, or a single scrub).

So:

1) Any suggestions for fixing this with the current system?

2) Is FreeBSD 9 expected to improve the Solaris compatibility layer (which I assume is what's crashing)?

I have been unable to obtain crash dumps, apparently due to a bug in the mpt driver, or my hardware, or something. I've filed a bug in the tracker about that.

thanks
danno
-- 
Dan Pritts, Sr. Systems Engineer
Internet2
office: +1-734-352-4953 | mobile: +1-734-834-7224