From owner-freebsd-fs@FreeBSD.ORG Sun Oct 21 00:11:03 2012
From: Dennis Glatting
To: freebsd-fs@FreeBSD.org
Date: Sat, 20 Oct 2012 17:10:57 -0700
Subject: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2))
Message-ID: <1350778257.86715.106.camel@btw.pki2.com>
In-Reply-To: <508322EC.4080700@FreeBSD.org>

I chose the LSI 2008 chip set because the code was donated by LSI, which demonstrated an interest in supporting their products under FreeBSD, and because that chip set is found in a lot of places, notably Supermicro boards. Additionally, there were stories of success on the lists for several boards. That said, I have received private email from others expressing frustration with ZFS and the "hang" problems, which I believe also involve the LSI chips.

I have two questions for the broader list:

1) What HBAs are you using for ZFS, and what is your level
   of success/stability? Also, what is your load?

2) How well are the LSI chip sets supported under FreeBSD?

I'm not all that crazy about the idea of replacing eight to ten LSI boards, but I am even less crazy about this problem continuing. I do have three Areca 1880i boards (not used for ZFS, rather dumb RAID); however, two of them failed within the first year for no apparent reason -- I just walked into my lab one day and was greeted by hosed systems. Consequently, I'm not all that keen on Areca.

I am in the process of releasing a spec for purchase of a Supermicro in a SC848 chassis, and an identical one next FY. I would appreciate a clue as to what HBA would be a good/better choice. The vendor spec'd Supermicro boards with the LSI2008 chips. I own four or six of the AOC-USAS2-L8e boards, two of which I destroyed, but there wasn't much difference between them and the 9211.
From owner-freebsd-fs@FreeBSD.ORG Sun Oct 21 02:20:32 2012
From: "Steven Hartland"
To: Dennis Glatting, freebsd-fs@FreeBSD.org
Date: Sun, 21 Oct 2012 03:20:19 +0100
Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2))
Message-ID: <897B9997FC6547C9935A8FC3801AB832@multiplay.co.uk>

----- Original Message -----
From: "Dennis Glatting"
Sent: Sunday, October 21, 2012 1:10 AM
Subject: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2))

> I have two questions for the broader list:
>
> 1) What HBAs are you using for ZFS, and what is your level
>    of success/stability? Also, what is your load?
>
> 2) How well are the LSI chip sets supported under FreeBSD?
> [...]
We have a number of machines using mps, although under 8.2, which is quite an old revision of the driver; 8.3+ had quite a few changes committed by LSI, IIRC. With those machines we've not had a single problem under ZFS, but I'm not sure how highly loaded the client has them.

To give you an idea of the specs:

One of said boxes is a backups box with a 32TB array (2 x 16TB raidz2) and a mirrored OS plus dual SSD cache disks across 3x LSI 2008's on firmware 11.00.00.00.

Another is a DB box with a 10TB array (5 x 2TB mirror) with dual SSD cache and mirrored SSD logs, again on 11.00.00.00 firmware.

They are all Supermicro machines with retail LSI controllers (not Supermicro-branded, but that shouldn't make much difference).

Not totally relevant, but worth mentioning: we're currently having some serious issues with other Supermicro kit when trying to run SATA 3 speeds off the onboard Intel Patsburg controller via the hotswap backplane. When we do so, the disks randomly stall waiting for a timeout or drop completely. We have determined the issue is the link between the MB and the hotswap backplane, as the problem goes away when the disks are connected directly using SATA cables. We're currently waiting for LSI 2008 backplane replacements, which we hope will solve the problem.

    Regards
    Steve
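As an aside on the firmware numbers quoted above: on FreeBSD the mps(4) driver reports the controller firmware revision when it attaches, so (assuming the attach messages are still in the kernel message buffer, and with the grep patterns here purely illustrative) a quick check looks something like this:

    # print the attach-time messages for each LSI SAS2008 controller,
    # which include the firmware revision the card is running
    dmesg | grep -i '^mps' | grep -i firmware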
From owner-freebsd-fs@FreeBSD.ORG Sun Oct 21 06:52:09 2012
From: Freddie Cash
To: Dennis Glatting
Cc: freebsd-fs@freebsd.org
Date: Sat, 20 Oct 2012 23:52:06 -0700
Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2))

On Oct 20, 2012 5:11 PM, "Dennis Glatting" wrote:
>
> I have two questions for the broader list:
>
> 1) What HBAs are you using for ZFS, and what is your level
>    of success/stability? Also, what is your load?

SuperMicro AOC-USAS-8i using the mpt(4) driver on FreeBSD 9-STABLE in one server (alpha).

SuperMicro AOC-USAS2-8i using the mps(4) driver on FreeBSD 9-STABLE in 2 servers (beta and omega). I think they were updated on Oct 10ish.

The alpha box runs 12 parallel rsync processes to back up 50-odd Linux servers across multiple data centres.

The beta box runs 12 parallel rsync processes to back up 100-odd Linux and FreeBSD servers across 50-odd buildings.

Both boxes use zfs send to replicate the data to omega (each box saturates a 1 Gbps link during the zfs send).
Alpha and omega have 24 SATA 3 Gbps hard drives, configured as 3x 8-drive raidz2 vdevs, with a 32 GB SSD split between OS, log vdev, and cache vdev.

Beta has 16 SATA 6 Gbps hard drives, configured into 3x 5-drive raidz2 vdevs with a cold spare, and a 32 GB SSD split between OS, log vdev, and cache vdev.

All three have been patched to support feature flags. All three have dedupe enabled, compression enabled, and HPN SSH patches with the NONE cipher enabled.

All three run without any serious issues. The only issues we've had are 3, maybe 4, situations where I've tried to destroy multi-TB filesystems without enough RAM in the machine. We're now running a minimum of 32 GB of RAM, with 64 GB in one box.

> 2) How well are the LSI chip sets supported under FreeBSD?

I have no complaints. And we're ordering a bunch of LSI 9200-series controllers for new servers (PCI brackets instead of UIO).

From owner-freebsd-fs@FreeBSD.ORG Sun Oct 21 15:41:45 2012
From: Dennis Glatting
To: Andriy Gapon
Cc: freebsd-fs@freebsd.org
Date: Sun, 21 Oct 2012 08:41:36 -0700
Subject: Re: ZFS hang status update, update
Message-ID: <1350834096.86715.116.camel@btw.pki2.com>

System #2 is still running this morning.

System #1: With the zpool disk-1 cache removed, the pools:

mc# zpool status
  pool: disk-1
 state: ONLINE
  scan: scrub repaired 0 in 0h38m with 0 errors on Tue Oct 16 16:47:51 2012
config:

        NAME        STATE     READ WRITE CKSUM
        disk-1      ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            da5     ONLINE       0     0     0
            da6     ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da4     ONLINE       0     0     0

errors: No known data errors

  pool: disk-2
 state: ONLINE
  scan: scrub repaired 0 in 0h6m with 0 errors on Tue Oct 16 17:05:43 2012
config:

        NAME        STATE     READ WRITE CKSUM
        disk-2      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da10    ONLINE       0     0     0

errors: No known data errors

The procstat output is found here: http://www.pki2.com/mc.procstat.21Oct2012.txt

The camcontrol output:

mc# /mnt/camcontrol tags da0 -v
(hung output)
^C

da0 is the SSD, but it isn't attached to anything. It would appear this device lost its mind, yes?
mc# /mnt/camcontrol tags da1 -v
(pass1:mps0:0:5:0): dev_openings  255
(pass1:mps0:0:5:0): dev_active    0
(pass1:mps0:0:5:0): devq_openings 255
(pass1:mps0:0:5:0): devq_queued   0
(pass1:mps0:0:5:0): held          0
(pass1:mps0:0:5:0): mintags       2
(pass1:mps0:0:5:0): maxtags       255
mc# /mnt/camcontrol tags da2 -v
(pass2:mps0:0:6:0): dev_openings  253
(pass2:mps0:0:6:0): dev_active    2
(pass2:mps0:0:6:0): devq_openings 253
(pass2:mps0:0:6:0): devq_queued   0
(pass2:mps0:0:6:0): held          0
(pass2:mps0:0:6:0): mintags       2
(pass2:mps0:0:6:0): maxtags       255
mc# /mnt/camcontrol tags da3 -v
(pass3:mps0:0:7:0): dev_openings  253
(pass3:mps0:0:7:0): dev_active    2
(pass3:mps0:0:7:0): devq_openings 253
(pass3:mps0:0:7:0): devq_queued   0
(pass3:mps0:0:7:0): held          0
(pass3:mps0:0:7:0): mintags       2
(pass3:mps0:0:7:0): maxtags       255
mc# /mnt/camcontrol tags da4 -v
(pass4:mps0:0:8:0): dev_openings  253
(pass4:mps0:0:8:0): dev_active    2
(pass4:mps0:0:8:0): devq_openings 253
(pass4:mps0:0:8:0): devq_queued   0
(pass4:mps0:0:8:0): held          0
(pass4:mps0:0:8:0): mintags       2
(pass4:mps0:0:8:0): maxtags       255
mc# /mnt/camcontrol tags da5 -v
(pass5:mps0:0:9:0): dev_openings  252
(pass5:mps0:0:9:0): dev_active    3
(pass5:mps0:0:9:0): devq_openings 252
(pass5:mps0:0:9:0): devq_queued   0
(pass5:mps0:0:9:0): held          0
(pass5:mps0:0:9:0): mintags       2
(pass5:mps0:0:9:0): maxtags       255
mc# /mnt/camcontrol tags da6 -v
(pass6:mps0:0:10:0): dev_openings  251
(pass6:mps0:0:10:0): dev_active    4
(pass6:mps0:0:10:0): devq_openings 251
(pass6:mps0:0:10:0): devq_queued   0
(pass6:mps0:0:10:0): held          0
(pass6:mps0:0:10:0): mintags       2
(pass6:mps0:0:10:0): maxtags       255
mc# /mnt/camcontrol tags da7 -v
(pass7:mps0:0:11:0): dev_openings  253
(pass7:mps0:0:11:0): dev_active    2
(pass7:mps0:0:11:0): devq_openings 253
(pass7:mps0:0:11:0): devq_queued   0
(pass7:mps0:0:11:0): held          0
(pass7:mps0:0:11:0): mintags       2
(pass7:mps0:0:11:0): maxtags       255
mc# /mnt/camcontrol tags da8 -v
(pass8:mps1:0:0:0): dev_openings  216
(pass8:mps1:0:0:0): dev_active    39
(pass8:mps1:0:0:0): devq_openings 216
(pass8:mps1:0:0:0): devq_queued   0
(pass8:mps1:0:0:0): held          0
(pass8:mps1:0:0:0): mintags       2
(pass8:mps1:0:0:0): maxtags       255
mc# /mnt/camcontrol tags da9 -v
(pass9:mps1:0:9:0): dev_openings  255
(pass9:mps1:0:9:0): dev_active    0
(pass9:mps1:0:9:0): devq_openings 255
(pass9:mps1:0:9:0): devq_queued   0
(pass9:mps1:0:9:0): held          0
(pass9:mps1:0:9:0): mintags       2
(pass9:mps1:0:9:0): maxtags       255
mc# /mnt/camcontrol tags da10 -v
(pass10:mps1:0:11:0): dev_openings  255
(pass10:mps1:0:11:0): dev_active    0
(pass10:mps1:0:11:0): devq_openings 255
(pass10:mps1:0:11:0): devq_queued   0
(pass10:mps1:0:11:0): held          0
(pass10:mps1:0:11:0): mintags       2
(pass10:mps1:0:11:0): maxtags       255
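For reference, the per-device queue statistics above can be collected in one pass rather than one device at a time. A small sh loop along these lines does the job (the device list is simply the one shown above, and plain camcontrol is assumed to be in the PATH rather than the /mnt copy used in the transcript):

    # dump CAM tag/queue statistics for every da device in one go
    for d in da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10; do
        echo "== ${d} =="
        camcontrol tags "${d}" -v
    done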
From owner-freebsd-fs@FreeBSD.ORG Sun Oct 21 15:54:14 2012
From: Dennis Glatting
To: Freddie Cash
Cc: freebsd-fs@freebsd.org
Date: Sun, 21 Oct 2012 08:54:08 -0700
Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2))
Message-ID: <1350834848.88577.33.camel@btw.pki2.com>

On Sat, 2012-10-20 at 23:52 -0700, Freddie Cash wrote:
> All three run without any serious issues. The only issues we've had are 3,
> maybe 4, situations where I've tried to destroy multi-TB filesystems
> without enough RAM in the machine. We're now running a minimum of 32 GB of
> RAM, with 64 GB in one box.
> [...]
> I have no complaints. And we're ordering a bunch of LSI 9200-series
> controllers for new servers (PCI brackets instead of UIO).

Perhaps I am doing something fundamentally wrong with my SSDs. Currently I simply add them to a pool after being ashift-aligned via gnop (e.g., -S 4096, depending on page size).

I remember reading somewhere about offsets to ensure data is page aligned but, IIRC, this was strictly a performance issue. Are you doing something different?
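For readers unfamiliar with the gnop recipe mentioned above, it is usually done along these lines; the pool and device names here are illustrative, not taken from the thread. A temporary nop provider that advertises 4K sectors makes ZFS choose ashift=12 for the vdev, and the .nop device can be discarded afterwards because ZFS finds the underlying disk by its label:

    # create a temporary provider that reports 4K sectors
    gnop create -S 4096 da0
    # add the SSD to the pool via the .nop device so ashift=12 is chosen
    zpool add tank cache da0.nop
    # the gnop layer is only needed while the vdev is created
    zpool export tank
    gnop destroy da0.nop
    zpool import tank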
From owner-freebsd-fs@FreeBSD.ORG Sun Oct 21 22:06:37 2012
From: Jürgen Weber
To: freebsd-fs@freebsd.org
Date: Mon, 22 Oct 2012 09:06:24 +1100
Subject: Re: mfi0 timeout error zfs boot mount problem
Message-ID: <508471E0.9010805@theiconic.com.au>
In-Reply-To: <50830EA3.6020001@theiconic.com.au>

This is still a problem for me -- is anyone there? :)

I have tried the following at the boot-time loader:

vfs.zfs.zil_disable="1"
vfs.zfs.prefetch_disable="1"
vfs.zfs.txg.timeout="5"

Any other suggestions on how to get this zpool to import and mount again?

Thanks

On 21/10/12 07:50, Jurgen Weber wrote:
> Hi
>
> Lastly, is there a way at boot time, some sysctls or something I can
> set to bring ZFS to a minimalistic state? Turn off features, etc., to
> get this to mount?
>
> Any ideas appreciated.
>
> Thanks
>
> Jurgen
>
> On 20/10/2012 9:02 AM, Jurgen Weber wrote:
>> Guys
>>
>> Some more details on this, some insight would be greatly appreciated.
>>
>> As my day wore on trying to get this zpool to import or mount, I have
>> learnt a few things. I think over time this issue has come about as
>> more and more data was added to the file systems.
>>
>> Some further details:
>>
>> It's an 8-disk raidz pool that the system boots from as well. The disks
>> are all 2TB.
>> The server has 16GB of RAM. I noticed the day before this happened that
>> the server was struggling with its RAM, grinding to a halt and dumping
>> its RAM.
>> The issue is not hardware, because I found another server (same one),
>> swapped the hard drives out, took another 8GB of RAM, and I have the
>> same problem.
>> The main data file systems have dedup and gzip compression on.
>>
>> I have booted from open/Oracle Solaris 11 and attempted to import, and
>> the Solaris live CD will not import either. In the Solaris system the
>> disks detach from the system.
>>
>> I get the feeling that ZFS is hitting some limit when attempting
>> to mount and it's not finishing the job.
>>
>> Thanks
>>
>> Jurgen
>>
>> On 19/10/2012 10:29 AM, Jürgen Weber wrote:
>>> Team
>>>
>>> I have googled around for a solution and I see a lot of posts about
>>> firmware versions and patches for FreeBSD 8.*.
>>>
>>> I have a FreeBSD 9.1-RC1 system, which was beta1 originally and has
>>> been running for months.
>>>
>>> Now it will not boot; I get the following:
>>>
>>> "Trying to mount root from zfs:tank/root []....."
>>> mfi0: COMMAND 0xffffff8000cb83530 TIMEOUT AFTER xxx SECONDS
>>> (this just repeats).
>>>
>>> I have not seen this error before during normal runtime, _only_
>>> during boot.
>>>
>>> Originally when I had the problem I could boot off a USB stick
>>> (9.1 beta1 or rc1), run a 'zpool import -f tank', and it would work on
>>> the live CD. Rebooting, and the main system would work.
>>>
>>> This time this workaround does not work for me. When I am on the
>>> USB stick I can run a 'zpool import' and all of the disks are
>>> recognised, the pool is recognised, and the file system is healthy.
>>>
>>> The card is an H700 PERC, with 12.10.3 firmware, in a Dell R515.
>>> Running FreeBSD 9.1-RC1, latest zfs and zpool versions.
>>>
>>> I have tried disabling the cache (mfiutil cache xxx disable). I have
>>> also gone into the card settings and changed, under advanced settings,
>>> "adaptive forward read" to "read only".
>>>
>>> Any help, appreciated.
>>>
>>> Thanks

--
Jürgen Weber
Systems Engineer
IT Infrastructure Team Leader
THE ICONIC | E jurgen.weber@theiconic.com.au | www.theiconic.com.au
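A note on the tunables being tried above: they can be tested one boot at a time from the loader prompt, without editing /boot/loader.conf on a system that will not come up. A sketch, using only the values already mentioned in this thread rather than any recommendation of my own:

    # escape to the loader prompt from the boot menu, then:
    OK set vfs.zfs.zil_disable=1
    OK set vfs.zfs.prefetch_disable=1
    OK set vfs.zfs.txg.timeout=5
    OK boot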
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 01:36:24 2012
From: Jürgen Weber
To: freebsd-fs@freebsd.org
Date: Mon, 22 Oct 2012 12:30:02 +1100
Subject: Re: mfi0 timeout error zfs boot mount problem
Message-ID: <5084A19A.5050905@theiconic.com.au>
In-Reply-To: <508471E0.9010805@theiconic.com.au>

Some more updates! At the boot loader I have also tried:

kern.maxfiles=5000000
kern.maxvnodes=5000000

I have also gone into the card settings BIOS and changed, under advanced settings, "Forward Read" to "none".

Now the system gets to "Trying to mount root from zfs:tank/root []....." and then, after maybe 1 to 5 minutes, the next couple of lines load like it's working!
eg: "Setting hostuuid: xxxxx" "Setting hostid: xxxxxx" "Entropy harvesting:interrupts ethernet point_to_point kickstart" "Starting file system checks:" "Mounting local file systems:." and stops. I have had the machine on my desk all morning observing it and I can see the disk access is going crazy,, it is doing something. I have found this article: http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe I have a 15TB file system which has dedup on from the start (10TB. I feel its trying to load the DDT and its going to swap/there is not enough RAM (only have 16GB's). Hopefully my 64GB RAM upgrade is enough. Thanks Jurgen On 22/10/12 09:06, Jürgen Weber wrote: > This is still a problem for me, is anyone there? :) > > I have tried the following at the bootime loader. > > vfs.zfs.zil_disable="1" > vfs.zfs.prefetch_disable="1" > vfs.zfs.txg.timeout="5" > > Any other suggestions on how to get this zpool to import and mount again? > > Thanks > > On 21/10/12 07:50, Jurgen Weber wrote: >> Hi >> >> Lastly, is there a way at boot time, some sysctl's or something I can >> set to bring zfs to a minimalistic state? Turn off features, etc to >> get this to mount? >> >> Any ideas appreciated. >> >> Thanks >> >> Jurgen >> On 20/10/2012 9:02 AM, Jurgen Weber wrote: >>> Guys >>> >>> Some more details on this, some insight would be greatly appreciated. >>> >>> As my day wore on trying to get this zpool to import or mount I have >>> learnt a few things. I think over time this issue has came about as >>> more and more data was added to the file systems. >>> >>> Some further details: >>> >>> Its a 8 disk raidz pool that the system boots from as well. The disk >>> are all 2TB. >>> The server has 16GB Of RAM, I notcied the day before this happen the >>> server was struggling with its RAM griding to a halt and dumping its >>> RAM. >>> The issue is not hardware because I found another server (same one) >>> swapped the harddrives out took another 8GB of RAM and I have the >>> same problem. >>> The main data file systems have dedup and gzip compression on. >>> >>> I have booted from open/Oracle Solars 11 adn attempted to import and >>> the Solaris live CD will not import either. In the Solaris system >>> the disk detach from the system. >>> >>> I get the feeling that ZFS is hitting some root limit when >>> attempting to mount and its not finishing the job. >>> >>> Thanks >>> >>> Jurgen >>> >>> On 19/10/2012 10:29 AM, Jürgen Weber wrote: >>>> Team >>>> >>>> I have googled around for a solution and I see a lot of posts about >>>> firmware versions and patches for FreeBSD 8.*. >>>> >>>> I have a FreeBSD 9.1rc1 system, which was beta1 orginally and has >>>> been running for months. >>>> >>>> Now it will not boot, I get the following: >>>> >>>> "Trying to mount root from zfs:tank/root []..... >>>> mfi0: COMMAND 0Xffffff8000cb83530 TIMEOUT AFTER xxx SECONDS >>>> (this just repeats). >>>> >>>> I have not seen this error before during normal runtime, _only_ >>>> during boot. >>>> >>>> Originally when I had the problem I could boot off a USB stick >>>> (9.1beta1 or rc1), run a 'zpool import -f tank' and it would work >>>> on the livecd. Rebooting and the main system would work. >>>> >>>> This time this work around does not work for me. When I am on the >>>> USB stick I can run a 'zpool import' and all of the disk are >>>> recognised, the pool is recognised and the file system is healthy. >>>> >>>> The Card is a H700 PERC, with 12.10.3 firmware in a Dell R515. 
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 02:42:13 2012
From: Freddie Cash
To: Dennis Glatting
Cc: freebsd-fs@freebsd.org
Date: Sun, 21 Oct 2012 19:42:11 -0700
Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2))

On Oct 21, 2012 8:54 AM, "Dennis Glatting" wrote:
> [...]
> Perhaps I am doing something fundamentally wrong with my SSDs. Currently
> I simply add them to a pool after being ashift-aligned via gnop (e.g.,
> -S 4096, depending on page size).
>
> I remember reading somewhere about offsets to ensure data is page
> aligned but, IIRC, this was strictly a performance issue. Are you doing
> something different?

All my hard disks are partitioned the same way:

# gpart create -s gpt daX
# gpart add -b 2048 -t freebsd-zfs -l some-label daX

For the SSDs, the above is followed by multiple partitions that are on MB boundaries.
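To make the "multiple partitions on MB boundaries" concrete, a layout along the following lines is one way to do it. The device name, labels, and sizes here are purely illustrative, not the poster's actual values; starting at block 2048 (1 MB) and using sizes that are whole gigabytes keeps each subsequent partition MB-aligned, and the log and cache partitions can then be handed to an existing pool:

    # GPT-label an SSD and carve it up on MB-aligned boundaries
    gpart create -s gpt da11
    gpart add -b 2048 -s 16g -t freebsd-zfs -l ssd-os    da11
    gpart add         -s 8g  -t freebsd-zfs -l ssd-log   da11
    gpart add         -s 32g -t freebsd-zfs -l ssd-cache da11
    # attach the log and cache partitions to an existing pool
    zpool add tank log   gpt/ssd-log
    zpool add tank cache gpt/ssd-cache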
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 03:15:35 2012
From: Dennis Glatting
To: Andriy Gapon
Cc: freebsd-fs@freebsd.org
Date: Sun, 21 Oct 2012 20:15:26 -0700
Subject: Discovered strangeness (Was: ZFS hang status update)
Message-ID: <1350875726.86715.134.camel@btw.pki2.com>

As noted in my previous email, camcontrol against the SSD (da0) would hang, and did so across a reboot. I decided to remove the SSD from the system.

When I disconnected the SSD and rebooted, the boot process included these messages:

run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_config

The system would eventually continue but hang later in the boot sequence, not reaching the command prompt, at this point:

Timecounter "TSC-low" frequency 8594011 Hz quality 800

I removed power from the system and tried again. No luck. I reconnected the SSD and rebooted in verbose mode, and eventually got this:

Timecounter "TSC-low" frequency 8594011 Hz quality 800
GEOM_PART: partition 1 is not aligned on 4096 bytes
GEOM_PART: partition 2 is not aligned on 4096 bytes

What I eventually discovered is that one of the two disks of the OS RAID1 array is suddenly toast. Maybe this is coincidence, but could it be the driver is confusing the two LSI chips?

I am in the process of rebuilding this system.

BTW, I installed ZFS-on-Linux under CentOS 6.3 on one of my other systems, which would spontaneously reboot when I issued a "zfs send" of a data set to it from another system. That system was issued a job with substantial load and has been up for only four hours. It'll be interesting to see if anything happens.
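A side note on the GEOM_PART warnings above: whether a GPT partition is 4K-aligned can be read straight off gpart's output, since a start offset (counted in 512-byte sectors) that is divisible by 8 is 4K-aligned. A quick check, with da1 standing in for whichever disk raised the warning:

    # list partition start offsets and sizes, in sectors
    gpart show da1
    # the same, keyed by provider name (da1p1, da1p2, ...)
    gpart show -p da1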
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 07:33:57 2012
From: Daniel Kalchev
To: freebsd-fs@freebsd.org
Date: Mon, 22 Oct 2012 10:33:41 +0300
Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2))
Message-ID: <5084F6D5.5080400@digsys.bg>

On 21.10.12 09:52, Freddie Cash wrote:
[...]
> All three run without any serious issues. The only issues we've had
> are 3, maybe 4, situations where I've tried to destroy multi-TB
> filesystems without enough RAM in the machine. We're now running a
> minimum of 32 GB of RAM with 64 GB in one box.

What is the firmware on your LSI2008 controllers?

I am having a weird situation with one server that has an LSI2008, on 9-stable and an all-SSD configuration. One or two of the drives will drop off the bus for no reason, sometimes a few times a day, and because the current driver ignores bus resets, someone has to physically remove and re-insert the drives for them to come back. A real pain.

My firmware version is 12.00.00.00 -- perhaps it is buggy?

Daniel
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 08:32:52 2012
From: "Steven Hartland"
To: Jürgen Weber, freebsd-fs@freebsd.org
Date: Mon, 22 Oct 2012 09:32:40 +0100
Subject: Re: mfi0 timeout error zfs boot mount problem
Message-ID: <79E4892EEEC94DBEB60C8DC237089F15@multiplay.co.uk>

I'm not aware of anything. You could try booting from an mfsBSD CD and seeing if that can access your array. You can download the CD from here:

http://mfsbsd.vx.sk/

    Regards
    Steve

----- Original Message -----
From: "Jürgen Weber"
Sent: Sunday, October 21, 2012 11:06 PM
Subject: Re: mfi0 timeout error zfs boot mount problem

This is still a problem for me -- is anyone there? :)

I have tried the following at the boot-time loader.

vfs.zfs.zil_disable="1"
vfs.zfs.prefetch_disable="1"
vfs.zfs.txg.timeout="5"

Any other suggestions on how to get this zpool to import and mount again?
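If the pool does import from rescue media such as mfsBSD, a cautious sequence is to import it under an alternate root and without mounting any datasets, so that whatever stalls the normal boot (e.g., a huge DDT load on mount) cannot take the rescue environment down with it. A sketch, using the pool name 'tank' from the thread; note that the read-only import option may not be available on older ZFS versions:

    # import the pool under /mnt without mounting any datasets
    zpool import -f -N -R /mnt tank
    # or, if the ZFS version supports it, import read-only as well
    zpool import -f -o readonly=on -N -R /mnt tank
    # then mount individual datasets one at a time as needed
    zfs mount tank/root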
Elsukov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:15.0) Gecko/20121010 Thunderbird/15.0.1 MIME-Version: 1.0 To: David Wimsey Subject: Re: gptzfsboot very slow References: <5082D202.9010701@FreeBSD.org> <211EBAB0-5105-4106-A3CF-30E4D08301DF@rtsz.com> In-Reply-To: <211EBAB0-5105-4106-A3CF-30E4D08301DF@rtsz.com> X-Enigmail-Version: 1.4.3 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: "freebsd-fs@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 08:44:14 -0000 On 21.10.2012 02:02, David Wimsey wrote: > The new zfsloader fixed the problem, the twiddle keeps spinning fast > and only takes a few seconds before jumping to the boot menu. > > Thanks! > > Just out of curiosity, do you know what change fixed it or what > exactly the old loader was doing? Hi David, There were many changes in the loader and zfsloader last several months, so it is not one commit. I'm planning merge them to stable in a week or two, after i'll fix the last one issue. -- WBR, Andrey V. Elsukov From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 11:06:34 2012 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E393424E for ; Mon, 22 Oct 2012 11:06:34 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.FreeBSD.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id C855F8FC0A for ; Mon, 22 Oct 2012 11:06:34 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9MB6YPn044398 for ; Mon, 22 Oct 2012 11:06:34 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9MB6YmI044396 for freebsd-fs@FreeBSD.org; Mon, 22 Oct 2012 11:06:34 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 22 Oct 2012 11:06:34 GMT Message-Id: <201210221106.q9MB6YmI044396@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 11:06:35 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- o kern/172348 fs [unionfs] umount -f of filesystem in use with readonly o kern/172334 fs [unionfs] unionfs permits recursive union mounts; caus o kern/172259 fs [zfs] [patch] ZFS fails to receive valid snapshots (pa o kern/171626 fs [tmpfs] tmpfs should be noisier when the requested siz o kern/171415 fs [zfs] zfs recv fails with "cannot receive incremental o kern/170945 fs [gpt] disk layout not portable between direct connect o kern/170914 fs [zfs] [patch] Import patchs related with issues 3090 a o kern/170912 fs [zfs] [patch] unnecessarily setting DS_FLAG_INCONSISTE o bin/170778 fs [zfs] [panic] FreeBSD panics randomly o kern/170680 fs [nfs] Multiple NFS Client bug in the FreeBSD 7.4-RELEA o kern/170497 fs [xfs][panic] kernel will panic whenever I ls a mounted o kern/170238 fs [zfs] [panic] Panic when deleting data o kern/169945 fs [zfs] [panic] Kernel panic while importing zpool (afte o kern/169480 fs [zfs] ZFS stalls on heavy I/O o kern/169398 fs [zfs] Can't remove file with permanent error o kern/169339 fs panic while " : > /etc/123" o kern/169319 fs [zfs] zfs resilver can't complete o kern/168947 fs [nfs] [zfs] .zfs/snapshot directory is messed up when o kern/168942 fs [nfs] [hang] nfsd hangs after being restarted (not -HU o kern/168158 fs [zfs] incorrect parsing of sharenfs options in zfs (fs o kern/167979 fs [ufs] DIOCGDINFO ioctl does not work on 8.2 file syste o kern/167977 fs [smbfs] mount_smbfs results are differ when utf-8 or U o kern/167688 fs [fusefs] Incorrect signal handling with direct_io o kern/167685 fs [zfs] ZFS on USB drive prevents shutdown / reboot o kern/167612 fs [portalfs] The portal file system gets stuck inside po o kern/167272 fs [zfs] ZFS Disks reordering causes ZFS to pick the wron o kern/167260 fs [msdosfs] msdosfs disk was mounted the second time whe o kern/167109 fs [zfs] [panic] zfs diff kernel panic Fatal trap 9: gene o kern/167105 fs [nfs] mount_nfs can not handle source exports wiht mor o kern/167067 fs [zfs] [panic] ZFS panics the server o kern/167066 fs [zfs] ZVOLs not appearing in /dev/zvol o kern/167065 fs [zfs] boot fails when a spare is the boot disk o kern/167048 fs [nfs] [patch] RELEASE-9 crash when using ZFS+NULLFS+NF o kern/166912 fs [ufs] [panic] Panic after converting Softupdates to jo o kern/166851 fs [zfs] [hang] Copying directory from the mounted UFS di o kern/166477 fs [nfs] NFS data corruption. 
o kern/165950 fs [ffs] SU+J and fsck problem o kern/165923 fs [nfs] Writing to NFS-backed mmapped files fails if flu o kern/165521 fs [zfs] [hang] livelock on 1 Gig of RAM with zfs when 31 o kern/165392 fs Multiple mkdir/rmdir fails with errno 31 o kern/165087 fs [unionfs] lock violation in unionfs o kern/164472 fs [ufs] fsck -B panics on particular data inconsistency o kern/164370 fs [zfs] zfs destroy for snapshot fails on i386 and sparc o kern/164261 fs [nullfs] [patch] fix panic with NFS served from NULLFS o kern/164256 fs [zfs] device entry for volume is not created after zfs o kern/164184 fs [ufs] [panic] Kernel panic with ufs_makeinode o kern/163801 fs [md] [request] allow mfsBSD legacy installed in 'swap' o kern/163770 fs [zfs] [hang] LOR between zfs&syncer + vnlru leading to o kern/163501 fs [nfs] NFS exporting a dir and a subdir in that dir to o kern/162944 fs [coda] Coda file system module looks broken in 9.0 o kern/162860 fs [zfs] Cannot share ZFS filesystem to hosts with a hyph o kern/162751 fs [zfs] [panic] kernel panics during file operations o kern/162591 fs [nullfs] cross-filesystem nullfs does not work as expe o kern/162519 fs [zfs] "zpool import" relies on buggy realpath() behavi o kern/162362 fs [snapshots] [panic] ufs with snapshot(s) panics when g o kern/161968 fs [zfs] [hang] renaming snapshot with -r including a zvo p kern/161897 fs [zfs] [patch] zfs partition probing causing long delay o kern/161864 fs [ufs] removing journaling from UFS partition fails on o bin/161807 fs [patch] add option for explicitly specifying metadata o kern/161579 fs [smbfs] FreeBSD sometimes panics when an smb share is o kern/161533 fs [zfs] [panic] zfs receive panic: system ioctl returnin o kern/161438 fs [zfs] [panic] recursed on non-recursive spa_namespace_ o kern/161424 fs [nullfs] __getcwd() calls fail when used on nullfs mou o kern/161280 fs [zfs] Stack overflow in gptzfsboot o kern/161205 fs [nfs] [pfsync] [regression] [build] Bug report freebsd o kern/161169 fs [zfs] [panic] ZFS causes kernel panic in dbuf_dirty o kern/161112 fs [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3 o kern/160893 fs [zfs] [panic] 9.0-BETA2 kernel panic o kern/160860 fs [ufs] Random UFS root filesystem corruption with SU+J o kern/160801 fs [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o o kern/160790 fs [fusefs] [panic] VPUTX: negative ref count with FUSE o kern/160777 fs [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo o kern/160706 fs [zfs] zfs bootloader fails when a non-root vdev exists o kern/160591 fs [zfs] Fail to boot on zfs root with degraded raidz2 [r o kern/160410 fs [smbfs] [hang] smbfs hangs when transferring large fil o kern/160283 fs [zfs] [patch] 'zfs list' does abort in make_dataset_ha o kern/159930 fs [ufs] [panic] kernel core o kern/159402 fs [zfs][loader] symlinks cause I/O errors o kern/159357 fs [zfs] ZFS MAXNAMELEN macro has confusing name (off-by- o kern/159356 fs [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s o kern/159351 fs [nfs] [patch] - divide by zero in mountnfs() o kern/159251 fs [zfs] [request]: add FLETCHER4 as DEDUP hash option o kern/159077 fs [zfs] Can't cd .. with latest zfs version o kern/159048 fs [smbfs] smb mount corrupts large files o kern/159045 fs [zfs] [hang] ZFS scrub freezes system o kern/158839 fs [zfs] ZFS Bootloader Fails if there is a Dead Disk o kern/158802 fs amd(8) ICMP storm and unkillable process. 
o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o f kern/157929 fs [nfs] NFS slow read o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and o kern/156781 fs [zfs] zfs is losing the snapshot directory, p kern/156545 fs [ufs] mv could break UFS on SMP systems o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current o kern/155587 fs [zfs] [panic] kernel panic with zfs p kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors o bin/155104 fs [zfs][patch] use /dev prefix by default when importing o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN o kern/154828 fs [msdosfs] Unable to create directories on external USB o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1 p kern/154228 fs [md] md getting stuck in wdrain state o kern/153996 fs [zfs] zfs root mount error while kernel is not located o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u o kern/153716 fs [zfs] zpool scrub time remaining is incorrect o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions o kern/153520 fs [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol o kern/153351 fs [zfs] locking directories/files in ZFS o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation' s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w o bin/153142 fs [zfs] ls -l outputs `ls: ./.zfs: Operation not support o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small o kern/152022 fs [nfs] nfs service hangs with linux client [regression] o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory o kern/151905 fs [zfs] page fault under load in /sbin/zfs o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl o kern/151648 fs [zfs] disk wait bug o kern/151629 fs [fs] [patch] Skip empty directory entries during name o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate o kern/151251 fs [ufs] Can not create files on filesystem with heavy us o kern/151226 fs [zfs] can't delete zfs snapshot o kern/151111 fs [zfs] vnodes leakage during zfs unmount o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64 o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n o kern/149208 fs mksnap_ffs(8) hang/deadlock o kern/149173 fs [patch] [zfs] make OpenSolaris installa o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 
fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different " p kern/147560 fs [zfs] [boot] Booting 8.1-PRERELEASE raidz system take o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly o kern/146786 fs [zfs] zpool import hangs with checksum errors o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an f bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141897 fs [msdosfs] [panic] Kernel panic. 
msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/138662 fs [panic] ffs_blkfree: freeing free block o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem 
locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118318 fs [nfs] NFS server hangs under special circumstances o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. 
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 293 problems total. From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 13:21:24 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AABFD71E for ; Mon, 22 Oct 2012 13:21:24 +0000 (UTC) (envelope-from feld@feld.me) Received: from feld.me (unknown [IPv6:2607:f4e0:100:300::2]) by mx1.freebsd.org (Postfix) with ESMTP id 689FE8FC17 for ; Mon, 22 Oct 2012 13:21:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=feld.me; s=blargle; h=In-Reply-To:Message-Id:From:Mime-Version:Date:References:Subject:Cc:To:Content-Type; bh=aM8Z7R75Wzzmd7G4IPK30sPaVaFw0/7OM9xq68T34/Q=; b=NrWK6EcTb60oqEuFS+Cv8I36p1P0ziHLr1SIB8PIL9HHbGSlNWHehKiPuocZE5VBVUWAuagIM5+WDlvA2vkiS69cRG7mq274eyLX62r3Y/UEd5+5PAg9Sc3V5ljxekSF; Received: from localhost ([127.0.0.1] helo=mwi1.coffeenet.org) by feld.me with esmtp (Exim 4.80 (FreeBSD)) (envelope-from ) id 1TQHwP-000IYc-9k; Mon, 22 Oct 2012 08:21:18 -0500 Received: from feld@feld.me by mwi1.coffeenet.org (Archiveopteryx 3.1.4) with esmtpa id 1350912067-65253-65252/5/1; Mon, 22 Oct 2012 13:21:07 +0000 Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes To: Dustin Wenz , Olivier Smedts , Steven Hartland Subject: Re: Imposing ZFS latency limits References: <6116A56E-4565-4485-887E-46E3ED231606@ebureau.com> <089898A4493042448C934643FD5C3887@multiplay.co.uk> Date: Mon, 22 Oct 2012 08:21:07 -0500 Mime-Version: 1.0 From: Mark Felder Message-Id: In-Reply-To: <089898A4493042448C934643FD5C3887@multiplay.co.uk> User-Agent: Opera Mail/12.10 (FreeBSD) X-SA-Report: ALL_TRUSTED=-1, KHOP_THREADED=-0.5 X-SA-Score: -1.5 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 13:21:24 -0000 On Tue, 16 Oct 2012 10:46:00 -0500, Steven Hartland wrote: > > Interesting, what metrics where you using which made it easy to detect, > work be nice to know your process there Mark? One reason is that our virtual machine performance gets awful and we get alerted for higher than usual load and/or disk io latency by the hypervisor. Another thing we've implemented is watching for some SCSI errors on the server too. They seem to let us know before it really gets bad. 
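(A rough sketch of that kind of watch, not the poster's actual tooling: a cron job that greps the kernel message buffer for CAM/SCSI complaints and checks overall pool health. The patterns here are assumptions; match them to whatever your controllers actually log.)

# dmesg | egrep -i 'cam status|scsi sense|unit attention|timed out'
# zpool status -x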
It's nice knowing ZFS is doing everything within its power to read the data off the disk, but when there's a fully intact raidz it should be smart enough to kick a disk out that's being problematic. From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 13:47:05 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9ED5D640 for ; Mon, 22 Oct 2012 13:47:05 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com [209.85.217.182]) by mx1.freebsd.org (Postfix) with ESMTP id 0AFFB8FC18 for ; Mon, 22 Oct 2012 13:47:04 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id b5so2136657lbd.13 for ; Mon, 22 Oct 2012 06:47:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=GFFMkO/FpzDh/5wVhzYBOb9PiVgof9tlMKs3ZOgIGFA=; b=OuIeNwaAg+6mWRhhZ2wzFrMR00erpURwENP/v5TLRCSi7+hznfAW0aFdf9l7pY2kYu R8THI3QGFAvzOruhmSnKq47hab0TeyZTQxo/+FIwdl3pf57UZQXcOLOHddwmf3lGO9R1 u5LhD5OcaTgt44ZFfujtXYmEgUTSqln7aKh3NdwIxql292Db2NJD8T2O1nHy3G1Nb6wU +yN5Abu50lwefrydGNa+kT+DMEsyzLNF33uKlrWhH5X5hhWSKefQAJtVMbwpPN+F3hoo Zje8BryqkfELNe18F2TIoqp0Hty16EYwrRyFtGCA7F+7Nn39384YesYDRZTf/L7+R9p9 s8bg== MIME-Version: 1.0 Received: by 10.112.83.73 with SMTP id o9mr3684312lby.128.1350913623592; Mon, 22 Oct 2012 06:47:03 -0700 (PDT) Received: by 10.114.24.66 with HTTP; Mon, 22 Oct 2012 06:47:03 -0700 (PDT) Received: by 10.114.24.66 with HTTP; Mon, 22 Oct 2012 06:47:03 -0700 (PDT) In-Reply-To: <5084F6D5.5080400@digsys.bg> References: <1350698905.86715.33.camel@btw.pki2.com> <1350711509.86715.59.camel@btw.pki2.com> <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <5084F6D5.5080400@digsys.bg> Date: Mon, 22 Oct 2012 06:47:03 -0700 Message-ID: Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) From: Freddie Cash To: Daniel Kalchev Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 13:47:05 -0000 I'll double-check when I get to work, but I'm pretty sure it's 10.something. On Oct 22, 2012 12:34 AM, "Daniel Kalchev" wrote: > > > On 21.10.12 09:52, Freddie Cash wrote: > > [...] > >> All three run without any serious issues. The only issues we've had are >> 3, maybe 4, situations where I've tried to destroy multi-TB filesystems >> without enough RAM in the machine. We're now running a minimum of 32 GB of >> RAM with 64 GB in one box. >> > > What is the firmware on your LSI2008 controllers? > > I am having weird situation with one server that has LSI2008, on 9-stable > and all SSD configuration. One or two of the drives would drop off the bus > for no reason sometimes few times a day and because the current driver > ignores bus reset, someone has to physically remove and re-insert the > drives for them to come back. Real pain. > My firmware version is 12.00.00.00 -- perhaps it is buggy? 
> > Daniel > ______________________________**_________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/**mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@**freebsd.org > " >
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 14:40:02 2012 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 83F307BE for ; Mon, 22 Oct 2012 14:40:02 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.FreeBSD.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id 5E2F48FC14 for ; Mon, 22 Oct 2012 14:40:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9MEe2cm065457 for ; Mon, 22 Oct 2012 14:40:02 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9MEe2qi065456; Mon, 22 Oct 2012 14:40:02 GMT (envelope-from gnats) Date: Mon, 22 Oct 2012 14:40:02 GMT Message-Id: <201210221440.q9MEe2qi065456@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: Andrey Simonenko Subject: Re: kern/136865: [nfs] [patch] NFS exports atomic and on-the-fly atomic updates X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Andrey Simonenko List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 14:40:02 -0000
The following reply was made to PR kern/136865; it has been noted by GNATS.
From: Andrey Simonenko To: Martin Birgmeier Cc: bug-followup@FreeBSD.org Subject: Re: kern/136865: [nfs] [patch] NFS exports atomic and on-the-fly atomic updates Date: Mon, 22 Oct 2012 17:35:52 +0300
On Sat, Oct 20, 2012 at 11:43:17AM +0200, Martin Birgmeier wrote: > Andrey, > > I'd really like to use this. However, I need to use it with FreeBSD 7.4, > 8.2, and 9.0 (and 9.1 in the near future); I tried to backport your > changes, but this turned out to be too difficult for me.
The nfse utility can be built on 8.x, 9.x and 10.x. To build nfse on 7.x it is necessary to copy in a <sys/queue.h> that has the LIST_SWAP() macro definition. The semantics of getgrouplist(3) changed in 8.x and 9.x, so on 7.x it is necessary to use -mapall and -maproot with a list of groups (-mapall=user:group); otherwise nfse will report duplicated groups for "-mapall=user".
The src/sys.diff changes are more complex. First of all, NFSv4 support and sys/fs/nfs* were added in 8.0 and some parts of sys/nfs* were changed. Having checked the differences between 7.x and 8.x-10.x in the NFS-related parts, I think supporting 7.x would require rewriting all of the changes in sys.diff. Right now it is possible to apply sys.diff to RELENG_9; only one simple change is rejected. I applied sys.diff to 8.2 and there were several rejected files, so I made a sys.diff for 8.2 and sent it to you in a private message (it builds, but I did not run it). I did not make the changes for cddl/ from 8.2; this is not necessary. Modification of etc/ is not complex.
> > I believe your work could be more easily adopted (even into the core > FreeBSD sources) provided that > - patches for all supported branches of FreeBSD were available and
I am of the opinion that major changes should go to CURRENT first and only then be MFCed to other releases if necessary.
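(For the RELENG_9 case described above, the mechanics come down to roughly the following, run from a stable/9 source checkout. The source path, patch level and kernel config name are assumptions for illustration, not taken from the message; the -C pass is a dry run, since one hunk is expected to be rejected.)

# cd /usr/src
# patch -C -p0 < sys.diff
# patch -p0 < sys.diff
# make buildkernel KERNCONF=GENERIC
# make installkernel KERNCONF=GENERIC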
> - there existed a simple knob in rc.conf where one switches between > the old mountd and the new nfse
It works like this: set mountd_enable="NO" and nfse_enable="YES"; if compatibility with an existing exports(5) configuration is required, also set nfse_exports_compat="YES".
> > I guess that for the latter you'd also need to introduce some > compatibility shims into your kernel changes, such that a single kernel > could support both methods.
Does "both methods" mean "mountd and nfse"? If so, then it already works like this. By default VFS_CHECKEXP() is used (mountd mode), and if nfse is called at least once, the NFSE code in the NFS server is used. A user can continue to use mountd on a modified system.
> > What is your opinion? >
There is an opinion that nfse in its compatibility mode is not compatible with mountd and its configuration. This opinion exists because the first versions of the NFSE changes were incompatible with existing exports(5) configurations, and I explained to several committers why the exports(5) format is wrong and why the nfs.exports(5) format was created. The compatibility mode was then implemented and improved several times. I wrote a comparison of the nfse compatibility mode with mountd and exports(5): http://nfse.sourceforge.net/COMPATIBILITY
According to my understanding of the exports(5) format, I can say that "nfse -C" is completely compatible with exports(5) rules and mountd is incompatible with exports(5) rules. Please note that I do not compare nfs.exports(5) with exports(5), because they are not compatible. The compatibility mode was implemented because I was told that support for existing exports(5) configurations is required. I also made several changes to mountd to make it more compatible with exports(5) (available in PRs), or at least to restore the compatibility with exports(5) that it had several years ago. I asked whether these changes are POLA violations here: http://lists.freebsd.org/pipermail/freebsd-fs/2012-September/015175.html
I have not received any replies to my requests to check the nfse compatibility mode against existing configurations, so I think there is no opinion in the community about the nfse compatibility mode with exports(5) configurations. I have also not received any replies about the nfs.exports(5) configuration format, nor any technical opinions or review of the NFSE changes in the kernel.
I will repeat it one more time: it is not necessary to modify the kernel or to patch or install anything in order to check the nfse compatibility mode and/or the nfs.exports(5) format; just compile the nfse utility and use it to verify existing exports(5) configurations.
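(For reference, the rc.conf switch described above comes down to the lines below; the three knob names are the ones quoted in the message, the comments are mine.)

# /etc/rc.conf: run nfse instead of mountd
mountd_enable="NO"
nfse_enable="YES"
# only if the existing exports(5) file should be read in compatibility mode
nfse_exports_compat="YES"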
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 15:16:02 2012 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2E15A528 for ; Mon, 22 Oct 2012 15:16:02 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from people.fsn.hu (people.fsn.hu [195.228.252.137]) by mx1.freebsd.org (Postfix) with ESMTP id 063EE8FC08 for ; Mon, 22 Oct 2012 15:16:00 +0000 (UTC) Received: by people.fsn.hu (Postfix, from userid 1001) id 9F908E66293; Mon, 22 Oct 2012 17:15:51 +0200 (CEST) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.2 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR: 15.3466] X-CRM114-CacheID: sfid-20121022_17155_FB7A4A58 X-CRM114-Status: Good ( pR: 15.3466 ) X-DSPAM-Result: Whitelisted X-DSPAM-Processed: Mon Oct 22 17:15:51 2012 X-DSPAM-Confidence: 0.9961 X-DSPAM-Probability: 0.0000 X-DSPAM-Signature: 50856327748374770168120 X-DSPAM-Factors: 27, From*Attila Nagy , 0.00010, FreeBSD, 0.00041, FreeBSD, 0.00041, the+>, 0.00103, cache, 0.00201, wrote+>, 0.00230, and+>, 0.00247, >+I, 0.00265, >+I, 0.00265, I+>, 0.00302, disks, 0.00388, disks, 0.00388, debug, 0.00417, 01+00, 0.00417, the+code, 0.00452, snapshot, 0.00493, dev, 0.00542, the+machine, 0.00542, the+machine, 0.00542, cmd, 0.00542, 02+00, 0.00542, sendmail, 0.00542, sendmail, 0.00542, wrote, 0.00546, 15+0, 0.00602, root, 0.00656, X-Spambayes-Classification: ham; 0.00 Received: from [192.168.3.2] (japan.t-online.co.hu [195.228.243.99]) by people.fsn.hu (Postfix) with ESMTPSA id 8F9E9E66282; Mon, 22 Oct 2012 17:15:48 +0200 (CEST) Message-ID: <50856322.9070307@fsn.hu> Date: Mon, 22 Oct 2012 17:15:46 +0200 From: Attila Nagy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090817 Thunderbird/2.0.0.23 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Dennis Glatting Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) References: <1350698905.86715.33.camel@btw.pki2.com> <1350711509.86715.59.camel@btw.pki2.com> <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> In-Reply-To: <1350778257.86715.106.camel@btw.pki2.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 15:16:02 -0000 Hi, On 10/21/2012 02:10 AM, Dennis Glatting wrote: > I chosen the LSI2008 chip set because the code was donated by LSI, and > they therefore demonstrated interest in supporting their products under > FreeBSD, and that chip set is found in a lot of places, notably > Supermicro boards. Additionally, there were stories of success on the > lists for several boards. That said, I have received private email from > others expressing frustration with ZFS and the "hang" problems, which I > believe are also the LSI chips. > I have a Sun X4540, which shows similar symptoms. It has some (6) on-board LSI 1068E SAS controllers with 1.27.02.00-IT firmware (latest from Sun/Oracle) and 48 SATA disks. It runs stable/9@r240134. Currently the machine does a resilver on its 48 disk pool (heavy IO happens), which stops periodically. I've set up watchdogd with a command of "ls /data" (the pool is mounted there). 
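(A watchdogd(8) setup along those lines would look roughly like this in rc.conf; the -e/-t flags are my guess at how the "ls /data" probe was wired up, not something stated in the message.)

watchdogd_enable="YES"
# probe the pool mountpoint; reset the box if the command cannot complete in time
watchdogd_flags="-e 'ls /data' -t 60"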
It doesn't restart the machine when the IO freezes, because the command always succeeds (coming from cache I guess). But if something wants to touch the disks, it stucks in D state. zpool status shows: scan: resilver in progress since Sun Oct 21 15:40:50 2012 3.16T scanned out of 13.8T at 26.4M/s, 117h45m to go 133G resilvered, 22.82% done And the estimated time grows constantly. gstat shows no IO. If I issue an ls -R /data, it gets stuck: root 36217 0.0 0.0 14380 1800 3 D+ 4:45PM 0:00.00 ls -R /data/ # procstat -k 36217 PID TID COMM TDNAME KSTACK 36217 101469 ls - mi_switch sleepq_wait _cv_wait zio_wait dbuf_read dbuf_findbp dbuf_hold_impl dbuf_hold dmu_buf_hold zap_lockdir zap_cursor_retrieve zfs_freebsd_readdir kern_getdirentries sys_getdirentries amd64_syscall Xfast_syscall Also, a dd on any of the disks waits forever, without reading a single byte: root 36570 0.0 0.0 9876 1356 4 DL+ 4:46PM 0:00.00 dd if=/dev/da0 of=/dev/null # procstat -k 36570 PID TID COMM TDNAME KSTACK 36570 101489 dd - mi_switch sleepq_wait _sleep bwait physio devfs_read_f dofileread kern_readv sys_read amd64_syscall Xfast_syscall Camcontrol works: # camcontrol devlist at scbus0 target 0 lun 0 (pass0,da0) at scbus0 target 1 lun 0 (pass1,da1) at scbus0 target 2 lun 0 (pass2,da2) at scbus0 target 3 lun 0 (pass3,da3) at scbus0 target 4 lun 0 (pass4,da4) at scbus0 target 5 lun 0 (pass5,da5) at scbus0 target 6 lun 0 (pass6,da6) at scbus0 target 7 lun 0 (pass7,da7) at scbus1 target 0 lun 0 (pass8,da8) at scbus1 target 1 lun 0 (pass9,da9) at scbus1 target 2 lun 0 (pass10,da10) at scbus1 target 3 lun 0 (pass11,da11) at scbus1 target 4 lun 0 (pass12,da12) at scbus1 target 5 lun 0 (pass13,da13) at scbus1 target 6 lun 0 (pass14,da14) at scbus1 target 7 lun 0 (pass15,da15) at scbus2 target 0 lun 0 (pass16,da16) at scbus2 target 1 lun 0 (pass17,da17) at scbus2 target 2 lun 0 (pass18,da18) at scbus2 target 3 lun 0 (pass19,da19) at scbus2 target 4 lun 0 (pass20,da20) at scbus2 target 5 lun 0 (pass21,da21) at scbus2 target 6 lun 0 (pass22,da22) at scbus2 target 7 lun 0 (pass23,da23) at scbus3 target 0 lun 0 (pass24,da24) at scbus3 target 1 lun 0 (pass25,da25) at scbus3 target 2 lun 0 (pass26,da26) at scbus3 target 3 lun 0 (pass27,da27) at scbus3 target 4 lun 0 (pass28,da28) at scbus3 target 5 lun 0 (pass29,da29) at scbus3 target 6 lun 0 (pass30,da30) at scbus3 target 7 lun 0 (pass31,da31) at scbus4 target 0 lun 0 (pass32,da32) at scbus4 target 1 lun 0 (pass33,da33) at scbus4 target 2 lun 0 (pass34,da34) at scbus4 target 3 lun 0 (pass35,da35) at scbus4 target 4 lun 0 (pass36,da36) at scbus4 target 5 lun 0 (pass37,da37) at scbus4 target 6 lun 0 (pass38,da38) at scbus4 target 7 lun 0 (pass39,da39) at scbus5 target 0 lun 0 (pass40,da40) at scbus5 target 1 lun 0 (pass41,da41) at scbus5 target 2 lun 0 (pass42,da42) at scbus5 target 3 lun 0 (pass43,da43) at scbus5 target 4 lun 0 (pass44,da44) at scbus5 target 5 lun 0 (pass45,da45) at scbus5 target 6 lun 0 (pass46,da46) at scbus5 target 7 lun 0 (pass47,da47) # camcontrol tags da0 (pass0:mpt0:0:0:0): device openings: 255 Also works (I guess it doesn't touch the disks): # zfs list NAME USED AVAIL REFER MOUNTPOINT logpool 13.1T 7.17T 507K /data logpool/jail 7.08G 7.17T 7.08G /data/jail logpool/logs 13.1T 7.17T 3.40T /data/jail/logvm/logs logpool/logs/OTHER 9.24T 7.17T 2.36T /data/jail/logvm/logs/OTHER But this doesn't: root 36686 0.0 0.0 33384 2512 5 D+ 4:49PM 0:00.00 zfs list -t snapshot # procstat -k 36686 PID TID COMM TDNAME KSTACK 36686 101593 zfs - mi_switch sleepq_wait _cv_wait 
zio_wait dbuf_read dmu_buf_hold zap_lockdir zap_cursor_retrieve dmu_snapshot_list_next zfs_ioc_snapshot_list_next zfsdev_ioctl devfs_ioctl_f kern_ioctl sys_ioctl amd64_syscall Xfast_syscall Entering into the debugger: KDB: enter: sysctl debug.kdb.enter [ thread pid 36959 tid 101484 ] Stopped at kdb_enter+0x3b: movq $0,0x95ab72(%rip) db> ps pid ppid pgrp uid state wmesg wchan cmd 36959 1769 36959 0 R+ CPU 0 sysctl 36691 919 919 0 S sbwait 0xfffffe009d752144 perl 36686 36677 36686 0 D+ zio->io_ 0xfffffe001ccb7d70 zfs 36677 36208 36677 0 Ss+ pause 0xfffffe009d0030a0 csh 36570 36567 36570 0 DL+ physrd 0xffffff87005a2980 dd 36567 36208 36567 0 Ss+ pause 0xfffffe00115c4540 csh 36217 36209 36217 0 D+ zio->io_ 0xfffffe001c2b2320 ls 36209 36208 36209 0 Ss+ pause 0xfffffe022c8aa0a0 csh 36208 36207 36208 0 Ss select 0xfffffe0665c92e40 screen 36207 1782 36207 0 S+ pause 0xfffffe009d0010a0 screen 32921 883 873 0 DL cbwait 0xfffffe000f7f7848 camcontrol 1782 1780 1782 0 Ss+ pause 0xfffffe009d4559e0 csh 1780 897 1780 0 Ss select 0xfffffe001d546740 sshd 1776 1774 1776 0 Ss+ ttyin 0xfffffe001c02a4a8 csh 1774 897 1774 0 Ss select 0xfffffe001cb4d0c0 sshd 1769 1767 1769 0 Ss+ pause 0xfffffe001191a540 csh 1767 897 1767 0 Ss select 0xfffffe000fd72bc0 sshd 1079 1 1079 0 Ss+ ttyin 0xfffffe000c82c4a8 getty 1078 1 1078 0 Ss+ ttyin 0xfffffe000c82c8a8 getty 1077 1 1077 0 Ss+ ttyin 0xfffffe000c82cca8 getty 1076 1 1076 0 Ss+ ttyin 0xfffffe000c82d0a8 getty 1075 1 1075 0 Ss+ ttyin 0xfffffe000c82d4a8 getty 1074 1 1074 0 Ss+ ttyin 0xfffffe000c82d8a8 getty 1073 1 1073 0 Ss+ ttyin 0xfffffe000c82dca8 getty 1072 1 1072 0 Ss+ ttyin 0xfffffe000c82f0a8 getty 919 1 919 0 Ss select 0xfffffe000f5ac940 perl 907 1 907 0 Ss nanslp 0xffffffff81244f08 cron 903 1 903 25 Ss pause 0xfffffe001125e0a0 sendmail 900 1 900 0 Ss select 0xfffffe001d549340 sendmail 897 1 897 0 Ss select 0xfffffe001d546cc0 sshd 892 884 873 0 S piperd 0xfffffe001e940888 fghack 884 878 873 0 S wait 0xfffffe000fdee000 sh 883 879 873 0 S piperd 0xfffffe022c08b000 perl 879 875 873 0 S select 0xfffffe001ca6a8c0 supervise 878 875 873 0 S select 0xfffffe000fd73d40 supervise 876 1 873 0 S piperd 0xfffffe001e9c5b60 readproctitle 875 1 873 0 S nanslp 0xffffffff81244f08 svscan 870 868 867 123 S select 0xfffffe000fd934c0 ntpd 868 867 867 123 S select 0xfffffe001ca68e40 ntpd 867 1 867 0 Ss select 0xfffffe000fddd740 ntpd 796 0 0 0 DL mdwait 0xfffffe000f52a000 [md2] 774 1 774 53 Ss (threaded) named 101524 S kqread 0xfffffe00115dd100 named 101523 S uwait 0xfffffe000fde5200 named 101522 S uwait 0xfffffe00110ce680 named 101521 S uwait 0xfffffe000fda0300 named 101520 S uwait 0xfffffe000fddd380 named 101519 S uwait 0xfffffe001198ca00 named 101518 S uwait 0xfffffe000fd58880 named 101517 S uwait 0xfffffe000fd7ab80 named 101516 S uwait 0xfffffe000f80e480 named 101515 S uwait 0xfffffe000f80f400 named 101501 S sigwait 0xfffffe00110dd000 named 751 750 751 0 Ss select 0xfffffe001d549440 syslog-ng 750 1 749 0 S wait 0xfffffe000c8144a0 syslog-ng 612 608 608 64 S bpf 0xfffffe001ca94800 pflogd 608 1 608 0 Ss sbwait 0xfffffe001eb4ae8c pflogd 605 0 0 0 DL pftm 0xffffffff817547a0 [pfpurge] 78 0 0 0 DL (threaded) [zfskern] 101459 D spa->spa 0xfffffe0011462680 [txg_thread_enter] 101458 D tx->tx_q 0xfffffe001b199230 [txg_thread_enter] 100122 D l2arc_fe 0xffffffff8173ebc0 [l2arc_feed_thread] 100121 D arc_recl 0xffffffff8172ed20 [arc_reclaim_thread] 59 0 0 0 DL mdwait 0xfffffe000f521000 [md1] 47 0 0 0 DL mdwait 0xfffffe000f523800 [md0] 24 0 0 0 DL sdflush 0xffffffff812a6158 [softdepflush] 23 0 0 0 DL 
syncer 0xffffffff812928c0 [syncer] 22 0 0 0 DL vlruwt 0xfffffe000c80d000 [vnlru] 21 0 0 0 DL psleep 0xffffffff81292348 [bufdaemon] 20 0 0 0 DL pgzero 0xffffffff812b019c [pagezero] 19 0 0 0 DL psleep 0xffffffff812af368 [vmdaemon] 18 0 0 0 DL psleep 0xffffffff812af32c [pagedaemon] 17 0 0 0 DL ccb_scan 0xffffffff811ff260 [xpt_thrd] 16 0 0 0 DL idle 0xffffff8001df3000 [mpt_recovery5] 9 0 0 0 DL idle 0xffffff8001dde000 [mpt_recovery4] 8 0 0 0 DL idle 0xffffff8001dc9000 [mpt_recovery3] 7 0 0 0 DL idle 0xffffff8001daa000 [mpt_recovery2] 6 0 0 0 DL idle 0xffffff8001d95000 [mpt_recovery1] 5 0 0 0 DL idle 0xffffff8001d80000 [mpt_recovery0] 15 0 0 0 DL (threaded) [usb] 100048 D - 0xffffff8001d73e18 [usbus1] 100047 D - 0xffffff8001d73dc0 [usbus1] 100046 D - 0xffffff8001d73d68 [usbus1] 100045 D - 0xffffff8001d73d10 [usbus1] 100043 D - 0xffffff8001d6b460 [usbus0] 100042 D - 0xffffff8001d6b408 [usbus0] 100041 D - 0xffffff8001d6b3b0 [usbus0] 100040 D - 0xffffff8001d6b358 [usbus0] 4 0 0 0 DL ctl_work 0xffffff8000a41000 [ctl_thrd] 14 0 0 0 DL - 0xffffffff81243ba4 [yarrow] 3 0 0 0 DL crypto_r 0xffffffff812a4ae0 [crypto returns] 2 0 0 0 DL crypto_w 0xffffffff812a4aa0 [crypto] 13 0 0 0 DL (threaded) [geom] 100023 D - 0xffffffff8123d030 [g_down] 100022 D - 0xffffffff8123d028 [g_up] 100021 D - 0xffffffff8123d018 [g_event] 12 0 0 0 RL (threaded) [intr] 100065 I [swi0: uart] 100063 I [irq293: mpt5] 100061 I [irq292: mpt4] 100059 I [irq291: mpt3] 100055 I [irq274: mpt2] 100053 I [irq273: mpt1] 100051 I [irq272: mpt0] 100044 I [irq22: ehci0] 100039 I [irq21: ohci0] 100034 I [swi2: cambio] 100031 I [swi6: task queue] 100030 I [swi6: Giant taskq] 100028 I [swi5: +] 100020 I [swi1: netisr 0] 100019 I [swi4: clock] 100018 I [swi4: clock] 100017 I [swi4: clock] 100016 I [swi4: clock] 100015 I [swi4: clock] 100014 I [swi4: clock] 100013 I [swi4: clock] 100012 RunQ [swi4: clock] 100011 I [swi3: vm] 11 0 0 0 RL (threaded) [idle] 100010 Run CPU 7 [idle: cpu7] 100009 Run CPU 6 [idle: cpu6] 100008 Run CPU 5 [idle: cpu5] 100007 Run CPU 4 [idle: cpu4] 100006 Run CPU 3 [idle: cpu3] 100005 Run CPU 2 [idle: cpu2] 100004 Run CPU 1 [idle: cpu1] 100003 CanRun [idle: cpu0] 1 0 1 0 SLs wait 0xfffffe000c068940 [init] 10 0 0 0 DL audit_wo 0xffffffff812a50d0 [audit] 0 0 0 0 DLs (threaded) [kernel] 101463 D - 0xfffffe000fddab00 [zil_clean] 101462 D - 0xfffffe000fd6a800 [zil_clean] 101461 D - 0xfffffe000fdf6180 [zil_clean] 101460 D - 0xfffffe001d546600 [zil_clean] 101457 D - 0xfffffe000f359e00 [zfs_vn_rele_taskq] 101456 D - 0xfffffe001198d080 [zio_ioctl_intr] 101455 D - 0xfffffe001cb4fa80 [zio_ioctl_issue] 101454 D - 0xfffffe000ffbf380 [zio_claim_intr] 101453 D - 0xfffffe00110cf580 [zio_claim_issue] 101452 D - 0xfffffe00110cf880 [zio_free_intr] 101451 D - 0xfffffe000ffc1b80 [zio_free_issue_99] 101450 D - 0xfffffe000ffc1b80 [zio_free_issue_98] 101449 D - 0xfffffe000ffc1b80 [zio_free_issue_97] 101448 D - 0xfffffe000ffc1b80 [zio_free_issue_96] 101447 D - 0xfffffe000ffc1b80 [zio_free_issue_95] 101446 D - 0xfffffe000ffc1b80 [zio_free_issue_94] 101445 D - 0xfffffe000ffc1b80 [zio_free_issue_93] 101444 D - 0xfffffe000ffc1b80 [zio_free_issue_92] 101443 D - 0xfffffe000ffc1b80 [zio_free_issue_91] 101442 D - 0xfffffe000ffc1b80 [zio_free_issue_90] 101441 D - 0xfffffe000ffc1b80 [zio_free_issue_89] 101440 D - 0xfffffe000ffc1b80 [zio_free_issue_88] 101439 D - 0xfffffe000ffc1b80 [zio_free_issue_87] 101438 D - 0xfffffe000ffc1b80 [zio_free_issue_86] 101437 D - 0xfffffe000ffc1b80 [zio_free_issue_85] 101436 D - 0xfffffe000ffc1b80 [zio_free_issue_84] 101435 D - 
0xfffffe000ffc1b80 [zio_free_issue_83] 101434 D - 0xfffffe000ffc1b80 [zio_free_issue_82] 101433 D - 0xfffffe000ffc1b80 [zio_free_issue_81] 101432 D - 0xfffffe000ffc1b80 [zio_free_issue_80] 101431 D - 0xfffffe000ffc1b80 [zio_free_issue_79] 101430 D - 0xfffffe000ffc1b80 [zio_free_issue_78] 101429 D - 0xfffffe000ffc1b80 [zio_free_issue_77] 101428 D - 0xfffffe000ffc1b80 [zio_free_issue_76] 101427 D - 0xfffffe000ffc1b80 [zio_free_issue_75] 101426 D - 0xfffffe000ffc1b80 [zio_free_issue_74] 101425 D - 0xfffffe000ffc1b80 [zio_free_issue_73] 101424 D - 0xfffffe000ffc1b80 [zio_free_issue_72] 101423 D - 0xfffffe000ffc1b80 [zio_free_issue_71] 101422 D - 0xfffffe000ffc1b80 [zio_free_issue_70] 101421 D - 0xfffffe000ffc1b80 [zio_free_issue_69] 101420 D - 0xfffffe000ffc1b80 [zio_free_issue_68] 101419 D - 0xfffffe000ffc1b80 [zio_free_issue_67] 101418 D - 0xfffffe000ffc1b80 [zio_free_issue_66] 101417 D - 0xfffffe000ffc1b80 [zio_free_issue_65] 101416 D - 0xfffffe000ffc1b80 [zio_free_issue_64] 101415 D - 0xfffffe000ffc1b80 [zio_free_issue_63] 101414 D - 0xfffffe000ffc1b80 [zio_free_issue_62] 101413 D - 0xfffffe000ffc1b80 [zio_free_issue_61] 101412 D - 0xfffffe000ffc1b80 [zio_free_issue_60] 101411 D - 0xfffffe000ffc1b80 [zio_free_issue_59] 101410 D - 0xfffffe000ffc1b80 [zio_free_issue_58] 101409 D - 0xfffffe000ffc1b80 [zio_free_issue_57] 101408 D - 0xfffffe000ffc1b80 [zio_free_issue_56] 101407 D - 0xfffffe000ffc1b80 [zio_free_issue_55] 101406 D - 0xfffffe000ffc1b80 [zio_free_issue_54] 101405 D - 0xfffffe000ffc1b80 [zio_free_issue_53] 101404 D - 0xfffffe000ffc1b80 [zio_free_issue_52] 101403 D - 0xfffffe000ffc1b80 [zio_free_issue_51] 101402 D - 0xfffffe000ffc1b80 [zio_free_issue_50] 101401 D - 0xfffffe000ffc1b80 [zio_free_issue_49] 101400 D - 0xfffffe000ffc1b80 [zio_free_issue_48] 101399 D - 0xfffffe000ffc1b80 [zio_free_issue_47] 101398 D - 0xfffffe000ffc1b80 [zio_free_issue_46] 101397 D - 0xfffffe000ffc1b80 [zio_free_issue_45] 101396 D - 0xfffffe000ffc1b80 [zio_free_issue_44] 101395 D - 0xfffffe000ffc1b80 [zio_free_issue_43] 101394 D - 0xfffffe000ffc1b80 [zio_free_issue_42] 101393 D - 0xfffffe000ffc1b80 [zio_free_issue_41] 101392 D - 0xfffffe000ffc1b80 [zio_free_issue_40] 101391 D - 0xfffffe000ffc1b80 [zio_free_issue_39] 101390 D - 0xfffffe000ffc1b80 [zio_free_issue_38] 101389 D - 0xfffffe000ffc1b80 [zio_free_issue_37] 101388 D - 0xfffffe000ffc1b80 [zio_free_issue_36] 101387 D - 0xfffffe000ffc1b80 [zio_free_issue_35] 101386 D - 0xfffffe000ffc1b80 [zio_free_issue_34] 101385 D - 0xfffffe000ffc1b80 [zio_free_issue_33] 101384 D - 0xfffffe000ffc1b80 [zio_free_issue_32] 101383 D - 0xfffffe000ffc1b80 [zio_free_issue_31] 100569 D - 0xfffffe000ffc1b80 [zio_free_issue_30] 100567 D - 0xfffffe000ffc1b80 [zio_free_issue_29] 100565 D - 0xfffffe000ffc1b80 [zio_free_issue_28] 100560 D - 0xfffffe000ffc1b80 [zio_free_issue_27] 100554 D - 0xfffffe000ffc1b80 [zio_free_issue_26] 100553 D - 0xfffffe000ffc1b80 [zio_free_issue_25] 100547 D - 0xfffffe000ffc1b80 [zio_free_issue_24] 100545 D - 0xfffffe000ffc1b80 [zio_free_issue_23] 100542 D - 0xfffffe000ffc1b80 [zio_free_issue_22] 100539 D - 0xfffffe000ffc1b80 [zio_free_issue_21] 100536 D - 0xfffffe000ffc1b80 [zio_free_issue_20] 100530 D - 0xfffffe000ffc1b80 [zio_free_issue_19] 100487 D - 0xfffffe000ffc1b80 [zio_free_issue_18] 100415 D - 0xfffffe000ffc1b80 [zio_free_issue_17] 100413 D - 0xfffffe000ffc1b80 [zio_free_issue_16] 100407 D - 0xfffffe000ffc1b80 [zio_free_issue_15] 100403 D - 0xfffffe000ffc1b80 [zio_free_issue_14] 100400 D - 0xfffffe000ffc1b80 [zio_free_issue_13] 100393 D - 
0xfffffe000ffc1b80 [zio_free_issue_12] 100391 D - 0xfffffe000ffc1b80 [zio_free_issue_11] 100387 D - 0xfffffe000ffc1b80 [zio_free_issue_10] 100386 D - 0xfffffe000ffc1b80 [zio_free_issue_9] 100385 D - 0xfffffe000ffc1b80 [zio_free_issue_8] 100384 D - 0xfffffe000ffc1b80 [zio_free_issue_7] 100383 D - 0xfffffe000ffc1b80 [zio_free_issue_6] 100379 D - 0xfffffe000ffc1b80 [zio_free_issue_5] 100372 D - 0xfffffe000ffc1b80 [zio_free_issue_4] 100367 D - 0xfffffe000ffc1b80 [zio_free_issue_3] 100366 D - 0xfffffe000ffc1b80 [zio_free_issue_2] 100361 D - 0xfffffe000ffc1b80 [zio_free_issue_1] 100360 D - 0xfffffe000ffc1b80 [zio_free_issue_0] 100359 D - 0xfffffe001ca67280 [zio_write_intr_high] 100358 D - 0xfffffe001ca67280 [zio_write_intr_high] 100357 D - 0xfffffe001ca67280 [zio_write_intr_high] 100354 D - 0xfffffe001ca67280 [zio_write_intr_high] 100353 D - 0xfffffe001ca67280 [zio_write_intr_high] 100349 D - 0xfffffe000fd72700 [zio_write_intr_7] 100348 D - 0xfffffe000fd72700 [zio_write_intr_6] 100345 D - 0xfffffe000fd72700 [zio_write_intr_5] 100343 D - 0xfffffe000fd72700 [zio_write_intr_4] 100342 D - 0xfffffe000fd72700 [zio_write_intr_3] 100341 D - 0xfffffe000fd72700 [zio_write_intr_2] 100340 D - 0xfffffe000fd72700 [zio_write_intr_1] 100339 D - 0xfffffe000fd72700 [zio_write_intr_0] 100337 D - 0xfffffe001196ce00 [zio_write_issue_hig] 100336 D - 0xfffffe001196ce00 [zio_write_issue_hig] 100334 D - 0xfffffe001196ce00 [zio_write_issue_hig] 100330 D - 0xfffffe001196ce00 [zio_write_issue_hig] 100327 D - 0xfffffe001196ce00 [zio_write_issue_hig] 100324 D - 0xfffffe00110cfb00 [zio_write_issue_7] 100322 D - 0xfffffe00110cfb00 [zio_write_issue_6] 100321 D - 0xfffffe00110cfb00 [zio_write_issue_5] 100316 D - 0xfffffe00110cfb00 [zio_write_issue_4] 100314 D - 0xfffffe00110cfb00 [zio_write_issue_3] 100312 D - 0xfffffe00110cfb00 [zio_write_issue_2] 100311 D - 0xfffffe00110cfb00 [zio_write_issue_1] 100307 D - 0xfffffe00110cfb00 [zio_write_issue_0] 100306 D - 0xfffffe000ffbfc80 [zio_read_intr_7] 100305 D - 0xfffffe000ffbfc80 [zio_read_intr_6] 100303 D - 0xfffffe000ffbfc80 [zio_read_intr_5] 100300 D - 0xfffffe000ffbfc80 [zio_read_intr_4] 100298 D - 0xfffffe000ffbfc80 [zio_read_intr_3] 100297 D - 0xfffffe000ffbfc80 [zio_read_intr_2] 100293 D - 0xfffffe000ffbfc80 [zio_read_intr_1] 100292 D - 0xfffffe000ffbfc80 [zio_read_intr_0] 100291 D - 0xfffffe00110cf000 [zio_read_issue_7] 100289 D - 0xfffffe00110cf000 [zio_read_issue_6] 100288 D - 0xfffffe00110cf000 [zio_read_issue_5] 100286 D - 0xfffffe00110cf000 [zio_read_issue_4] 100282 D - 0xfffffe00110cf000 [zio_read_issue_3] 100281 D - 0xfffffe00110cf000 [zio_read_issue_2] 100280 D - 0xfffffe00110cf000 [zio_read_issue_1] 100278 D - 0xfffffe00110cf000 [zio_read_issue_0] 100275 D - 0xfffffe001113b500 [zio_null_intr] 100273 D - 0xfffffe001196c800 [zio_null_issue] 100120 D - 0xfffffe0011370300 [system_taskq_7] 100119 D - 0xfffffe0011370300 [system_taskq_6] 100118 D - 0xfffffe0011370300 [system_taskq_5] 100117 D - 0xfffffe0011370300 [system_taskq_4] 100116 D - 0xfffffe0011370300 [system_taskq_3] 100115 D - 0xfffffe0011370300 [system_taskq_2] 100114 D - 0xfffffe0011370300 [system_taskq_1] 100113 D - 0xfffffe0011370300 [system_taskq_0] 100066 D - 0xfffffe000f239a80 [mca taskq] 100058 D - 0xfffffe000c69b900 [nfe3 taskq] 100057 D - 0xfffffe000c698480 [nfe2 taskq] 100050 D - 0xfffffe000c620400 [nfe1 taskq] 100049 D - 0xfffffe000c61b500 [nfe0 taskq] 100037 D - 0xfffffe000c24bb00 [acpi_task_2] 100036 D - 0xfffffe000c24bb00 [acpi_task_1] 100035 D - 0xfffffe000c24bb00 [acpi_task_0] 100033 D - 
0xfffffe000c24be00 [kqueue taskq] 100032 D - 0xfffffe000c24c000 [ffs_trim taskq] 100029 D - 0xfffffe000c20c780 [thread taskq] 100024 D - 0xfffffe000c07fb80 [firmware taskq] 100000 D sched 0xffffffff8123d280 [swapper] 895 892 873 0 Z perl Setting this: # sysctl dev.mpt.0.debug=255 and doing a dd again from a disk on that controller prints this onto the console: SCSI IO Request @ 0xffffff80003046f0 Chain Offset 0x00 MsgFlags 0x00 MsgContext 0x000201c5 Bus: 0 TargetID 0 SenseBufferLength 32 LUN: 0x0 Control 0x02000200 READ ORDEREDQ DataLength 0x00000200 SenseBufAddr 0x0c678be0 CDB[0:6] 08 00 00 00 01 00 SE64 0xffffff87ffd33a30: Addr=0x000000070cc08400 FlagsLength=0xd3000200 64_BIT_ADDRESSING LAST_ELEMENT END_OF_BUFFER END_OF_LIST mpt0: Send Request 453 (c678a00): mpt0: 00000000 00002006 000201c5 00000000 00000000 02000200 00000008 00000001 mpt0: 00000000 00000000 00000200 0c678be0 d3000200 0cc08400 00000007 ffffffff mpt0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff mpt0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff mpt0: enter mpt_intr mpt0: Context Reply: 0x000201c5 mpt0: exit mpt_intr And dd freezes. Alltrace from a couple of stuck processes: Tracing command dd pid 36971 tid 101570 td 0xfffffe001efce000 sched_switch() at sched_switch+0x115 mi_switch() at mi_switch+0x186 sleepq_wait() at sleepq_wait+0x42 _sleep() at _sleep+0x379 bwait() at bwait+0x64 physio() at physio+0x1c8 devfs_read_f() at devfs_read_f+0x90 dofileread() at dofileread+0xa1 kern_readv() at kern_readv+0x6c sys_read() at sys_read+0x64 amd64_syscall() at amd64_syscall+0x540 Xfast_syscall() at Xfast_syscall+0xf7 --- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800916c8c, rsp = 0x7fffffffd658, rbp = 0x7fffffffd6b0 --- Tracing command zfs pid 36686 tid 101593 td 0xfffffe001ecb3900 sched_switch() at sched_switch+0x115 mi_switch() at mi_switch+0x186 sleepq_wait() at sleepq_wait+0x42 _cv_wait() at _cv_wait+0x112 zio_wait() at zio_wait+0x61 dbuf_read() at dbuf_read+0x5e5 dmu_buf_hold() at dmu_buf_hold+0xe0 zap_lockdir() at zap_lockdir+0x58 zap_cursor_retrieve() at zap_cursor_retrieve+0x19b dmu_snapshot_list_next() at dmu_snapshot_list_next+0xaf zfs_ioc_snapshot_list_next() at zfs_ioc_snapshot_list_next+0x101 zfsdev_ioctl() at zfsdev_ioctl+0xe6 devfs_ioctl_f() at devfs_ioctl_f+0x7b kern_ioctl() at kern_ioctl+0x106 sys_ioctl() at sys_ioctl+0xfd amd64_syscall() at amd64_syscall+0x540 Xfast_syscall() at Xfast_syscall+0xf7 --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x801be2c2c, rsp = 0x7fffffff8938, rbp = 0x4000 --- From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 15:50:29 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E6593623 for ; Mon, 22 Oct 2012 15:50:29 +0000 (UTC) (envelope-from cpghost@cordula.ws) Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com [209.85.223.182]) by mx1.freebsd.org (Postfix) with ESMTP id A4FF08FC18 for ; Mon, 22 Oct 2012 15:50:29 +0000 (UTC) Received: by mail-ie0-f182.google.com with SMTP id k10so5134644iea.13 for ; Mon, 22 Oct 2012 08:50:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-originating-ip:in-reply-to:references:date :message-id:subject:from:to:cc:content-type:x-gm-message-state; bh=yIjXKTarFdBxifQM3LOfX/mp+GIkoG4BZ1Djhrtu8yo=; b=JFwwckGaH1ztGyicnf8LzjG7Yg/bNe0WW4Gsc/lVV82XAF2tdRL+yZhMldw0xC+iGr 
wtagJTXzOM2t1QqnwsOw/yGQ7+VsUpa1oCqgT80X2bw6WTvgw2wSOPzB7QG4XtvNqTjX FlRkAbjkSteZ/BID3qGL9frm8jVv4i3CrGLtOyDAESGZ5AsPWgk+cL5KM5BRmpqeC7Wq ydMCzpHDME5uZksOzlwx1Ctp7d4nT2EW++bR3GoN/kH2pAQxuz7kvJdKJ4JyksJsxumY IOdATmHn/2pyOd5Z8mkVdwJoNUsZWjFUFlc6w2sAxuJFyyvyMqsOHuBlXRUVDL4i5QUS b/Ig== MIME-Version: 1.0 Received: by 10.50.237.70 with SMTP id va6mr16581462igc.8.1350921024406; Mon, 22 Oct 2012 08:50:24 -0700 (PDT) Received: by 10.64.49.67 with HTTP; Mon, 22 Oct 2012 08:50:24 -0700 (PDT) X-Originating-IP: [93.221.178.135] In-Reply-To: References: Date: Mon, 22 Oct 2012 17:50:24 +0200 Message-ID: Subject: Re: MPSAFE VFS -- update From: "C. P. Ghost" To: attilio@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Gm-Message-State: ALoCoQnPwVShdOIjVm554araT8+7Mw8DrGj5Wa3FsV+t5CKoF98CZD/I1Bik1Nw97GmXcdbdM4jb Cc: FreeBSD FS , Peter Holm , freebsd-current@freebsd.org, Konstantin Belousov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 15:50:30 -0000 On Thu, Oct 18, 2012 at 7:51 PM, Attilio Rao wrote: > Following the plan reported here: > http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS > > We are now at the state where all non-MPSAFE filesystems are > disconnected by the three. Sad to see PortalFS go. You've served us well here. :-( -- Cordula's Web. http://www.cordula.ws/ From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 16:31:30 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9D123978 for ; Mon, 22 Oct 2012 16:31:30 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-la0-f54.google.com (mail-la0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id DB4738FC18 for ; Mon, 22 Oct 2012 16:31:29 +0000 (UTC) Received: by mail-la0-f54.google.com with SMTP id e12so2231631lag.13 for ; Mon, 22 Oct 2012 09:31:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=XXBHAW1ll6BlmZWkSWd3EIZRI7lVtPa19Oob/Xqcpb8=; b=HQQaYqNWwUUoctK2mnojseAHZl1uIgWl50EQAqVxK9z0GlpwnTeO9JTWZW8cSKsWQf TOMieJ6OwYkYAkSQijRFMpp141FXH3fCfv/ZKwMLh2cwZL01/NwHeO59OB8H3IsabOP0 2d3tAslJS5WzM2h/c9m7FSudMZc34O17mKLCj/AFWCWTPrfQrKD8FnhrgKLYi4U18UbI DqN+pNAHsqRMeXE3PnkT1qiHfEQ2S3MC25KE+13X9Gwy0kRlDYgUwO7LPg9daPrE9o67 15R8vkeYTKb87RoXtcp8wf+IEcdR/lG+D1RC1PqN5NAmvVBZVtv+4WBQWCfQsQdPKGsJ nJYw== MIME-Version: 1.0 Received: by 10.112.103.7 with SMTP id fs7mr3904695lbb.25.1350923488782; Mon, 22 Oct 2012 09:31:28 -0700 (PDT) Received: by 10.114.24.66 with HTTP; Mon, 22 Oct 2012 09:31:28 -0700 (PDT) In-Reply-To: References: <1350698905.86715.33.camel@btw.pki2.com> <1350711509.86715.59.camel@btw.pki2.com> <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <5084F6D5.5080400@digsys.bg> Date: Mon, 22 Oct 2012 09:31:28 -0700 Message-ID: Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) From: Freddie Cash To: Daniel Kalchev Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Mon, 22 Oct 2012 16:31:30 -0000 On Mon, Oct 22, 2012 at 6:47 AM, Freddie Cash wrote: > I'll double-check when I get to work, but I'm pretty sure it's 10.something. mpt(4) on alpha has firmware 1.5.20.0. mps(4) on beta has firmware 09.00.00.00, driver 14.00.00.01-fbsd. mps(4) on omega has firmware 10.00.02.00, driver 14.00.00.01-fbsd. Hope that helps. -- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 18:02:00 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 28818AE4 for ; Mon, 22 Oct 2012 18:02:00 +0000 (UTC) (envelope-from gtodd@wawanesa.iciti.ca) Received: from mailout.easydns.com (mailout.easydns.com [64.68.200.141]) by mx1.freebsd.org (Postfix) with ESMTP id CB3598FC12 for ; Mon, 22 Oct 2012 18:01:59 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mailout.easydns.com (Postfix) with ESMTP id 593BDE000 for ; Mon, 22 Oct 2012 13:55:16 -0400 (EDT) X-Virus-Scanned: Debian amavisd-new at mailout.easydns.com Received: from mailout.easydns.com ([127.0.0.1]) by localhost (mailout.easydns.vpn [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id HA0Dryy2vBqQ for ; Mon, 22 Oct 2012 13:55:15 -0400 (EDT) Received: from wawanesa.iciti.ca (CPE0080c8f208a5-CM001371173cf8.cpe.net.cable.rogers.com [99.246.61.82]) by mailout.easydns.com (Postfix) with ESMTPA id 7FB12E03D for ; Mon, 22 Oct 2012 13:55:15 -0400 (EDT) Received: (qmail 63255 invoked from network); 22 Oct 2012 13:54:34 -0400 Received: from unknown (HELO wawanesa.iciti.ca) (192.168.2.4) by wawanesa.iciti.ca with ESMTP; 22 Oct 2012 13:54:34 -0400 Date: Mon, 22 Oct 2012 13:54:34 -0400 (EDT) From: Graham Todd To: "C. P. Ghost" Subject: Re: MPSAFE VFS -- update In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: attilio@freebsd.org, FreeBSD FS , freebsd-current@freebsd.org, Konstantin Belousov , Peter Holm X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 18:02:00 -0000 On Mon, 22 Oct 2012, C. P. Ghost wrote: > On Thu, Oct 18, 2012 at 7:51 PM, Attilio Rao wrote: >> Following the plan reported here: >> http://wiki.freebsd.org/NONMPSAFE_DEORBIT_VFS >> >> We are now at the state where all non-MPSAFE filesystems are >> disconnected by the three. > > Sad to see PortalFS go. You've served us well here. :-( It is kinda neat. How do/did you use it? portalfs seems to have around a 10th of the LoC than MSDosFS and the changes made to make MSDosFS MPSAFE are documented on the above page: is it just that portalfs is too obscure to have a maintainer? 
Thanks to all for the work on MPSAFE-ty ;-) From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 22:24:37 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 13586F03 for ; Mon, 22 Oct 2012 22:24:37 +0000 (UTC) (envelope-from lists@hurricane-ridge.com) Received: from mail-vb0-f54.google.com (mail-vb0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id A67A78FC0A for ; Mon, 22 Oct 2012 22:24:36 +0000 (UTC) Received: by mail-vb0-f54.google.com with SMTP id v11so4471901vbm.13 for ; Mon, 22 Oct 2012 15:24:35 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-originating-ip:in-reply-to:references:date :message-id:subject:from:to:cc:content-type:x-gm-message-state; bh=3LfGdrwtdP0S9GHyRkCNhLp/QfMqqtvXz7jC4hG29n0=; b=ja5lbWGNuQRvyf33SNy8dQKw2B83LFQNeKxjXS9n4SFDbLtLbARMdmVmiPoeNDZRnX HDlMJN93Mb9IamNmJB5O7NNKSlUkM0qmBGdOrpn852NRAlmoIr4tuy9iLNHdTFnCunK9 BYz3ERpSXFqnOCGDSlAQyhHpoCP7p1y2bTyJwylsr+dEkmE38y+tV4RASDRumqBqyXJA bTBOMLKg3P7SBvxF/Vvj72CXgHHWbvwBrNq+2YydCAL0dte6Pu2L+tH6iaM5A4w3uc2U 9PuDSCasLi4Zqu4XEFghrmGKuwpLzRHzhrBfhD4A01h5fQjFWQtm9CRod/yBuhnw+grB OLhA== MIME-Version: 1.0 Received: by 10.221.1.75 with SMTP id np11mr17532871vcb.56.1350944675337; Mon, 22 Oct 2012 15:24:35 -0700 (PDT) Received: by 10.58.189.163 with HTTP; Mon, 22 Oct 2012 15:24:35 -0700 (PDT) X-Originating-IP: [146.129.249.238] In-Reply-To: References: <1350698905.86715.33.camel@btw.pki2.com> <1350711509.86715.59.camel@btw.pki2.com> <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <5084F6D5.5080400@digsys.bg> Date: Mon, 22 Oct 2012 15:24:35 -0700 Message-ID: Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) From: Andrew Leonard To: Freddie Cash Content-Type: text/plain; charset=ISO-8859-1 X-Gm-Message-State: ALoCoQkbfJUUCq4jK7GUfz7KpPebdDn3JAl0xmnKvSkj8C1IxvpsrnlTKOqMDe0xG4HYcfa/KEir Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 22:24:37 -0000 On Mon, Oct 22, 2012 at 9:31 AM, Freddie Cash wrote: > On Mon, Oct 22, 2012 at 6:47 AM, Freddie Cash wrote: >> I'll double-check when I get to work, but I'm pretty sure it's 10.something. > > mpt(4) on alpha has firmware 1.5.20.0. > > mps(4) on beta has firmware 09.00.00.00, driver 14.00.00.01-fbsd. > > mps(4) on omega has firmware 10.00.02.00, driver 14.00.00.01-fbsd. There was an assertion a couple months back that for mps cards, the firmware version must match the driver version, and that v14 was a substantial improvement: http://www.mail-archive.com/freebsd-stable@freebsd.org/msg122546.html We're on 13 for both firmware and drivers on all our machines that haven't had problems; a couple that are on firmware 09.00.00.00 with driver 13.00.00.00 have had problems under heavy I/O (haven't been able to get downtime to upgrade the firmware yet). -Andy > Hope that helps. 
> > -- > Freddie Cash > fjwcash@gmail.com > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 23:17:45 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0E92A26F for ; Mon, 22 Oct 2012 23:17:45 +0000 (UTC) (envelope-from freebsd@pki2.com) Received: from btw.pki2.com (btw.pki2.com [IPv6:2001:470:a:6fd::2]) by mx1.freebsd.org (Postfix) with ESMTP id C42848FC0C for ; Mon, 22 Oct 2012 23:17:44 +0000 (UTC) Received: from [127.0.0.1] (localhost [127.0.0.1]) by btw.pki2.com (8.14.5/8.14.5) with ESMTP id q9MNHDw2074881; Mon, 22 Oct 2012 16:17:13 -0700 (PDT) (envelope-from freebsd@pki2.com) Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) From: Dennis Glatting To: Daniel Kalchev In-Reply-To: <5084F6D5.5080400@digsys.bg> References: <1350698905.86715.33.camel@btw.pki2.com> <1350711509.86715.59.camel@btw.pki2.com> <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <5084F6D5.5080400@digsys.bg> Content-Type: text/plain; charset="ISO-8859-1" Date: Mon, 22 Oct 2012 16:17:13 -0700 Message-ID: <1350947833.86715.137.camel@btw.pki2.com> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-yoursite-MailScanner-Information: Dennis Glatting X-yoursite-MailScanner-ID: q9MNHDw2074881 X-yoursite-MailScanner: Found to be clean X-MailScanner-From: freebsd@pki2.com Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 23:17:45 -0000 On Mon, 2012-10-22 at 10:33 +0300, Daniel Kalchev wrote: > > On 21.10.12 09:52, Freddie Cash wrote: > > [...] > > All three run without any serious issues. The only issues we've had > > are 3, maybe 4, situations where I've tried to destroy multi-TB > > filesystems without enough RAM in the machine. We're now running a > > minimum of 32 GB of RAM with 64 GB in one box. > > What is the firmware on your LSI2008 controllers? > > I am having weird situation with one server that has LSI2008, on > 9-stable and all SSD configuration. One or two of the drives would drop > off the bus for no reason sometimes few times a day and because the > current driver ignores bus reset, someone has to physically remove and > re-insert the drives for them to come back. Real pain. > My firmware version is 12.00.00.00 -- perhaps it is buggy? > mps0: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd I upgraded in an attempt to solve the problem. 
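As an aside on the version numbers being traded here: the firmware and driver strings are the ones the mps(4) driver prints when it probes the adapter, so they can be checked on a running box without a reboot. A minimal check, assuming a single controller attached as mps0 (adjust the unit number on multi-controller machines, and read /var/run/dmesg.boot if the message buffer has since rotated):

  # grep -E 'mps[0-9]+: Firmware' /var/run/dmesg.boot
  mps0: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd

Some driver revisions also export the same strings in the dev.mps.<unit> sysctl tree; running "sysctl dev.mps.0" will show whether the revision you have does.
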
From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 23:19:20 2012 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B6151456 for ; Mon, 22 Oct 2012 23:19:20 +0000 (UTC) (envelope-from freebsd@pki2.com) Received: from btw.pki2.com (btw.pki2.com [IPv6:2001:470:a:6fd::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2D68C8FC16 for ; Mon, 22 Oct 2012 23:19:20 +0000 (UTC) Received: from [127.0.0.1] (localhost [127.0.0.1]) by btw.pki2.com (8.14.5/8.14.5) with ESMTP id q9MNJDhR074980; Mon, 22 Oct 2012 16:19:13 -0700 (PDT) (envelope-from freebsd@pki2.com) Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) From: Dennis Glatting To: Attila Nagy In-Reply-To: <50856322.9070307@fsn.hu> References: <1350698905.86715.33.camel@btw.pki2.com> <1350711509.86715.59.camel@btw.pki2.com> <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <50856322.9070307@fsn.hu> Content-Type: text/plain; charset="ISO-8859-1" Date: Mon, 22 Oct 2012 16:19:13 -0700 Message-ID: <1350947953.86715.138.camel@btw.pki2.com> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-yoursite-MailScanner-Information: Dennis Glatting X-yoursite-MailScanner-ID: q9MNJDhR074980 X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-SpamScore: 1 X-MailScanner-From: freebsd@pki2.com Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 23:19:20 -0000 On Mon, 2012-10-22 at 17:15 +0200, Attila Nagy wrote: > Hi, > > On 10/21/2012 02:10 AM, Dennis Glatting wrote: > > I chosen the LSI2008 chip set because the code was donated by LSI, and > > they therefore demonstrated interest in supporting their products under > > FreeBSD, and that chip set is found in a lot of places, notably > > Supermicro boards. Additionally, there were stories of success on the > > lists for several boards. That said, I have received private email from > > others expressing frustration with ZFS and the "hang" problems, which I > > believe are also the LSI chips. > > > I have a Sun X4540, which shows similar symptoms. It has some (6) > on-board LSI 1068E SAS controllers with 1.27.02.00-IT firmware (latest > from Sun/Oracle) and 48 SATA disks. > It runs stable/9@r240134. > > Currently the machine does a resilver on its 48 disk pool (heavy IO > happens), which stops periodically. > I've set up watchdogd with a command of "ls /data" (the pool is mounted > there). It doesn't restart the machine when the IO freezes, because the > command always succeeds (coming from cache I guess). > But if something wants to touch the disks, it stucks in D state. > > zpool status shows: > scan: resilver in progress since Sun Oct 21 15:40:50 2012 > 3.16T scanned out of 13.8T at 26.4M/s, 117h45m to go > 133G resilvered, 22.82% done > And the estimated time grows constantly. > gstat shows no IO. > I've had this problem too. 
> If I issue an ls -R /data, it gets stuck: > root 36217 0.0 0.0 14380 1800 3 D+ 4:45PM 0:00.00 ls -R /data/ > # procstat -k 36217 > PID TID COMM TDNAME KSTACK > 36217 101469 ls - mi_switch sleepq_wait > _cv_wait zio_wait dbuf_read dbuf_findbp dbuf_hold_impl dbuf_hold > dmu_buf_hold zap_lockdir zap_cursor_retrieve zfs_freebsd_readdir > kern_getdirentries sys_getdirentries amd64_syscall Xfast_syscall > > Also, a dd on any of the disks waits forever, without reading a single byte: > root 36570 0.0 0.0 9876 1356 4 DL+ 4:46PM 0:00.00 dd > if=/dev/da0 of=/dev/null > # procstat -k 36570 > PID TID COMM TDNAME KSTACK > 36570 101489 dd - mi_switch sleepq_wait > _sleep bwait physio devfs_read_f dofileread kern_readv sys_read > amd64_syscall Xfast_syscall > > > Camcontrol works: > > # camcontrol devlist > at scbus0 target 0 lun 0 (pass0,da0) > at scbus0 target 1 lun 0 (pass1,da1) > at scbus0 target 2 lun 0 (pass2,da2) > at scbus0 target 3 lun 0 (pass3,da3) > at scbus0 target 4 lun 0 (pass4,da4) > at scbus0 target 5 lun 0 (pass5,da5) > at scbus0 target 6 lun 0 (pass6,da6) > at scbus0 target 7 lun 0 (pass7,da7) > at scbus1 target 0 lun 0 (pass8,da8) > at scbus1 target 1 lun 0 (pass9,da9) > at scbus1 target 2 lun 0 (pass10,da10) > at scbus1 target 3 lun 0 (pass11,da11) > at scbus1 target 4 lun 0 (pass12,da12) > at scbus1 target 5 lun 0 (pass13,da13) > at scbus1 target 6 lun 0 (pass14,da14) > at scbus1 target 7 lun 0 (pass15,da15) > at scbus2 target 0 lun 0 (pass16,da16) > at scbus2 target 1 lun 0 (pass17,da17) > at scbus2 target 2 lun 0 (pass18,da18) > at scbus2 target 3 lun 0 (pass19,da19) > at scbus2 target 4 lun 0 (pass20,da20) > at scbus2 target 5 lun 0 (pass21,da21) > at scbus2 target 6 lun 0 (pass22,da22) > at scbus2 target 7 lun 0 (pass23,da23) > at scbus3 target 0 lun 0 (pass24,da24) > at scbus3 target 1 lun 0 (pass25,da25) > at scbus3 target 2 lun 0 (pass26,da26) > at scbus3 target 3 lun 0 (pass27,da27) > at scbus3 target 4 lun 0 (pass28,da28) > at scbus3 target 5 lun 0 (pass29,da29) > at scbus3 target 6 lun 0 (pass30,da30) > at scbus3 target 7 lun 0 (pass31,da31) > at scbus4 target 0 lun 0 (pass32,da32) > at scbus4 target 1 lun 0 (pass33,da33) > at scbus4 target 2 lun 0 (pass34,da34) > at scbus4 target 3 lun 0 (pass35,da35) > at scbus4 target 4 lun 0 (pass36,da36) > at scbus4 target 5 lun 0 (pass37,da37) > at scbus4 target 6 lun 0 (pass38,da38) > at scbus4 target 7 lun 0 (pass39,da39) > at scbus5 target 0 lun 0 (pass40,da40) > at scbus5 target 1 lun 0 (pass41,da41) > at scbus5 target 2 lun 0 (pass42,da42) > at scbus5 target 3 lun 0 (pass43,da43) > at scbus5 target 4 lun 0 (pass44,da44) > at scbus5 target 5 lun 0 (pass45,da45) > at scbus5 target 6 lun 0 (pass46,da46) > at scbus5 target 7 lun 0 (pass47,da47) > > # camcontrol tags da0 > (pass0:mpt0:0:0:0): device openings: 255 > > Also works (I guess it doesn't touch the disks): > # zfs list > NAME USED AVAIL REFER MOUNTPOINT > logpool 13.1T 7.17T 507K /data > logpool/jail 7.08G 7.17T 7.08G /data/jail > logpool/logs 13.1T 7.17T 3.40T /data/jail/logvm/logs > logpool/logs/OTHER 9.24T 7.17T 2.36T /data/jail/logvm/logs/OTHER > > But this doesn't: > root 36686 0.0 0.0 33384 2512 5 D+ 4:49PM 0:00.00 zfs list > -t snapshot > # procstat -k 36686 > PID TID COMM TDNAME KSTACK > 36686 101593 zfs - mi_switch sleepq_wait > _cv_wait zio_wait dbuf_read dmu_buf_hold zap_lockdir zap_cursor_retrieve > dmu_snapshot_list_next zfs_ioc_snapshot_list_next zfsdev_ioctl > devfs_ioctl_f kern_ioctl sys_ioctl amd64_syscall Xfast_syscall > > Entering into the debugger: > 
KDB: enter: sysctl debug.kdb.enter > [ thread pid 36959 tid 101484 ] > Stopped at kdb_enter+0x3b: movq $0,0x95ab72(%rip) > db> ps > pid ppid pgrp uid state wmesg wchan cmd > 36959 1769 36959 0 R+ CPU 0 sysctl > 36691 919 919 0 S sbwait 0xfffffe009d752144 perl > 36686 36677 36686 0 D+ zio->io_ 0xfffffe001ccb7d70 zfs > 36677 36208 36677 0 Ss+ pause 0xfffffe009d0030a0 csh > 36570 36567 36570 0 DL+ physrd 0xffffff87005a2980 dd > 36567 36208 36567 0 Ss+ pause 0xfffffe00115c4540 csh > 36217 36209 36217 0 D+ zio->io_ 0xfffffe001c2b2320 ls > 36209 36208 36209 0 Ss+ pause 0xfffffe022c8aa0a0 csh > 36208 36207 36208 0 Ss select 0xfffffe0665c92e40 screen > 36207 1782 36207 0 S+ pause 0xfffffe009d0010a0 screen > 32921 883 873 0 DL cbwait 0xfffffe000f7f7848 camcontrol > 1782 1780 1782 0 Ss+ pause 0xfffffe009d4559e0 csh > 1780 897 1780 0 Ss select 0xfffffe001d546740 sshd > 1776 1774 1776 0 Ss+ ttyin 0xfffffe001c02a4a8 csh > 1774 897 1774 0 Ss select 0xfffffe001cb4d0c0 sshd > 1769 1767 1769 0 Ss+ pause 0xfffffe001191a540 csh > 1767 897 1767 0 Ss select 0xfffffe000fd72bc0 sshd > 1079 1 1079 0 Ss+ ttyin 0xfffffe000c82c4a8 getty > 1078 1 1078 0 Ss+ ttyin 0xfffffe000c82c8a8 getty > 1077 1 1077 0 Ss+ ttyin 0xfffffe000c82cca8 getty > 1076 1 1076 0 Ss+ ttyin 0xfffffe000c82d0a8 getty > 1075 1 1075 0 Ss+ ttyin 0xfffffe000c82d4a8 getty > 1074 1 1074 0 Ss+ ttyin 0xfffffe000c82d8a8 getty > 1073 1 1073 0 Ss+ ttyin 0xfffffe000c82dca8 getty > 1072 1 1072 0 Ss+ ttyin 0xfffffe000c82f0a8 getty > 919 1 919 0 Ss select 0xfffffe000f5ac940 perl > 907 1 907 0 Ss nanslp 0xffffffff81244f08 cron > 903 1 903 25 Ss pause 0xfffffe001125e0a0 sendmail > 900 1 900 0 Ss select 0xfffffe001d549340 sendmail > 897 1 897 0 Ss select 0xfffffe001d546cc0 sshd > 892 884 873 0 S piperd 0xfffffe001e940888 fghack > 884 878 873 0 S wait 0xfffffe000fdee000 sh > 883 879 873 0 S piperd 0xfffffe022c08b000 perl > 879 875 873 0 S select 0xfffffe001ca6a8c0 supervise > 878 875 873 0 S select 0xfffffe000fd73d40 supervise > 876 1 873 0 S piperd 0xfffffe001e9c5b60 readproctitle > 875 1 873 0 S nanslp 0xffffffff81244f08 svscan > 870 868 867 123 S select 0xfffffe000fd934c0 ntpd > 868 867 867 123 S select 0xfffffe001ca68e40 ntpd > 867 1 867 0 Ss select 0xfffffe000fddd740 ntpd > 796 0 0 0 DL mdwait 0xfffffe000f52a000 [md2] > 774 1 774 53 Ss (threaded) named > 101524 S kqread 0xfffffe00115dd100 named > 101523 S uwait 0xfffffe000fde5200 named > 101522 S uwait 0xfffffe00110ce680 named > 101521 S uwait 0xfffffe000fda0300 named > 101520 S uwait 0xfffffe000fddd380 named > 101519 S uwait 0xfffffe001198ca00 named > 101518 S uwait 0xfffffe000fd58880 named > 101517 S uwait 0xfffffe000fd7ab80 named > 101516 S uwait 0xfffffe000f80e480 named > 101515 S uwait 0xfffffe000f80f400 named > 101501 S sigwait 0xfffffe00110dd000 named > 751 750 751 0 Ss select 0xfffffe001d549440 syslog-ng > 750 1 749 0 S wait 0xfffffe000c8144a0 syslog-ng > 612 608 608 64 S bpf 0xfffffe001ca94800 pflogd > 608 1 608 0 Ss sbwait 0xfffffe001eb4ae8c pflogd > 605 0 0 0 DL pftm 0xffffffff817547a0 [pfpurge] > 78 0 0 0 DL (threaded) [zfskern] > 101459 D spa->spa 0xfffffe0011462680 > [txg_thread_enter] > 101458 D tx->tx_q 0xfffffe001b199230 > [txg_thread_enter] > 100122 D l2arc_fe 0xffffffff8173ebc0 > [l2arc_feed_thread] > 100121 D arc_recl 0xffffffff8172ed20 > [arc_reclaim_thread] > 59 0 0 0 DL mdwait 0xfffffe000f521000 [md1] > 47 0 0 0 DL mdwait 0xfffffe000f523800 [md0] > 24 0 0 0 DL sdflush 0xffffffff812a6158 [softdepflush] > 23 0 0 0 DL syncer 0xffffffff812928c0 [syncer] > 22 0 0 0 DL vlruwt 
0xfffffe000c80d000 [vnlru] > 21 0 0 0 DL psleep 0xffffffff81292348 [bufdaemon] > 20 0 0 0 DL pgzero 0xffffffff812b019c [pagezero] > 19 0 0 0 DL psleep 0xffffffff812af368 [vmdaemon] > 18 0 0 0 DL psleep 0xffffffff812af32c [pagedaemon] > 17 0 0 0 DL ccb_scan 0xffffffff811ff260 [xpt_thrd] > 16 0 0 0 DL idle 0xffffff8001df3000 [mpt_recovery5] > 9 0 0 0 DL idle 0xffffff8001dde000 [mpt_recovery4] > 8 0 0 0 DL idle 0xffffff8001dc9000 [mpt_recovery3] > 7 0 0 0 DL idle 0xffffff8001daa000 [mpt_recovery2] > 6 0 0 0 DL idle 0xffffff8001d95000 [mpt_recovery1] > 5 0 0 0 DL idle 0xffffff8001d80000 [mpt_recovery0] > 15 0 0 0 DL (threaded) [usb] > 100048 D - 0xffffff8001d73e18 [usbus1] > 100047 D - 0xffffff8001d73dc0 [usbus1] > 100046 D - 0xffffff8001d73d68 [usbus1] > 100045 D - 0xffffff8001d73d10 [usbus1] > 100043 D - 0xffffff8001d6b460 [usbus0] > 100042 D - 0xffffff8001d6b408 [usbus0] > 100041 D - 0xffffff8001d6b3b0 [usbus0] > 100040 D - 0xffffff8001d6b358 [usbus0] > 4 0 0 0 DL ctl_work 0xffffff8000a41000 [ctl_thrd] > 14 0 0 0 DL - 0xffffffff81243ba4 [yarrow] > 3 0 0 0 DL crypto_r 0xffffffff812a4ae0 [crypto > returns] > 2 0 0 0 DL crypto_w 0xffffffff812a4aa0 [crypto] > 13 0 0 0 DL (threaded) [geom] > 100023 D - 0xffffffff8123d030 [g_down] > 100022 D - 0xffffffff8123d028 [g_up] > 100021 D - 0xffffffff8123d018 [g_event] > 12 0 0 0 RL (threaded) [intr] > 100065 I [swi0: uart] > 100063 I [irq293: mpt5] > 100061 I [irq292: mpt4] > 100059 I [irq291: mpt3] > 100055 I [irq274: mpt2] > 100053 I [irq273: mpt1] > 100051 I [irq272: mpt0] > 100044 I [irq22: ehci0] > 100039 I [irq21: ohci0] > 100034 I [swi2: cambio] > 100031 I [swi6: task queue] > 100030 I [swi6: Giant taskq] > 100028 I [swi5: +] > 100020 I [swi1: netisr 0] > 100019 I [swi4: clock] > 100018 I [swi4: clock] > 100017 I [swi4: clock] > 100016 I [swi4: clock] > 100015 I [swi4: clock] > 100014 I [swi4: clock] > 100013 I [swi4: clock] > 100012 RunQ [swi4: clock] > 100011 I [swi3: vm] > 11 0 0 0 RL (threaded) [idle] > 100010 Run CPU 7 [idle: cpu7] > 100009 Run CPU 6 [idle: cpu6] > 100008 Run CPU 5 [idle: cpu5] > 100007 Run CPU 4 [idle: cpu4] > 100006 Run CPU 3 [idle: cpu3] > 100005 Run CPU 2 [idle: cpu2] > 100004 Run CPU 1 [idle: cpu1] > 100003 CanRun [idle: cpu0] > 1 0 1 0 SLs wait 0xfffffe000c068940 [init] > 10 0 0 0 DL audit_wo 0xffffffff812a50d0 [audit] > 0 0 0 0 DLs (threaded) [kernel] > 101463 D - 0xfffffe000fddab00 [zil_clean] > 101462 D - 0xfffffe000fd6a800 [zil_clean] > 101461 D - 0xfffffe000fdf6180 [zil_clean] > 101460 D - 0xfffffe001d546600 [zil_clean] > 101457 D - 0xfffffe000f359e00 [zfs_vn_rele_taskq] > 101456 D - 0xfffffe001198d080 [zio_ioctl_intr] > 101455 D - 0xfffffe001cb4fa80 [zio_ioctl_issue] > 101454 D - 0xfffffe000ffbf380 [zio_claim_intr] > 101453 D - 0xfffffe00110cf580 [zio_claim_issue] > 101452 D - 0xfffffe00110cf880 [zio_free_intr] > 101451 D - 0xfffffe000ffc1b80 [zio_free_issue_99] > 101450 D - 0xfffffe000ffc1b80 [zio_free_issue_98] > 101449 D - 0xfffffe000ffc1b80 [zio_free_issue_97] > 101448 D - 0xfffffe000ffc1b80 [zio_free_issue_96] > 101447 D - 0xfffffe000ffc1b80 [zio_free_issue_95] > 101446 D - 0xfffffe000ffc1b80 [zio_free_issue_94] > 101445 D - 0xfffffe000ffc1b80 [zio_free_issue_93] > 101444 D - 0xfffffe000ffc1b80 [zio_free_issue_92] > 101443 D - 0xfffffe000ffc1b80 [zio_free_issue_91] > 101442 D - 0xfffffe000ffc1b80 [zio_free_issue_90] > 101441 D - 0xfffffe000ffc1b80 [zio_free_issue_89] > 101440 D - 0xfffffe000ffc1b80 [zio_free_issue_88] > 101439 D - 0xfffffe000ffc1b80 [zio_free_issue_87] > 101438 D - 0xfffffe000ffc1b80 
[zio_free_issue_86] > 101437 D - 0xfffffe000ffc1b80 [zio_free_issue_85] > 101436 D - 0xfffffe000ffc1b80 [zio_free_issue_84] > 101435 D - 0xfffffe000ffc1b80 [zio_free_issue_83] > 101434 D - 0xfffffe000ffc1b80 [zio_free_issue_82] > 101433 D - 0xfffffe000ffc1b80 [zio_free_issue_81] > 101432 D - 0xfffffe000ffc1b80 [zio_free_issue_80] > 101431 D - 0xfffffe000ffc1b80 [zio_free_issue_79] > 101430 D - 0xfffffe000ffc1b80 [zio_free_issue_78] > 101429 D - 0xfffffe000ffc1b80 [zio_free_issue_77] > 101428 D - 0xfffffe000ffc1b80 [zio_free_issue_76] > 101427 D - 0xfffffe000ffc1b80 [zio_free_issue_75] > 101426 D - 0xfffffe000ffc1b80 [zio_free_issue_74] > 101425 D - 0xfffffe000ffc1b80 [zio_free_issue_73] > 101424 D - 0xfffffe000ffc1b80 [zio_free_issue_72] > 101423 D - 0xfffffe000ffc1b80 [zio_free_issue_71] > 101422 D - 0xfffffe000ffc1b80 [zio_free_issue_70] > 101421 D - 0xfffffe000ffc1b80 [zio_free_issue_69] > 101420 D - 0xfffffe000ffc1b80 [zio_free_issue_68] > 101419 D - 0xfffffe000ffc1b80 [zio_free_issue_67] > 101418 D - 0xfffffe000ffc1b80 [zio_free_issue_66] > 101417 D - 0xfffffe000ffc1b80 [zio_free_issue_65] > 101416 D - 0xfffffe000ffc1b80 [zio_free_issue_64] > 101415 D - 0xfffffe000ffc1b80 [zio_free_issue_63] > 101414 D - 0xfffffe000ffc1b80 [zio_free_issue_62] > 101413 D - 0xfffffe000ffc1b80 [zio_free_issue_61] > 101412 D - 0xfffffe000ffc1b80 [zio_free_issue_60] > 101411 D - 0xfffffe000ffc1b80 [zio_free_issue_59] > 101410 D - 0xfffffe000ffc1b80 [zio_free_issue_58] > 101409 D - 0xfffffe000ffc1b80 [zio_free_issue_57] > 101408 D - 0xfffffe000ffc1b80 [zio_free_issue_56] > 101407 D - 0xfffffe000ffc1b80 [zio_free_issue_55] > 101406 D - 0xfffffe000ffc1b80 [zio_free_issue_54] > 101405 D - 0xfffffe000ffc1b80 [zio_free_issue_53] > 101404 D - 0xfffffe000ffc1b80 [zio_free_issue_52] > 101403 D - 0xfffffe000ffc1b80 [zio_free_issue_51] > 101402 D - 0xfffffe000ffc1b80 [zio_free_issue_50] > 101401 D - 0xfffffe000ffc1b80 [zio_free_issue_49] > 101400 D - 0xfffffe000ffc1b80 [zio_free_issue_48] > 101399 D - 0xfffffe000ffc1b80 [zio_free_issue_47] > 101398 D - 0xfffffe000ffc1b80 [zio_free_issue_46] > 101397 D - 0xfffffe000ffc1b80 [zio_free_issue_45] > 101396 D - 0xfffffe000ffc1b80 [zio_free_issue_44] > 101395 D - 0xfffffe000ffc1b80 [zio_free_issue_43] > 101394 D - 0xfffffe000ffc1b80 [zio_free_issue_42] > 101393 D - 0xfffffe000ffc1b80 [zio_free_issue_41] > 101392 D - 0xfffffe000ffc1b80 [zio_free_issue_40] > 101391 D - 0xfffffe000ffc1b80 [zio_free_issue_39] > 101390 D - 0xfffffe000ffc1b80 [zio_free_issue_38] > 101389 D - 0xfffffe000ffc1b80 [zio_free_issue_37] > 101388 D - 0xfffffe000ffc1b80 [zio_free_issue_36] > 101387 D - 0xfffffe000ffc1b80 [zio_free_issue_35] > 101386 D - 0xfffffe000ffc1b80 [zio_free_issue_34] > 101385 D - 0xfffffe000ffc1b80 [zio_free_issue_33] > 101384 D - 0xfffffe000ffc1b80 [zio_free_issue_32] > 101383 D - 0xfffffe000ffc1b80 [zio_free_issue_31] > 100569 D - 0xfffffe000ffc1b80 [zio_free_issue_30] > 100567 D - 0xfffffe000ffc1b80 [zio_free_issue_29] > 100565 D - 0xfffffe000ffc1b80 [zio_free_issue_28] > 100560 D - 0xfffffe000ffc1b80 [zio_free_issue_27] > 100554 D - 0xfffffe000ffc1b80 [zio_free_issue_26] > 100553 D - 0xfffffe000ffc1b80 [zio_free_issue_25] > 100547 D - 0xfffffe000ffc1b80 [zio_free_issue_24] > 100545 D - 0xfffffe000ffc1b80 [zio_free_issue_23] > 100542 D - 0xfffffe000ffc1b80 [zio_free_issue_22] > 100539 D - 0xfffffe000ffc1b80 [zio_free_issue_21] > 100536 D - 0xfffffe000ffc1b80 [zio_free_issue_20] > 100530 D - 0xfffffe000ffc1b80 [zio_free_issue_19] > 100487 D - 0xfffffe000ffc1b80 
[zio_free_issue_18] > 100415 D - 0xfffffe000ffc1b80 [zio_free_issue_17] > 100413 D - 0xfffffe000ffc1b80 [zio_free_issue_16] > 100407 D - 0xfffffe000ffc1b80 [zio_free_issue_15] > 100403 D - 0xfffffe000ffc1b80 [zio_free_issue_14] > 100400 D - 0xfffffe000ffc1b80 [zio_free_issue_13] > 100393 D - 0xfffffe000ffc1b80 [zio_free_issue_12] > 100391 D - 0xfffffe000ffc1b80 [zio_free_issue_11] > 100387 D - 0xfffffe000ffc1b80 [zio_free_issue_10] > 100386 D - 0xfffffe000ffc1b80 [zio_free_issue_9] > 100385 D - 0xfffffe000ffc1b80 [zio_free_issue_8] > 100384 D - 0xfffffe000ffc1b80 [zio_free_issue_7] > 100383 D - 0xfffffe000ffc1b80 [zio_free_issue_6] > 100379 D - 0xfffffe000ffc1b80 [zio_free_issue_5] > 100372 D - 0xfffffe000ffc1b80 [zio_free_issue_4] > 100367 D - 0xfffffe000ffc1b80 [zio_free_issue_3] > 100366 D - 0xfffffe000ffc1b80 [zio_free_issue_2] > 100361 D - 0xfffffe000ffc1b80 [zio_free_issue_1] > 100360 D - 0xfffffe000ffc1b80 [zio_free_issue_0] > 100359 D - 0xfffffe001ca67280 [zio_write_intr_high] > 100358 D - 0xfffffe001ca67280 [zio_write_intr_high] > 100357 D - 0xfffffe001ca67280 [zio_write_intr_high] > 100354 D - 0xfffffe001ca67280 [zio_write_intr_high] > 100353 D - 0xfffffe001ca67280 [zio_write_intr_high] > 100349 D - 0xfffffe000fd72700 [zio_write_intr_7] > 100348 D - 0xfffffe000fd72700 [zio_write_intr_6] > 100345 D - 0xfffffe000fd72700 [zio_write_intr_5] > 100343 D - 0xfffffe000fd72700 [zio_write_intr_4] > 100342 D - 0xfffffe000fd72700 [zio_write_intr_3] > 100341 D - 0xfffffe000fd72700 [zio_write_intr_2] > 100340 D - 0xfffffe000fd72700 [zio_write_intr_1] > 100339 D - 0xfffffe000fd72700 [zio_write_intr_0] > 100337 D - 0xfffffe001196ce00 [zio_write_issue_hig] > 100336 D - 0xfffffe001196ce00 [zio_write_issue_hig] > 100334 D - 0xfffffe001196ce00 [zio_write_issue_hig] > 100330 D - 0xfffffe001196ce00 [zio_write_issue_hig] > 100327 D - 0xfffffe001196ce00 [zio_write_issue_hig] > 100324 D - 0xfffffe00110cfb00 [zio_write_issue_7] > 100322 D - 0xfffffe00110cfb00 [zio_write_issue_6] > 100321 D - 0xfffffe00110cfb00 [zio_write_issue_5] > 100316 D - 0xfffffe00110cfb00 [zio_write_issue_4] > 100314 D - 0xfffffe00110cfb00 [zio_write_issue_3] > 100312 D - 0xfffffe00110cfb00 [zio_write_issue_2] > 100311 D - 0xfffffe00110cfb00 [zio_write_issue_1] > 100307 D - 0xfffffe00110cfb00 [zio_write_issue_0] > 100306 D - 0xfffffe000ffbfc80 [zio_read_intr_7] > 100305 D - 0xfffffe000ffbfc80 [zio_read_intr_6] > 100303 D - 0xfffffe000ffbfc80 [zio_read_intr_5] > 100300 D - 0xfffffe000ffbfc80 [zio_read_intr_4] > 100298 D - 0xfffffe000ffbfc80 [zio_read_intr_3] > 100297 D - 0xfffffe000ffbfc80 [zio_read_intr_2] > 100293 D - 0xfffffe000ffbfc80 [zio_read_intr_1] > 100292 D - 0xfffffe000ffbfc80 [zio_read_intr_0] > 100291 D - 0xfffffe00110cf000 [zio_read_issue_7] > 100289 D - 0xfffffe00110cf000 [zio_read_issue_6] > 100288 D - 0xfffffe00110cf000 [zio_read_issue_5] > 100286 D - 0xfffffe00110cf000 [zio_read_issue_4] > 100282 D - 0xfffffe00110cf000 [zio_read_issue_3] > 100281 D - 0xfffffe00110cf000 [zio_read_issue_2] > 100280 D - 0xfffffe00110cf000 [zio_read_issue_1] > 100278 D - 0xfffffe00110cf000 [zio_read_issue_0] > 100275 D - 0xfffffe001113b500 [zio_null_intr] > 100273 D - 0xfffffe001196c800 [zio_null_issue] > 100120 D - 0xfffffe0011370300 [system_taskq_7] > 100119 D - 0xfffffe0011370300 [system_taskq_6] > 100118 D - 0xfffffe0011370300 [system_taskq_5] > 100117 D - 0xfffffe0011370300 [system_taskq_4] > 100116 D - 0xfffffe0011370300 [system_taskq_3] > 100115 D - 0xfffffe0011370300 [system_taskq_2] > 100114 D - 0xfffffe0011370300 
[system_taskq_1] > 100113 D - 0xfffffe0011370300 [system_taskq_0] > 100066 D - 0xfffffe000f239a80 [mca taskq] > 100058 D - 0xfffffe000c69b900 [nfe3 taskq] > 100057 D - 0xfffffe000c698480 [nfe2 taskq] > 100050 D - 0xfffffe000c620400 [nfe1 taskq] > 100049 D - 0xfffffe000c61b500 [nfe0 taskq] > 100037 D - 0xfffffe000c24bb00 [acpi_task_2] > 100036 D - 0xfffffe000c24bb00 [acpi_task_1] > 100035 D - 0xfffffe000c24bb00 [acpi_task_0] > 100033 D - 0xfffffe000c24be00 [kqueue taskq] > 100032 D - 0xfffffe000c24c000 [ffs_trim taskq] > 100029 D - 0xfffffe000c20c780 [thread taskq] > 100024 D - 0xfffffe000c07fb80 [firmware taskq] > 100000 D sched 0xffffffff8123d280 [swapper] > 895 892 873 0 Z perl > > Setting this: > # sysctl dev.mpt.0.debug=255 > and doing a dd again from a disk on that controller prints this onto the > console: > SCSI IO Request @ 0xffffff80003046f0 > Chain Offset 0x00 > MsgFlags 0x00 > MsgContext 0x000201c5 > Bus: 0 > TargetID 0 > SenseBufferLength 32 > LUN: 0x0 > Control 0x02000200 READ ORDEREDQ > DataLength 0x00000200 > SenseBufAddr 0x0c678be0 > CDB[0:6] 08 00 00 00 01 00 > SE64 0xffffff87ffd33a30: Addr=0x000000070cc08400 FlagsLength=0xd3000200 > 64_BIT_ADDRESSING LAST_ELEMENT END_OF_BUFFER END_OF_LIST > mpt0: Send Request 453 (c678a00): > mpt0: 00000000 00002006 000201c5 00000000 00000000 02000200 00000008 > 00000001 > mpt0: 00000000 00000000 00000200 0c678be0 d3000200 0cc08400 00000007 > ffffffff > mpt0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff > ffffffff > mpt0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff > ffffffff > mpt0: enter mpt_intr > mpt0: Context Reply: 0x000201c5 > mpt0: exit mpt_intr > > And dd freezes. > > Alltrace from a couple of stuck processes: > Tracing command dd pid 36971 tid 101570 td 0xfffffe001efce000 > sched_switch() at sched_switch+0x115 > mi_switch() at mi_switch+0x186 > sleepq_wait() at sleepq_wait+0x42 > _sleep() at _sleep+0x379 > bwait() at bwait+0x64 > physio() at physio+0x1c8 > devfs_read_f() at devfs_read_f+0x90 > dofileread() at dofileread+0xa1 > kern_readv() at kern_readv+0x6c > sys_read() at sys_read+0x64 > amd64_syscall() at amd64_syscall+0x540 > Xfast_syscall() at Xfast_syscall+0xf7 > --- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800916c8c, rsp = > 0x7fffffffd658, rbp = 0x7fffffffd6b0 --- > > Tracing command zfs pid 36686 tid 101593 td 0xfffffe001ecb3900 > sched_switch() at sched_switch+0x115 > mi_switch() at mi_switch+0x186 > sleepq_wait() at sleepq_wait+0x42 > _cv_wait() at _cv_wait+0x112 > zio_wait() at zio_wait+0x61 > dbuf_read() at dbuf_read+0x5e5 > dmu_buf_hold() at dmu_buf_hold+0xe0 > zap_lockdir() at zap_lockdir+0x58 > zap_cursor_retrieve() at zap_cursor_retrieve+0x19b > dmu_snapshot_list_next() at dmu_snapshot_list_next+0xaf > zfs_ioc_snapshot_list_next() at zfs_ioc_snapshot_list_next+0x101 > zfsdev_ioctl() at zfsdev_ioctl+0xe6 > devfs_ioctl_f() at devfs_ioctl_f+0x7b > kern_ioctl() at kern_ioctl+0x106 > sys_ioctl() at sys_ioctl+0xfd > amd64_syscall() at amd64_syscall+0x540 > Xfast_syscall() at Xfast_syscall+0xf7 > --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x801be2c2c, rsp = > 0x7fffffff8938, rbp = 0x4000 --- > From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 23:29:19 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 343CC718 for ; Mon, 22 Oct 2012 23:29:19 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com 
[209.85.217.182]) by mx1.freebsd.org (Postfix) with ESMTP id A10F38FC0C for ; Mon, 22 Oct 2012 23:29:18 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id b5so2593575lbd.13 for ; Mon, 22 Oct 2012 16:29:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=3KV9O2rGBPoL7e2DCkop+avNsTp4RRNqIqAVaiuy1is=; b=UgqRc5F2YmR/amnFiVSEoUAqtAeSdlpcmxyJpYErnaRHkbaW4axm4+oNXa3A2csgvU njIYtnNrMMcu1Eoiqh7LrMgcAtpY7Sg0u6vevv0dAUIKW4kKHgkw2r4iIh+hcqqLASgF xhgmurzbWelLebpVBrFWN9Khi7WBZL+oKx3oOgPXAmrLwc+HHiY1iGbOvRKxfc9A4+7A cQkUYxJV6qUo0thPpNQ6W2Q9dkutZQxmK8PoQ53SabhOA2t3+Jcy73yJqW0GjpnprZwy gu3qTYEXOdQnh+cE1YwIupfWCRvujNEqdKvaM083Eq1PqMetfeIYUz5qBicpKQmCM6cz ENFQ== MIME-Version: 1.0 Received: by 10.112.17.169 with SMTP id p9mr4267220lbd.9.1350948557226; Mon, 22 Oct 2012 16:29:17 -0700 (PDT) Received: by 10.114.24.66 with HTTP; Mon, 22 Oct 2012 16:29:17 -0700 (PDT) In-Reply-To: References: <1350698905.86715.33.camel@btw.pki2.com> <1350711509.86715.59.camel@btw.pki2.com> <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <5084F6D5.5080400@digsys.bg> Date: Mon, 22 Oct 2012 16:29:17 -0700 Message-ID: Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) From: Freddie Cash To: Andrew Leonard Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 23:29:19 -0000 On Mon, Oct 22, 2012 at 3:24 PM, Andrew Leonard wrote: > On Mon, Oct 22, 2012 at 9:31 AM, Freddie Cash wrote: >> On Mon, Oct 22, 2012 at 6:47 AM, Freddie Cash wrote: >>> I'll double-check when I get to work, but I'm pretty sure it's 10.something. >> >> mpt(4) on alpha has firmware 1.5.20.0. >> >> mps(4) on beta has firmware 09.00.00.00, driver 14.00.00.01-fbsd. >> >> mps(4) on omega has firmware 10.00.02.00, driver 14.00.00.01-fbsd. > > There was an assertion a couple months back that for mps cards, the > firmware version must match the driver version, and that v14 was a > substantial improvement: > > http://www.mail-archive.com/freebsd-stable@freebsd.org/msg122546.html Interesting. I remember seeing that thread, but not paying attention to it. I think I'll leave things the way they are, as we have not had any driver, disk, stability issues with these boxes (knock wood). 
-- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Mon Oct 22 23:29:28 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 74DE8783 for ; Mon, 22 Oct 2012 23:29:28 +0000 (UTC) (envelope-from freebsd@pki2.com) Received: from btw.pki2.com (btw.pki2.com [IPv6:2001:470:a:6fd::2]) by mx1.freebsd.org (Postfix) with ESMTP id 19EF58FC0A for ; Mon, 22 Oct 2012 23:29:28 +0000 (UTC) Received: from [127.0.0.1] (localhost [127.0.0.1]) by btw.pki2.com (8.14.5/8.14.5) with ESMTP id q9MNT53e078980; Mon, 22 Oct 2012 16:29:05 -0700 (PDT) (envelope-from freebsd@pki2.com) Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) From: Dennis Glatting To: Freddie Cash In-Reply-To: References: <1350698905.86715.33.camel@btw.pki2.com> <1350711509.86715.59.camel@btw.pki2.com> <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <5084F6D5.5080400@digsys.bg> Content-Type: text/plain; charset="ISO-8859-1" Date: Mon, 22 Oct 2012 16:29:05 -0700 Message-ID: <1350948545.86715.147.camel@btw.pki2.com> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-yoursite-MailScanner-Information: Dennis Glatting X-yoursite-MailScanner-ID: q9MNT53e078980 X-yoursite-MailScanner: Found to be clean X-MailScanner-From: freebsd@pki2.com Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Oct 2012 23:29:28 -0000 On Mon, 2012-10-22 at 09:31 -0700, Freddie Cash wrote: > On Mon, Oct 22, 2012 at 6:47 AM, Freddie Cash wrote: > > I'll double-check when I get to work, but I'm pretty sure it's 10.something. > > mpt(4) on alpha has firmware 1.5.20.0. > > mps(4) on beta has firmware 09.00.00.00, driver 14.00.00.01-fbsd. > > mps(4) on omega has firmware 10.00.02.00, driver 14.00.00.01-fbsd. > > Hope that helps. > Because one of the RAID1 OS disks failed (System #1), I replaced both disks and downgraded to stable/8. Two hours ago I submitted a job. I noticed on boot smartd issued warnings about disk firmware, which I'll update this coming weekend, unless the system hangs before then. I first want to see if that system will also hang under 8.3. I have noticed a looping "ls" of the target ZFS directory is MUCH snappier under 8.3 than 9.x. My CentOS 6.3 ZFS-on-Linux system (System #3) is crunching along (24 hours now). This system under stable/9 would previously spontaneously reboot whenever I sent a ZFS data set too it. System #2 is hung (stable/9). 
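One practical note on the smartd firmware warnings mentioned above: the disk firmware revision (as opposed to the HBA firmware) can be recorded before an update with smartctl, from the same sysutils/smartmontools port that provides smartd. A minimal sketch, with da0 as a placeholder for whatever disks camcontrol devlist reports:

  # smartctl -i /dev/da0 | egrep -i 'model|serial|firmware|revision'

For SATA drives this prints a Firmware Version line that can be checked against the vendor's release notes; SAS drives report an equivalent revision field. Capturing this for every member of a pool before and after flashing makes it easy to confirm which disks actually took the update.
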
From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 01:55:46 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 821) id 7890B840; Tue, 23 Oct 2012 01:55:46 +0000 (UTC) Date: Tue, 23 Oct 2012 01:55:46 +0000 From: John To: Dennis Glatting Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) Message-ID: <20121023015546.GA60182@FreeBSD.org> References: <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <5084F6D5.5080400@digsys.bg> <1350948545.86715.147.camel@btw.pki2.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1350948545.86715.147.camel@btw.pki2.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 01:55:46 -0000 ----- Dennis Glatting's Original Message ----- > On Mon, 2012-10-22 at 09:31 -0700, Freddie Cash wrote: > > On Mon, Oct 22, 2012 at 6:47 AM, Freddie Cash wrote: > > > I'll double-check when I get to work, but I'm pretty sure it's 10.something. > > > > mpt(4) on alpha has firmware 1.5.20.0. > > > > mps(4) on beta has firmware 09.00.00.00, driver 14.00.00.01-fbsd. > > > > mps(4) on omega has firmware 10.00.02.00, driver 14.00.00.01-fbsd. > > > > Hope that helps. > > > > Because one of the RAID1 OS disks failed (System #1), I replaced both > disks and downgraded to stable/8. Two hours ago I submitted a job. > > I noticed on boot smartd issued warnings about disk firmware, which I'll > update this coming weekend, unless the system hangs before then. > > I first want to see if that system will also hang under 8.3. I have > noticed a looping "ls" of the target ZFS directory is MUCH snappier > under 8.3 than 9.x. > > My CentOS 6.3 ZFS-on-Linux system (System #3) is crunching along (24 > hours now). This system under stable/9 would previously spontaneously > reboot whenever I sent a ZFS data set too it. > > System #2 is hung (stable/9). Hi Folks, I just caught up on this thread and thought I toss out some info. I have a number of systems running 9-stable (with some local patches), none running 8. The basic architecture is: http://people.freebsd.org/~jwd/zfsnfsserver.jpg LSI SAS 9201-16e 6G/s 16-Port SATA+SAS Host Bus Adapter All cards are up-to-date on firmware: mps0: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd mps1: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd mps2: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd All drives a geom multipath configured. Currently, these systems are used almost exclusively for iSCSI. I have seen no lockups that I can track down to the driver. I have seen one lockup which I did post about (received no feedback) where I believe an active I/O from istgt is interupted by an ABRT from the client which causes a lock-up. This one is hard to replicate and on the do-do list. It is worth noting that a few drives were replaced early on due to various I/O problems and one with what might be considered a lockup. As has been noted elsewhere, watching gstat can be informative. Also make sure cables are firmly plugged in.. Seems obvious, I know.. 
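To make the gstat suggestion concrete, a hung controller has a fairly distinctive signature there. A minimal way to watch just the disks (the filter regex is only an example; drop -f to see every GEOM provider):

  # gstat -a -f 'da[0-9]+$'

Under normal load the ops/s and %busy columns move constantly. In the hangs described earlier in this thread the transfer counters typically fall to zero on every disk behind the affected HBA while the L(q) column (outstanding requests) stays stuck above zero, which points at commands lost in the controller or driver rather than at a single slow drive.
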
I did recently commit a small patch to current to handle a case where if the system has greater than 255 disks, the 255th disk is hidden/masked by the mps initiator id that is statically coded into the driver. I think it might be good to document a bit better the type of mount and test job/test stream running when/if you see a lockup. I am not currently using NFS so there is an entire code-path I am not exercising. Servers are 12 processor, 96GB Ram. The highest cpu load I've seen on the systems is about 800%. All networking is 10G via Chelsio cards - configured to use isr maxthread 6 with a defaultqlimit of 4096. I have seen no problems in this area. Hope this helps a bit. Happy to answer questions. Cheers, John ps: With all that's been said above, it's worth noting that a correctly configured client makes a huge difference. From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 04:06:59 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1F268DA0 for ; Tue, 23 Oct 2012 04:06:59 +0000 (UTC) (envelope-from jurgen.weber@theiconic.com.au) Received: from exprod6og114.obsmtp.com (exprod6og114.obsmtp.com [64.18.1.33]) by mx1.freebsd.org (Postfix) with SMTP id 7F4AD8FC17 for ; Tue, 23 Oct 2012 04:06:57 +0000 (UTC) Received: from mail-da0-f72.google.com ([209.85.210.72]) (using TLSv1) by exprod6ob114.postini.com ([64.18.5.12]) with SMTP ID DSNKUIYX274cQGULBIAr+YM53BTb7SF2LiLw@postini.com; Mon, 22 Oct 2012 21:06:58 PDT Received: by mail-da0-f72.google.com with SMTP id r28so5915769daj.7 for ; Mon, 22 Oct 2012 21:06:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type:x-gm-message-state; bh=ioSTgOHdiEQrcj4UTGeBXZCV0B+hVPQGkDkbsV5PAfw=; b=QVC+uW6gwS+lHPnZepBNzLXjJr9PGBHw8HvDyrs623jZWJJlIFqxyQlsoS7SRGBEIN yqFyIXaHsos/zm8tEx+AKl+3FsbPYL/8o3IRiFD3i81n/ZeMDBGynVbICfSm1fcpn4EG kExdQIoQtYZpZaQQ/NRrS+7Ol3Ma4o3bDhLf/7KbkpV9WtxufDFkVpWY4I5wPzse0Ji7 dM3Z2HCN1OsFolBKt6ecXRp4wSVu0ZUXpJN/K6lUSTbwo2Yw4JS2ab/lkyt92HfGWH5/ dNEPZ4G0fIhKtcBNNU6dP7/3dW4soRnnybI0ZniAmL7rdhZqA6xJHQk2D3w0Igfhojqk 78bQ== Received: by 10.66.87.132 with SMTP id ay4mr1685424pab.67.1350964902476; Mon, 22 Oct 2012 21:01:42 -0700 (PDT) Received: by 10.66.87.132 with SMTP id ay4mr1685410pab.67.1350964902333; Mon, 22 Oct 2012 21:01:42 -0700 (PDT) Received: from [172.20.24.157] ([202.126.107.170]) by mx.google.com with ESMTPS id sa2sm6993493pbc.4.2012.10.22.21.01.40 (version=SSLv3 cipher=OTHER); Mon, 22 Oct 2012 21:01:41 -0700 (PDT) Message-ID: <508616A2.60609@theiconic.com.au> Date: Tue, 23 Oct 2012 15:01:38 +1100 From: =?ISO-8859-1?Q?J=FCrgen_Weber?= User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:16.0) Gecko/20121010 Thunderbird/16.0.1 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: mfi0 timeout error zfs boot mount problem References: <508090E8.4010300@theiconic.com.au> <5081CE05.1010108@theiconic.com.au> <50830EA3.6020001@theiconic.com.au> <508471E0.9010805@theiconic.com.au> <5084A19A.5050905@theiconic.com.au> In-Reply-To: <5084A19A.5050905@theiconic.com.au> X-Gm-Message-State: ALoCoQm7WZ3KQxiW18qiy4oakG4a69sQndFrhrqgPwcer3RFkagTAD7ebtPAZPPUIx60n9BSsk2xzE+lk5wmRZGv7RXIURvDJ3glAK+IDsJg8ej3Rkh6UFh6f2iLCC23tUAueFDRQogpvHnmN9CmGI5vSEg20MidkQ== Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 04:06:59 -0000 Hi Well, while this thread has been very quiet I have resolved my issues. With patience changing: kern.maxfiles=5000000 kern.maxvnodes=5000000 vfs.zfs.zil_disable="1" vfs.zfs.prefetch_disable="1" vfs.zfs.txg.timeout="5" The above solves the system unable to import or mount the pool. I have also gone into the Card settings BIOS and changed under advanced settings "Forward Read" to "none". This solves the mfi0 timeout. Once I had the system up, I then added a l2arc cache via a usb2 SSD HDD. I then shut hte system down and it took 3 hours to shut down.. but it eventually did. When I turned the system back on again, it booted as normal. The lesson learnt?! Do not turn on deduping on a large file system unless you have a lot of RAM or L2ARC! I would say 32GB of RAM/L2ARC for every 10TB as a good rule of thumb, if not... double. Thanks Jurgen On 22/10/12 12:30, Jürgen Weber wrote: > Some more updates! > > on the bootloader I have also tried: > kern.maxfiles=5000000 > kern.maxvnodes=5000000 > > I have also gone into the Card settings BIOS and changed under > advanced settings "Forward Read" to "none". > > Now the systems gets to > > "Trying to mount root from zfs:tank/root []..... " and then after > maybe 1 to 5 minutes the next couple of lines load like its working! > > eg: > "Setting hostuuid: xxxxx" > "Setting hostid: xxxxxx" > "Entropy harvesting:interrupts ethernet point_to_point kickstart" > "Starting file system checks:" > "Mounting local file systems:." > > and stops. I have had the machine on my desk all morning observing it > and I can see the disk access is going crazy,, it is doing something. > > I have found this article: > > http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe > > I have a 15TB file system which has dedup on from the start (10TB. I > feel its trying to load the DDT and its going to swap/there is not > enough RAM (only have 16GB's). Hopefully my 64GB RAM upgrade is enough. > > Thanks > > Jurgen > > > > On 22/10/12 09:06, Jürgen Weber wrote: >> This is still a problem for me, is anyone there? :) >> >> I have tried the following at the bootime loader. >> >> vfs.zfs.zil_disable="1" >> vfs.zfs.prefetch_disable="1" >> vfs.zfs.txg.timeout="5" >> >> Any other suggestions on how to get this zpool to import and mount >> again? >> >> Thanks >> >> On 21/10/12 07:50, Jurgen Weber wrote: >>> Hi >>> >>> Lastly, is there a way at boot time, some sysctl's or something I >>> can set to bring zfs to a minimalistic state? Turn off features, etc >>> to get this to mount? >>> >>> Any ideas appreciated. >>> >>> Thanks >>> >>> Jurgen >>> On 20/10/2012 9:02 AM, Jurgen Weber wrote: >>>> Guys >>>> >>>> Some more details on this, some insight would be greatly appreciated. >>>> >>>> As my day wore on trying to get this zpool to import or mount I >>>> have learnt a few things. I think over time this issue has came >>>> about as more and more data was added to the file systems. >>>> >>>> Some further details: >>>> >>>> Its a 8 disk raidz pool that the system boots from as well. The >>>> disk are all 2TB. >>>> The server has 16GB Of RAM, I notcied the day before this happen >>>> the server was struggling with its RAM griding to a halt and >>>> dumping its RAM. 
>>>> The issue is not hardware because I found another server (same one) >>>> swapped the harddrives out took another 8GB of RAM and I have the >>>> same problem. >>>> The main data file systems have dedup and gzip compression on. >>>> >>>> I have booted from open/Oracle Solars 11 adn attempted to import >>>> and the Solaris live CD will not import either. In the Solaris >>>> system the disk detach from the system. >>>> >>>> I get the feeling that ZFS is hitting some root limit when >>>> attempting to mount and its not finishing the job. >>>> >>>> Thanks >>>> >>>> Jurgen >>>> >>>> On 19/10/2012 10:29 AM, Jürgen Weber wrote: >>>>> Team >>>>> >>>>> I have googled around for a solution and I see a lot of posts >>>>> about firmware versions and patches for FreeBSD 8.*. >>>>> >>>>> I have a FreeBSD 9.1rc1 system, which was beta1 orginally and has >>>>> been running for months. >>>>> >>>>> Now it will not boot, I get the following: >>>>> >>>>> "Trying to mount root from zfs:tank/root []..... >>>>> mfi0: COMMAND 0Xffffff8000cb83530 TIMEOUT AFTER xxx SECONDS >>>>> (this just repeats). >>>>> >>>>> I have not seen this error before during normal runtime, _only_ >>>>> during boot. >>>>> >>>>> Originally when I had the problem I could boot off a USB stick >>>>> (9.1beta1 or rc1), run a 'zpool import -f tank' and it would work >>>>> on the livecd. Rebooting and the main system would work. >>>>> >>>>> This time this work around does not work for me. When I am on the >>>>> USB stick I can run a 'zpool import' and all of the disk are >>>>> recognised, the pool is recognised and the file system is healthy. >>>>> >>>>> The Card is a H700 PERC, with 12.10.3 firmware in a Dell R515. >>>>> Running FreeBSD 9.1-RC1, latest zfs and zpool versions. >>>>> >>>>> I have tried disabling the cache (mfiutil cache xxx disable). I >>>>> have also gone into the Card settings and changed under advanced >>>>> settings "adaptive forward read" to "read only". >>>>> >>>>> Any help, appreciated. 
>>>>> >>>>> Thanks >>>>> >>>> >>> >> > > -- > Jürgen Weber > > Systems Engineer > IT Infrastructure Team Leader > > THE ICONIC | Ejurgen.weber@theiconic.com.au |www.theiconic.com.au -- Jürgen Weber Systems Engineer IT Infrastructure Team Leader THE ICONIC | E jurgen.weber@theiconic.com.au | www.theiconic.com.au From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 04:40:07 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B8BCD63B; Tue, 23 Oct 2012 04:40:07 +0000 (UTC) (envelope-from gezeala@gmail.com) Received: from mail-pb0-f54.google.com (mail-pb0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id 7E90E8FC0A; Tue, 23 Oct 2012 04:40:07 +0000 (UTC) Received: by mail-pb0-f54.google.com with SMTP id rp8so146637pbb.13 for ; Mon, 22 Oct 2012 21:40:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=xvmqflH7cII84zLfWRltXU4VY69l2zPLHamxz1Nk5gw=; b=03/DM8omXJgsRMnkyaKFTyYVHQLS9PEqOdLd93G3DfBG/Yy6GUOo4JXbr3x5pN0h+F Q+8Qv3PxmZleY7eNOEGWMdCCSLdDHM/SAEVrTmc3Agghxiz7BD75NtKifAuNFSLpU8b/ EKbhRzeuMBEvd4kAzOmnyouRQqK0KlHB5yJjy22TrpM9KiQAlUYqa0NyoGtx/qZ0YlFp 2e6pTIBryb9xegGj6Le7YTu1XuHMfss/cG2HptrNA8ZbnqFCp4J6K1d/we1bw17BKgzm L3S9tP7++E1e0FnYORUw3bBcIxyP5gxMYJAPHnzYz3KL7wG6BPOQbUrNlJcKmTEjSi0F sXBA== Received: by 10.68.131.40 with SMTP id oj8mr37134044pbb.40.1350967206925; Mon, 22 Oct 2012 21:40:06 -0700 (PDT) MIME-Version: 1.0 Received: by 10.68.74.69 with HTTP; Mon, 22 Oct 2012 21:39:26 -0700 (PDT) In-Reply-To: <20121023015546.GA60182@FreeBSD.org> References: <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <5084F6D5.5080400@digsys.bg> <1350948545.86715.147.camel@btw.pki2.com> <20121023015546.GA60182@FreeBSD.org> From: =?ISO-8859-1?Q?Gezeala_M=2E_Bacu=F1o_II?= Date: Mon, 22 Oct 2012 21:39:26 -0700 Message-ID: Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) To: John Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 04:40:07 -0000 On Mon, Oct 22, 2012 at 6:55 PM, John wrote: > ----- Dennis Glatting's Original Message ----- >> On Mon, 2012-10-22 at 09:31 -0700, Freddie Cash wrote: >> > On Mon, Oct 22, 2012 at 6:47 AM, Freddie Cash wrote: >> > > I'll double-check when I get to work, but I'm pretty sure it's 10.something. >> > >> > mpt(4) on alpha has firmware 1.5.20.0. >> > >> > mps(4) on beta has firmware 09.00.00.00, driver 14.00.00.01-fbsd. >> > >> > mps(4) on omega has firmware 10.00.02.00, driver 14.00.00.01-fbsd. >> > >> > Hope that helps. >> > >> >> Because one of the RAID1 OS disks failed (System #1), I replaced both >> disks and downgraded to stable/8. Two hours ago I submitted a job. >> >> I noticed on boot smartd issued warnings about disk firmware, which I'll >> update this coming weekend, unless the system hangs before then. >> >> I first want to see if that system will also hang under 8.3. I have >> noticed a looping "ls" of the target ZFS directory is MUCH snappier >> under 8.3 than 9.x. 
>> >> My CentOS 6.3 ZFS-on-Linux system (System #3) is crunching along (24 >> hours now). This system under stable/9 would previously spontaneously >> reboot whenever I sent a ZFS data set too it. >> >> System #2 is hung (stable/9). > > Hi Folks, > > I just caught up on this thread and thought I toss out some info. > > I have a number of systems running 9-stable (with some local patches), > none running 8. > > The basic architecture is: http://people.freebsd.org/~jwd/zfsnfsserver.jpg > > LSI SAS 9201-16e 6G/s 16-Port SATA+SAS Host Bus Adapter > > All cards are up-to-date on firmware: > > mps0: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd > mps1: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd > mps2: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd > > All drives a geom multipath configured. > > Currently, these systems are used almost exclusively for iSCSI. > > I have seen no lockups that I can track down to the driver. I have seen > one lockup which I did post about (received no feedback) where I believe > an active I/O from istgt is interupted by an ABRT from the client which > causes a lock-up. This one is hard to replicate and on the do-do list. > > It is worth noting that a few drives were replaced early on > due to various I/O problems and one with what might be considered a > lockup. As has been noted elsewhere, watching gstat can be informative. > Also make sure cables are firmly plugged in.. Seems obvious, I know.. > > I did recently commit a small patch to current to handle a case > where if the system has greater than 255 disks, the 255th disk > is hidden/masked by the mps initiator id that is statically coded into > the driver. > > I think it might be good to document a bit better the type of > mount and test job/test stream running when/if you see a lockup. > I am not currently using NFS so there is an entire code-path I > am not exercising. > > Servers are 12 processor, 96GB Ram. The highest cpu load I've > seen on the systems is about 800%. > > All networking is 10G via Chelsio cards - configured to > use isr maxthread 6 with a defaultqlimit of 4096. I have seen > no problems in this area. > > Hope this helps a bit. Happy to answer questions. > > Cheers, > John > > ps: With all that's been said above, it's worth noting that a correctly > configured client makes a huge difference. > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" Hello, I remember seeing your diagram while looking up multipath. Have you used this device (or similar) : http://www.lsi.com/channel/products/storagecomponents/Pages/LSISAS6160Switch.aspx ? If yes, have you setup multipath with it? Thanks. 
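For anyone wanting to reproduce the geom multipath arrangement John describes, the labelling itself is done with gmultipath(8) from the base system once both SAS paths expose the drive as separate da(4) devices. A minimal sketch with placeholder device names; confirm that the two nodes really are the same physical disk (for example by comparing serial numbers) before writing a label:

  # kldload geom_multipath
  # gmultipath label -v DISK01 /dev/da0 /dev/da48
  # gmultipath status

The metadata is written to the provider's last sector, after which the disk appears as /dev/multipath/DISK01; that single node is what gets handed to zpool, so a failed path is absorbed by GEOM instead of being seen by ZFS.
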
From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 09:04:41 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 80BB2CAF for ; Tue, 23 Oct 2012 09:04:41 +0000 (UTC) (envelope-from prvs=1643314351=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 042E58FC12 for ; Tue, 23 Oct 2012 09:04:40 +0000 (UTC) Received: from r2d2 ([188.220.16.49]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50000802422.msg for ; Tue, 23 Oct 2012 10:04:32 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Tue, 23 Oct 2012 10:04:32 +0100 (not processed: message from valid local sender) X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=1643314351=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk X-MDaemon-Deliver-To: freebsd-fs@freebsd.org Message-ID: From: "Steven Hartland" To: =?iso-8859-1?Q?J=FCrgen_Weber?= , References: <508090E8.4010300@theiconic.com.au> <5081CE05.1010108@theiconic.com.au> <50830EA3.6020001@theiconic.com.au> <508471E0.9010805@theiconic.com.au> <5084A19A.5050905@theiconic.com.au> <508616A2.60609@theiconic.com.au> Subject: Re: mfi0 timeout error zfs boot mount problem Date: Tue, 23 Oct 2012 10:04:20 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=response Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 09:04:41 -0000 The usb shutdown issue could well be down to the new cleanup code in 9.x and a bad driver. You could try sysctl hw.usb.no_shutdown_wait=1 to see if that helps did here, although be aware this may have other implications. Regards Steve ----- Original Message ----- From: "Jürgen Weber" To: Sent: Tuesday, October 23, 2012 5:01 AM Subject: Re: mfi0 timeout error zfs boot mount problem Hi Well, while this thread has been very quiet I have resolved my issues. With patience changing: kern.maxfiles=5000000 kern.maxvnodes=5000000 vfs.zfs.zil_disable="1" vfs.zfs.prefetch_disable="1" vfs.zfs.txg.timeout="5" The above solves the system unable to import or mount the pool. I have also gone into the Card settings BIOS and changed under advanced settings "Forward Read" to "none". This solves the mfi0 timeout. Once I had the system up, I then added a l2arc cache via a usb2 SSD HDD. I then shut hte system down and it took 3 hours to shut down.. but it eventually did. When I turned the system back on again, it booted as normal. The lesson learnt?! Do not turn on deduping on a large file system unless you have a lot of RAM or L2ARC! I would say 32GB of RAM/L2ARC for every 10TB as a good rule of thumb, if not... double. Thanks Jurgen On 22/10/12 12:30, Jürgen Weber wrote: > Some more updates! > > on the bootloader I have also tried: > kern.maxfiles=5000000 > kern.maxvnodes=5000000 > > I have also gone into the Card settings BIOS and changed under advanced settings "Forward Read" to "none". > > Now the systems gets to > > "Trying to mount root from zfs:tank/root []..... " and then after maybe 1 to 5 minutes the next couple of lines load like its > working! 
> > eg: > "Setting hostuuid: xxxxx" > "Setting hostid: xxxxxx" > "Entropy harvesting:interrupts ethernet point_to_point kickstart" > "Starting file system checks:" > "Mounting local file systems:." > > and stops. I have had the machine on my desk all morning observing it and I can see the disk access is going crazy,, it is doing > something. > > I have found this article: > > http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe > > I have a 15TB file system which has dedup on from the start (10TB. I feel its trying to load the DDT and its going to swap/there > is not enough RAM (only have 16GB's). Hopefully my 64GB RAM upgrade is enough. > > Thanks > > Jurgen > > > > On 22/10/12 09:06, Jürgen Weber wrote: >> This is still a problem for me, is anyone there? :) >> >> I have tried the following at the bootime loader. >> >> vfs.zfs.zil_disable="1" >> vfs.zfs.prefetch_disable="1" >> vfs.zfs.txg.timeout="5" >> >> Any other suggestions on how to get this zpool to import and mount again? >> >> Thanks >> >> On 21/10/12 07:50, Jurgen Weber wrote: >>> Hi >>> >>> Lastly, is there a way at boot time, some sysctl's or something I can set to bring zfs to a minimalistic state? Turn off >>> features, etc to get this to mount? >>> >>> Any ideas appreciated. >>> >>> Thanks >>> >>> Jurgen >>> On 20/10/2012 9:02 AM, Jurgen Weber wrote: >>>> Guys >>>> >>>> Some more details on this, some insight would be greatly appreciated. >>>> >>>> As my day wore on trying to get this zpool to import or mount I have learnt a few things. I think over time this issue has >>>> came about as more and more data was added to the file systems. >>>> >>>> Some further details: >>>> >>>> Its a 8 disk raidz pool that the system boots from as well. The disk are all 2TB. >>>> The server has 16GB Of RAM, I notcied the day before this happen the server was struggling with its RAM griding to a halt and >>>> dumping its RAM. >>>> The issue is not hardware because I found another server (same one) swapped the harddrives out took another 8GB of RAM and I >>>> have the same problem. >>>> The main data file systems have dedup and gzip compression on. >>>> >>>> I have booted from open/Oracle Solars 11 adn attempted to import and the Solaris live CD will not import either. In the >>>> Solaris system the disk detach from the system. >>>> >>>> I get the feeling that ZFS is hitting some root limit when attempting to mount and its not finishing the job. >>>> >>>> Thanks >>>> >>>> Jurgen >>>> >>>> On 19/10/2012 10:29 AM, Jürgen Weber wrote: >>>>> Team >>>>> >>>>> I have googled around for a solution and I see a lot of posts about firmware versions and patches for FreeBSD 8.*. >>>>> >>>>> I have a FreeBSD 9.1rc1 system, which was beta1 orginally and has been running for months. >>>>> >>>>> Now it will not boot, I get the following: >>>>> >>>>> "Trying to mount root from zfs:tank/root []..... >>>>> mfi0: COMMAND 0Xffffff8000cb83530 TIMEOUT AFTER xxx SECONDS >>>>> (this just repeats). >>>>> >>>>> I have not seen this error before during normal runtime, _only_ during boot. >>>>> >>>>> Originally when I had the problem I could boot off a USB stick (9.1beta1 or rc1), run a 'zpool import -f tank' and it would >>>>> work on the livecd. Rebooting and the main system would work. >>>>> >>>>> This time this work around does not work for me. When I am on the USB stick I can run a 'zpool import' and all of the disk >>>>> are recognised, the pool is recognised and the file system is healthy. 
>>>>> >>>>> The Card is a H700 PERC, with 12.10.3 firmware in a Dell R515. >>>>> Running FreeBSD 9.1-RC1, latest zfs and zpool versions. >>>>> >>>>> I have tried disabling the cache (mfiutil cache xxx disable). I have also gone into the Card settings and changed under >>>>> advanced settings "adaptive forward read" to "read only". >>>>> >>>>> Any help, appreciated. >>>>> >>>>> Thanks >>>>> >>>> >>> >> > > -- > Jürgen Weber > > Systems Engineer > IT Infrastructure Team Leader > > THE ICONIC | Ejurgen.weber@theiconic.com.au |www.theiconic.com.au -- Jürgen Weber Systems Engineer IT Infrastructure Team Leader THE ICONIC | E jurgen.weber@theiconic.com.au | www.theiconic.com.au _______________________________________________ freebsd-fs@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-fs To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 15:40:26 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1A717753 for ; Tue, 23 Oct 2012 15:40:26 +0000 (UTC) (envelope-from dustinwenz@ebureau.com) Received: from internet02.ebureau.com (internet02.tru-signal.biz [65.127.24.21]) by mx1.freebsd.org (Postfix) with ESMTP id C7B018FC0C for ; Tue, 23 Oct 2012 15:40:25 +0000 (UTC) Received: from service02.office.ebureau.com (internet06.ebureau.com [65.127.24.25]) by internet02.ebureau.com (Postfix) with ESMTP id 2A596DF392F for ; Tue, 23 Oct 2012 10:40:19 -0500 (CDT) Received: from localhost (localhost [127.0.0.1]) by service02.office.ebureau.com (Postfix) with ESMTP id 26146B1F663 for ; Tue, 23 Oct 2012 10:40:19 -0500 (CDT) X-Virus-Scanned: amavisd-new at ebureau.com Received: from service02.office.ebureau.com ([127.0.0.1]) by localhost (internet06.ebureau.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id fFrNC2988rE4 for ; Tue, 23 Oct 2012 10:40:16 -0500 (CDT) Received: from square.office.ebureau.com (square.office.ebureau.com [10.10.20.22]) by service02.office.ebureau.com (Postfix) with ESMTPSA id CF3DEB1F641 for ; Tue, 23 Oct 2012 10:40:16 -0500 (CDT) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.1 \(1498\)) Subject: Re: Imposing ZFS latency limits From: Dustin Wenz In-Reply-To: Date: Tue, 23 Oct 2012 10:40:16 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: <2CB1D556-1EAF-43F9-8A24-36548C663ED8@ebureau.com> References: <6116A56E-4565-4485-887E-46E3ED231606@ebureau.com> <089898A4493042448C934643FD5C3887@multiplay.co.uk> To: "" X-Mailer: Apple Mail (2.1498) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 15:40:26 -0000 On Oct 22, 2012, at 8:21 AM, Mark Felder wrote: > On Tue, 16 Oct 2012 10:46:00 -0500, Steven Hartland = wrote: >=20 >>=20 >> Interesting, what metrics where you using which made it easy to = detect, >> work be nice to know 
your process there Mark? >=20 > One reason is that our virtual machine performance gets awful and we = get alerted for higher than usual load and/or disk io latency by the = hypervisor. Another thing we've implemented is watching for some SCSI = errors on the server too. They seem to let us know before it really gets = bad. >=20 > It's nice knowing ZFS is doing everything within its power to read the = data off the disk, but when there's a fully intact raidz it should be = smart enough to kick a disk out that's being problematic. What hypervisor are you using? Is it with a passive JBOD? There are other situations where a disk is not failing that you may not = get constant read performance, such as when a disk is undergoing thermal = recalibration, being scanned for diagnostics, etc. Any sort of realtime = database or streaming application could benefit from better latency = control. It's possible that we have no control over this, and are subject to = whatever features Oracle decides to include or omit from ZFS. - .Dustin From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 17:27:23 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AA4F36F5 for ; Tue, 23 Oct 2012 17:27:23 +0000 (UTC) (envelope-from feld@feld.me) Received: from feld.me (unknown [IPv6:2607:f4e0:100:300::2]) by mx1.freebsd.org (Postfix) with ESMTP id 66D038FC0A for ; Tue, 23 Oct 2012 17:27:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=feld.me; s=blargle; h=In-Reply-To:Message-Id:From:Mime-Version:Date:References:Subject:To:Content-Type; bh=Yj02fOef6hjid/yxmcVdM9IPULzBkQmM0dSDl4yS538=; b=qu7Ixjleph7YXnpZxtZlzP3rTMI2GoKqvpJBSHwjR20uwyL8ksNyRZz8q5YUH4J3oV5NzRQK1kRQBBhohPhf85PoDFfJOMBxBvVt8+P4UQ9xncbSVsZ9223cY+eA4FxL; Received: from localhost ([127.0.0.1] helo=mwi1.coffeenet.org) by feld.me with esmtp (Exim 4.80 (FreeBSD)) (envelope-from ) id 1TQiG8-0001c3-CV; Tue, 23 Oct 2012 12:27:21 -0500 Received: from feld@feld.me by mwi1.coffeenet.org (Archiveopteryx 3.1.4) with esmtpa id 1351013233-65253-65252/5/3; Tue, 23 Oct 2012 17:27:13 +0000 Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes To: "" , Dustin Wenz Subject: Re: Imposing ZFS latency limits References: <6116A56E-4565-4485-887E-46E3ED231606@ebureau.com> <089898A4493042448C934643FD5C3887@multiplay.co.uk> <2CB1D556-1EAF-43F9-8A24-36548C663ED8@ebureau.com> Date: Tue, 23 Oct 2012 12:27:10 -0500 Mime-Version: 1.0 From: Mark Felder Message-Id: In-Reply-To: <2CB1D556-1EAF-43F9-8A24-36548C663ED8@ebureau.com> User-Agent: Opera Mail/12.02 (FreeBSD) X-SA-Report: ALL_TRUSTED=-1, KHOP_THREADED=-0.5 X-SA-Score: -1.5 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 17:27:23 -0000 On Tue, 23 Oct 2012 10:40:16 -0500, Dustin Wenz wrote: > What hypervisor are you using? Is it with a passive JBOD? VMWare ESXi and Xen accessing ZVOLs on FreeBSD shared via iSCSI (istgt). Yes, it's with a passive JBOD. The paths from the hypervisors to the HP DL380 head units to the JBODs are redundant all the way down to the disks. The SSDs for cache/log exist in the head units to not waste precious space in the JBOD chassis. ZFS basically has full control over everything those disks do. 
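For readers unfamiliar with that layout: the ZVOLs Mark mentions are created as ZFS volumes, show up as device nodes under /dev/zvol/, and are then exported by istgt as iSCSI LUNs. A rough sketch with made-up pool and target names (not Mark's actual configuration, and the istgt.conf fragment may need adjusting for the istgt version in use):

  # create a 200 GB ZVOL; the device appears under /dev/zvol/
  zfs create -V 200G tank/esx01
  ls -l /dev/zvol/tank/esx01

  # fragment of /usr/local/etc/istgt/istgt.conf
  [LogicalUnit1]
    TargetName  esx01
    Mapping     PortalGroup1 InitiatorGroup1
    UnitType    Disk
    LUN0        Storage /dev/zvol/tank/esx01 Auto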
From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 18:46:51 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CDF047D8 for ; Tue, 23 Oct 2012 18:46:51 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.org (mail.yamagi.org [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 617B98FC08 for ; Tue, 23 Oct 2012 18:46:51 +0000 (UTC) Received: from maka.home.yamagi.org (hmbg-4d06976c.pool.mediaWays.net [77.6.151.108]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.org (Postfix) with ESMTPSA id BA59A1666312; Tue, 23 Oct 2012 20:46:49 +0200 (CEST) Date: Tue, 23 Oct 2012 20:46:23 +0200 From: Yamagi Burmeister To: rmacklem@uoguelph.ca Subject: Can not read from ZFS exported over NFSv4 but write to it Message-Id: <20121023204623.a1eef4f99b5f786050229b6c@yamagi.org> X-Mailer: Sylpheed 3.2.0 (GTK+ 2.24.6; amd64-portbld-freebsd8.3) Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Tue__23_Oct_2012_20_46_23_+0200_1Wtz=sA.hgI3e4WJ" Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 18:46:51 -0000 --Signature=_Tue__23_Oct_2012_20_46_23_+0200_1Wtz=sA.hgI3e4WJ Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hello, I have 2 boxes. Both of them are running FreeBSD 9.1-RC2. On box A a ZFS filesystem and an UFS filesystem are exported over NFSv4. Both filesystems are mounted on box B. On box B the UFS filesystem is working as expected. But I'm only able write to the ZFS filesystem, when trying to read from it only a small amount of data is transmitted before the NFSv4 mount stalls. A subsequent "umount -f" takes several minutes to complete. This behavior is 100% reproduceable. The /etc/exports on box A: # ZFS /usr/home/yamagi # UFS /mnt V4: / -sec=3Dsys 192.168.0.13 Mounted on box B: % mount a:/usr/home/yamagi on /mnt (nfs, nfsv4acls) After the mount stalled (just try to copy some data) the kernel on box B shows:=20 nfsv4 client/server protocol prob err=3D10020 newnfs server a:/usr/home/yamagi: not responding newnfs server a:/usr/home/yamagi: is alive again newnfs server a:/usr/home/yamagi: not responding ... But the network connection is stable at all times. Not a single "ping" failes, the ssh connection between the two hosts works just fine. The kernel on box A doesn't show anything. The nfsd processes are looking fine. A "procstat -kk" doesn't show anything: =20 1844 100392 nfsd nfsd: master mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_timedwait_sig+0x19 _cv_timedwait_sig +0x13c svc_run_internal+0x7a1 svc_run+0x8f nfsrvd_nfsd+0x1c7 nfssvc_nfsd +0x9b sys_nfssvc+0x90 amd64_syscall+0x546 Xfast_syscall +0xf7 1838 101289 nfsd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _cv_wait_sig+0x12e seltdwait+0x110 kern_select+0x6ef sys_select+0x5d amd64_syscall+0x546 Xfast_syscall+0xf7 Any help is welcome. 
More information can be provided if needed.=20 Ciao, Yamagi --=20 Homepage: www.yamagi.org XMPP: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB --Signature=_Tue__23_Oct_2012_20_46_23_+0200_1Wtz=sA.hgI3e4WJ Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlCG5g4ACgkQWTjlg++8y8upCgCdGST6+cge+deoyd35la/1zFbK bFsAoKKezZ6JXRv3lu5dYVhIx73Swp3e =A7lX -----END PGP SIGNATURE----- --Signature=_Tue__23_Oct_2012_20_46_23_+0200_1Wtz=sA.hgI3e4WJ-- From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 20:33:04 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CC2CE7DA for ; Tue, 23 Oct 2012 20:33:04 +0000 (UTC) (envelope-from tom@claimlynx.com) Received: from na3sys009aog121.obsmtp.com (na3sys009aog121.obsmtp.com [74.125.149.145]) by mx1.freebsd.org (Postfix) with SMTP id 286FD8FC12 for ; Tue, 23 Oct 2012 20:33:03 +0000 (UTC) Received: from mail-qc0-f200.google.com ([209.85.216.200]) (using TLSv1) by na3sys009aob121.postini.com ([74.125.148.12]) with SMTP ID DSNKUIb++FmRVsHunCHUxOsmIyK1XgiStd66@postini.com; Tue, 23 Oct 2012 13:33:04 PDT Received: by mail-qc0-f200.google.com with SMTP id l42so7920971qco.7 for ; Tue, 23 Oct 2012 13:32:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:x-gm-message-state; bh=Fzz8q6RTRYeJfpAT53t/OqfSf7R2NQYRcllgZhkHGBM=; b=YrCqO/6nM4X3zeeGQF7TutsTSbVbUSuiOlBnjMUtJg/9BjWfCKzga4sBXWVrWf6kwz Gf+36PgLDLHNq66826yREedlR6TCSWqAMjeBMq39Fm2LhmpYC73W64jxcZNzBsMaOF9x 50q6lTqzCLr9QwSiltyNjqVUrlOzOqzom65CqzmVa270g00P9s/CCXK0kxBvKWJr9XP3 4HVWEBLlSGFj/Iqh243KCevaxCdizypgPQ4Q0bGsRkWoFEMqHcHpTOSz4Ap7St5UQPXw kEM7CTjSRDJW9pOhd5eVPqUDN2c17y3Yuis6ARWjCp/W7QUEFbWl+VPU7G5f6SCdc4PT M2yg== Received: by 10.52.98.105 with SMTP id eh9mr18567725vdb.11.1351024376122; Tue, 23 Oct 2012 13:32:56 -0700 (PDT) MIME-Version: 1.0 Received: by 10.52.98.105 with SMTP id eh9mr18567708vdb.11.1351024375758; Tue, 23 Oct 2012 13:32:55 -0700 (PDT) Received: by 10.58.28.138 with HTTP; Tue, 23 Oct 2012 13:32:55 -0700 (PDT) In-Reply-To: <238159534.2504674.1350598282110.JavaMail.root@erie.cs.uoguelph.ca> References: <238159534.2504674.1350598282110.JavaMail.root@erie.cs.uoguelph.ca> Date: Tue, 23 Oct 2012 15:32:55 -0500 Message-ID: Subject: Re: Poor throughput using new NFS client (9.0) vs. old (8.2/9.0) From: Thomas Johnson To: Rick Macklem X-Gm-Message-State: ALoCoQnsNhAk7eVdmMsJDf1cR/o+L8zd3/yi78NFA2gzGiSVeavITU0HHHl0X5hSf5nAoFtDeIsCLca14oC2phESCK7jxRebO9E4W3z4pUUCoZdSw2eVsb2aRiW8bzLvKomWxvbRmwPDUcSUABCosCFj3o6Ylzehgw== Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, Ronald Klop X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 20:33:05 -0000 I built a test image based on 9.1-rc2, per your suggestion Rick. The results are below. I was not able to exactly reproduce the workload in my original message, so I have also included results for the new (very similar) workload on my 9.0 client image as well. To summarize, 9.1-rc2 using newnfs seems to perform better than 9.0-p4, but oldnfs appears to still be significantly faster in both cases. 
I will get packet traces to Rick, but I want to get new results to the list. -Tom root@test:/test-> uname -a FreeBSD test.claimlynx.com 9.1-RC2 FreeBSD 9.1-RC2 #1: Fri Oct 19 08:27:12 CDT 2012 root@builder.claimlynx.com:/usr/obj/usr/src/sys/GENERIC amd64 root@test:/-> mount | grep test server:/array/test on /test (nfs) root@test:/test-> zip BIGGER_PILE.zip BIG_PILE_53* adding: BIG_PILE_5306.zip (stored 0%) adding: BIG_PILE_5378.zip (stored 0%) adding: BIG_PILE_5386.zip (stored 0%) root@test:/test-> ll -h BIGGER_PILE.zip -rw-rw-r-- 1 root claimlynx 5.5M Oct 23 14:05 BIGGER_PILE.zip root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null 0.664u 1.693s 0:30.21 7.7% 296+3084k 0+2926io 0pf+0w 0.726u 0.989s 0:08.04 21.1% 230+2667k 0+2956io 0pf+0w 0.829u 1.268s 0:11.89 17.4% 304+3037k 0+2961io 0pf+0w 0.807u 0.902s 0:08.02 21.1% 233+2676k 0+2947io 0pf+0w 0.753u 1.354s 0:12.73 16.4% 279+2879k 0+2947io 0pf+0w root@test:/test-> ll -h BIGGER_PILE.zip -rw-rw-r-- 1 root claimlynx 89M Oct 23 14:03 BIGGER_PILE.zip root@test:/test-> mount | grep test server:/array/test on /test (oldnfs) root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null 0.645u 1.435s 0:08.05 25.7% 295+3044k 0+5299io 0pf+0w 0.783u 0.993s 0:06.48 27.3% 225+2499k 0+5320io 0pf+0w 0.787u 1.000s 0:06.28 28.3% 246+2884k 0+5317io 0pf+0w 0.707u 1.392s 0:07.94 26.3% 266+2743k 0+5313io 0pf+0w 0.709u 1.056s 0:06.08 28.7% 246+2814k 0+5318io 0pf+0w root@test:/home/tom-> uname -a FreeBSD test.claimlynx.com 9.0-RELEASE-p4 FreeBSD 9.0-RELEASE-p4 #0: Tue Sep 18 11:51:11 CDT 2012 root@builder.claimlynx.com:/usr/obj/usr/src/sys/GENERIC amd64 root@test:/test-> mount | grep test server:/array/test on /test (nfs) root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null 0.721u 1.819s 0:31.13 8.1% 284+2886k 0+2932io 0pf+0w 0.725u 1.386s 0:12.84 16.3% 247+2631k 0+2957io 0pf+0w 0.675u 1.392s 0:13.94 14.7% 300+3005k 0+2928io 0pf+0w 0.705u 1.206s 0:10.72 17.7% 278+2874k 0+2973io 0pf+0w 0.727u 1.200s 0:18.28 10.5% 274+2872k 0+2947io 0pf+0w root@test:/-> umount /test root@test:/-> mount -t oldnfs server:/array/test /test root@test:/-> mount | grep test server:/array/test on /test (oldnfs) root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null 0.694u 1.820s 0:10.82 23.1% 271+2964k 0+5320io 0pf+0w 0.726u 1.293s 0:06.37 31.5% 303+2998k 0+5322io 0pf+0w 0.717u 1.248s 0:06.08 32.0% 246+2607k 0+5354io 0pf+0w 0.733u 1.230s 0:06.17 31.7% 256+2536k 0+5311io 0pf+0w 0.549u 1.581s 0:08.02 26.4% 302+3116k 0+5321io 0pf+0w On Thu, Oct 18, 2012 at 5:11 PM, Rick Macklem wrote: > Ronald Klop wrote: > > On Thu, 18 Oct 2012 18:16:16 +0200, Thomas Johnson > > wrote: > > > > > We recently upgraded a number of hosts from FreeBSD 8.2 to 9.0. > > > Almost > > > immediately, we received reports from users of poor performance. The > > > upgraded hosts are PXE-booted, with an NFS-mounted root. > > > Additionally, > > > they > > > mount a number of other NFS shares, which is where our users work > > > from. > > > After a week of tweaking rsize/wsize/readahead parameters (per > > > guidance), > > > it finally occurred to me that 9.0 defaults to the new NFS client > > > and > > > server. I remounted the user shares using the oldnfs file type, and > > > users > > > reported that performance returned to its expected level. > > > > > > This is obviously a workaround, rather than a solution. We would > > > prefer > > > to > > > get our hosts using the newnfs client, since presumably oldnfs will > > > be > > > deprecated at some point in the future. 
Is there some change that we > > > should > > > have made to our NFS configuration with the upgrade to 9.0, or is it > > > possible that our workload is exposing some deficiency with newnfs? > > > We > > > tend > > > to deal with a huge number of tiny files (several KB in size). The > > > NFS > > > server has been running 9.0 for some time (prior to the client > > > upgrade) > > > without any issue. NFS is served from a zpool, backed by a Dell > > > MD3000, > > > populated with 15k SAS disks. Clients and server are connected with > > > Gig-E > > > links. The general hardware configuration has not changed in nearly > > > 3 > > > years. > > > > > > As an example of the performance difference, here is some of the > > > testing > > > I > > > did while troubleshooting. Given a directory containing 5671 zip > > > files, > > > with an average size of 15KB. I append all files to an existing zip > > > file. > > > Using the newnfs mount, I found that this operation generally takes > > > ~30 > > > seconds (wall time). Switching the mount to oldnfs resulted in the > > > same > > > operation taking ~10 seconds. > > > > > > tom@test-1:/test-> ls 53*zip | wc -l > > > 5671 > > > tom@test-1:/test-> ll -h BIG* > > > -rw-rw-r-- 1 tom claimlynx 8.9M Oct 17 14:06 BIGGER_PILE_1.zip > > > tom@test-1:/test-> time zip BIGGER_PILE_1.zip 53*.zip > > > 0.646u 0.826s 0:51.01 2.8% 199+2227k 0+2769io 0pf+0w > > > ...reset and repeat... > > > 0.501u 0.629s 0:30.49 3.6% 208+2319k 0+2772io 0pf+0w > > > ...reset and repeat... > > > 0.601u 0.522s 0:32.37 3.4% 220+2406k 0+2771io 0pf+0w > > > > > > tom@test-1:/-> cd / > > > tom@test-1:/-> sudo umount /test > > > tom@test-1:/-> sudo mount -t oldnfs -o rw server:/array/test /test > > > tom@test-1:/-> mount | grep test > > > server:/array/test on /test (oldnfs) > > > tom@test-1:/-> cd /test > > > ...reset and repeat... > > > 0.470u 0.903s 0:13.09 10.4% 203+2229k 0+5107io 0pf+0w > > > ...reset and repeat... > > > 0.547u 0.640s 0:08.65 13.6% 231+2493k 0+5086io 0pf+0w > > > tom@test-1:/test-> ll -h BIG* > > > -rw-rw-r-- 1 tom claimlynx 92M Oct 17 14:14 BIGGER_PILE_1.zip > > > > > > Thanks! > > > > > > > > > You might find this thread from today interesting. > > http://lists.freebsd.org/pipermail/freebsd-fs/2012-October/015441.html > > > Yes, although I can't explain why Alexey's problem went away > when he went from 9.0->9.1 for his NFS server, it would be > interesting if Thomas could try the same thing? > > About the only thing different between the old and new NFS > clients is the default rsize/wsize. However, if Thomas tried > rsize=32768,wsize=32768 for the default (new) NFS client, then > that would be ruled out. To be honest, the new client uses code > cloned from the old one for all the caching etc (which is where > the clients are "smart"). They use different RPC parsing code, > since the new one does NFSv4 as well, but that code is pretty > straightforward, so I can't think why it would result in a > factor of 3 in performance. > > If Thomas were to capture a packet trace of the above test > for two clients and emailed them to me, I could take a look > and see if I can see what is going on. (For Alexey's case, > it was a whole bunch of Read RPCs without replies, but that > was a Linux client, of course. It also had a significant # of > TCP layer retransmits and out of order TCP segments in it.) > > It would be nice to figure this out, since I was thinking > that the old client might go away for 10.0 (can't if these > issues still exist). > > rick > > > Ronald. 
> > _______________________________________________ > > -- Thomas Johnson ClaimLynx, Inc. From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 21:55:17 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BC529637 for ; Tue, 23 Oct 2012 21:55:17 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 6EB128FC0C for ; Tue, 23 Oct 2012 21:55:17 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EACgRh1CDaFvO/2dsb2JhbABEhhS8c4IeAQEEASMmMAUWDgoCAg0ZAlkGE4d+Bqg+gjuQLIEgkAyBEgOVcpA5gwuBRzU X-IronPort-AV: E=Sophos;i="4.80,637,1344225600"; d="scan'208";a="187775471" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 23 Oct 2012 17:55:15 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id D0E82B4044; Tue, 23 Oct 2012 17:55:15 -0400 (EDT) Date: Tue, 23 Oct 2012 17:55:15 -0400 (EDT) From: Rick Macklem To: Yamagi Burmeister Message-ID: <1579346453.2736080.1351029315835.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20121023204623.a1eef4f99b5f786050229b6c@yamagi.org> Subject: Re: Can not read from ZFS exported over NFSv4 but write to it MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 21:55:18 -0000 Yamagi Burmeister wrote: > Hello, > I have 2 boxes. Both of them are running FreeBSD 9.1-RC2. On box A a > ZFS filesystem and an UFS filesystem are exported over NFSv4. Both > filesystems are mounted on box B. On box B the UFS filesystem is > working as expected. But I'm only able write to the ZFS filesystem, > when trying to read from it only a small amount of data is transmitted > before the NFSv4 mount stalls. A subsequent "umount -f" takes several > minutes to complete. This behavior is 100% reproduceable. > > The /etc/exports on box A: > > # ZFS > /usr/home/yamagi > # UFS > /mnt > V4: / -sec=sys 192.168.0.13 > For ZFS, all volumes down to yamagi must be exported. You don't show what your ZFS setup is, but you either need to export "home" and "usr" if those are ZFS volumes. (The above /etc/exports would be ok, only if /, /usr and /home are all UFS volumes and /usr/home/yamagi is the root of a ZFS volume.) For UFS, non-exported volumes can be traversed by "mount", but for ZFS that is not the case. The only way I know of to fix this inconsistency is to disable the traversal capability for UFS, but that would be a POLA violation, so the inconsistency (caused by ZFS checking exports itself instead of leaving to the VFS layer) remains. OR you can specify the root of V4 in the exported volume. For example, you could: # ZFS /usr/home/yamagi V4: /usr/home/yamagi -sec=sys 192.168.0.13 And then the client mount would be: a:/ on /mnt since "/" would be at /usr/home/yamagi. (If you do this, the /mnt UFS volume wouldn't be mountable via NFSv4.) 
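To make that concrete: assuming /, /usr, /usr/home and /usr/home/yamagi are each separate ZFS datasets, every one of them needs its own line in /etc/exports for the path down to the V4 root to be traversable (a sketch only; the actual dataset layout on box A isn't shown in this thread):

  # /etc/exports
  /                 -sec=sys 192.168.0.13
  /usr              -sec=sys 192.168.0.13
  /usr/home         -sec=sys 192.168.0.13
  /usr/home/yamagi  -sec=sys 192.168.0.13
  V4: / -sec=sys 192.168.0.13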
> Mounted on box B: > > % mount > a:/usr/home/yamagi on /mnt (nfs, nfsv4acls) > > After the mount stalled (just try to copy some data) the kernel on box > B > shows: > nfsv4 client/server protocol prob err=10020 error 10020 is NFS4ERR_NOFILEHANDLE and that is usually because some entry in the path isn't exported. As such, it looks like something in the path to /usr/home/yamagi isn't exported. rick ps: If you still have the problem after you are convinced that your /etc/exports is ok, you can capture packets for the ZFS mount attempt and email the packet trace to me. I never use ZFS, so I don't see ZFS specific problems. > newnfs server a:/usr/home/yamagi: not responding > newnfs server a:/usr/home/yamagi: is alive again > newnfs server a:/usr/home/yamagi: not responding > ... > > But the network connection is stable at all times. Not a single "ping" > failes, the ssh connection between the two hosts works just fine. > > The kernel on box A doesn't show anything. The nfsd processes are > looking fine. A "procstat -kk" doesn't show anything: > > 1844 100392 nfsd nfsd: master mi_switch+0x186 > sleepq_catch_signals+0x2cc sleepq_timedwait_sig+0x19 _cv_timedwait_sig > +0x13c svc_run_internal+0x7a1 svc_run+0x8f nfsrvd_nfsd+0x1c7 > nfssvc_nfsd +0x9b sys_nfssvc+0x90 amd64_syscall+0x546 Xfast_syscall > +0xf7 > > 1838 101289 nfsd - mi_switch+0x186 > sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _cv_wait_sig+0x12e > seltdwait+0x110 kern_select+0x6ef sys_select+0x5d amd64_syscall+0x546 > Xfast_syscall+0xf7 > > Any help is welcome. More information can be provided if needed. > > Ciao, > Yamagi > > -- > Homepage: www.yamagi.org > XMPP: yamagi@yamagi.org > GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-fs@FreeBSD.ORG Tue Oct 23 23:37:21 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3C099907 for ; Tue, 23 Oct 2012 23:37:21 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id DA20B8FC08 for ; Tue, 23 Oct 2012 23:37:20 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAN0ph1CDaFvO/2dsb2JhbABEhhS8c4IeAQEBAwEBAiAEUhsOCgICDRkCKi8GExuHYwYLqC2CO5AogSCKQCeFJYESA5JAgQWCLYEXjyKDC4FAPA X-IronPort-AV: E=Sophos;i="4.80,638,1344225600"; d="scan'208";a="187786721" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 23 Oct 2012 19:37:19 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 482E5B3F62; Tue, 23 Oct 2012 19:37:19 -0400 (EDT) Date: Tue, 23 Oct 2012 19:37:19 -0400 (EDT) From: Rick Macklem To: Thomas Johnson Message-ID: <86699361.2739800.1351035439228.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: Subject: Re: Poor throughput using new NFS client (9.0) vs. 
old (8.2/9.0) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org, Ronald Klop X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2012 23:37:21 -0000 Thomas Johnson wrote: > I built a test image based on 9.1-rc2, per your suggestion Rick. The > results are below. I was not able to exactly reproduce the workload in > my original message, so I have also included results for the new (very > similar) workload on my 9.0 client image as well. > > To summarize, 9.1-rc2 using newnfs seems to perform better than > 9.0-p4, but oldnfs appears to still be significantly faster in both > cases. > > I will get packet traces to Rick, but I want to get new results to the > list. > > -Tom > > root@test:/test-> uname -a > FreeBSD test.claimlynx.com 9.1-RC2 FreeBSD 9.1-RC2 #1: Fri Oct 19 > 08:27:12 CDT 2012 > root@builder.claimlynx.com:/usr/obj/usr/src/sys/GENERIC amd64 > > > root@test:/-> mount | grep test > server:/array/test on /test (nfs) > root@test:/test-> zip BIGGER_PILE.zip BIG_PILE_53* > adding: BIG_PILE_5306.zip (stored 0%) > adding: BIG_PILE_5378.zip (stored 0%) > adding: BIG_PILE_5386.zip (stored 0%) > root@test:/test-> ll -h BIGGER_PILE.zip > -rw-rw-r-- 1 root claimlynx 5.5M Oct 23 14:05 BIGGER_PILE.zip > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null > 0.664u 1.693s 0:30.21 7.7% 296+3084k 0+2926io 0pf+0w > 0.726u 0.989s 0:08.04 21.1% 230+2667k 0+2956io 0pf+0w > 0.829u 1.268s 0:11.89 17.4% 304+3037k 0+2961io 0pf+0w > 0.807u 0.902s 0:08.02 21.1% 233+2676k 0+2947io 0pf+0w > 0.753u 1.354s 0:12.73 16.4% 279+2879k 0+2947io 0pf+0w > root@test:/test-> ll -h BIGGER_PILE.zip > -rw-rw-r-- 1 root claimlynx 89M Oct 23 14:03 BIGGER_PILE.zip > Although the runs take much longer (I have no idea why and hopefully I can spot something in the packet traces), it shows about half the I/O ops. This suggests that it is running at the 64K rsize, wsize instead of the 32K used by the old client. Just to confirm. Did you run a test using the new nfs client with rsize=32768,wsize=32768 mount options, so the I/O size is the same as with the old client? 
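For reference, the test being asked for amounts to re-mounting with the old client's transfer size forced on the new client, using the same server path as in the runs above:

  umount /test
  # new client, pinned to the old client's 32K transfer size
  mount -t nfs -o rsize=32768,wsize=32768 server:/array/test /test
  # old client, for comparison
  umount /test
  mount -t oldnfs server:/array/test /test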
rick > > root@test:/test-> mount | grep test > server:/array/test on /test (oldnfs) > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null > 0.645u 1.435s 0:08.05 25.7% 295+3044k 0+5299io 0pf+0w > 0.783u 0.993s 0:06.48 27.3% 225+2499k 0+5320io 0pf+0w > 0.787u 1.000s 0:06.28 28.3% 246+2884k 0+5317io 0pf+0w > 0.707u 1.392s 0:07.94 26.3% 266+2743k 0+5313io 0pf+0w > 0.709u 1.056s 0:06.08 28.7% 246+2814k 0+5318io 0pf+0w > > > > root@test:/home/tom-> uname -a > FreeBSD test.claimlynx.com 9.0-RELEASE-p4 FreeBSD 9.0-RELEASE-p4 #0: > Tue Sep 18 11:51:11 CDT 2012 > root@builder.claimlynx.com:/usr/obj/usr/src/sys/GENERIC amd64 > > > root@test:/test-> mount | grep test > server:/array/test on /test (nfs) > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null > 0.721u 1.819s 0:31.13 8.1% 284+2886k 0+2932io 0pf+0w > 0.725u 1.386s 0:12.84 16.3% 247+2631k 0+2957io 0pf+0w > 0.675u 1.392s 0:13.94 14.7% 300+3005k 0+2928io 0pf+0w > 0.705u 1.206s 0:10.72 17.7% 278+2874k 0+2973io 0pf+0w > 0.727u 1.200s 0:18.28 10.5% 274+2872k 0+2947io 0pf+0w > > > root@test:/-> umount /test > root@test:/-> mount -t oldnfs server:/array/test /test > root@test:/-> mount | grep test > server:/array/test on /test (oldnfs) > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null > 0.694u 1.820s 0:10.82 23.1% 271+2964k 0+5320io 0pf+0w > 0.726u 1.293s 0:06.37 31.5% 303+2998k 0+5322io 0pf+0w > 0.717u 1.248s 0:06.08 32.0% 246+2607k 0+5354io 0pf+0w > 0.733u 1.230s 0:06.17 31.7% 256+2536k 0+5311io 0pf+0w > 0.549u 1.581s 0:08.02 26.4% 302+3116k 0+5321io 0pf+0w > > > On Thu, Oct 18, 2012 at 5:11 PM, Rick Macklem < rmacklem@uoguelph.ca > > wrote: > > > > > Ronald Klop wrote: > > On Thu, 18 Oct 2012 18:16:16 +0200, Thomas Johnson < > > tom@claimlynx.com > > > wrote: > > > > > We recently upgraded a number of hosts from FreeBSD 8.2 to 9.0. > > > Almost > > > immediately, we received reports from users of poor performance. > > > The > > > upgraded hosts are PXE-booted, with an NFS-mounted root. > > > Additionally, > > > they > > > mount a number of other NFS shares, which is where our users work > > > from. > > > After a week of tweaking rsize/wsize/readahead parameters (per > > > guidance), > > > it finally occurred to me that 9.0 defaults to the new NFS client > > > and > > > server. I remounted the user shares using the oldnfs file type, > > > and > > > users > > > reported that performance returned to its expected level. > > > > > > This is obviously a workaround, rather than a solution. We would > > > prefer > > > to > > > get our hosts using the newnfs client, since presumably oldnfs > > > will > > > be > > > deprecated at some point in the future. Is there some change that > > > we > > > should > > > have made to our NFS configuration with the upgrade to 9.0, or is > > > it > > > possible that our workload is exposing some deficiency with > > > newnfs? > > > We > > > tend > > > to deal with a huge number of tiny files (several KB in size). The > > > NFS > > > server has been running 9.0 for some time (prior to the client > > > upgrade) > > > without any issue. NFS is served from a zpool, backed by a Dell > > > MD3000, > > > populated with 15k SAS disks. Clients and server are connected > > > with > > > Gig-E > > > links. The general hardware configuration has not changed in > > > nearly > > > 3 > > > years. > > > > > > As an example of the performance difference, here is some of the > > > testing > > > I > > > did while troubleshooting. 
Given a directory containing 5671 zip > > > files, > > > with an average size of 15KB. I append all files to an existing > > > zip > > > file. > > > Using the newnfs mount, I found that this operation generally > > > takes > > > ~30 > > > seconds (wall time). Switching the mount to oldnfs resulted in the > > > same > > > operation taking ~10 seconds. > > > > > > tom@test-1:/test-> ls 53*zip | wc -l > > > 5671 > > > tom@test-1:/test-> ll -h BIG* > > > -rw-rw-r-- 1 tom claimlynx 8.9M Oct 17 14:06 BIGGER_PILE_1.zip > > > tom@test-1:/test-> time zip BIGGER_PILE_1.zip 53*.zip > > > 0.646u 0.826s 0:51.01 2.8% 199+2227k 0+2769io 0pf+0w > > > ...reset and repeat... > > > 0.501u 0.629s 0:30.49 3.6% 208+2319k 0+2772io 0pf+0w > > > ...reset and repeat... > > > 0.601u 0.522s 0:32.37 3.4% 220+2406k 0+2771io 0pf+0w > > > > > > tom@test-1:/-> cd / > > > tom@test-1:/-> sudo umount /test > > > tom@test-1:/-> sudo mount -t oldnfs -o rw server:/array/test /test > > > tom@test-1:/-> mount | grep test > > > server:/array/test on /test (oldnfs) > > > tom@test-1:/-> cd /test > > > ...reset and repeat... > > > 0.470u 0.903s 0:13.09 10.4% 203+2229k 0+5107io 0pf+0w > > > ...reset and repeat... > > > 0.547u 0.640s 0:08.65 13.6% 231+2493k 0+5086io 0pf+0w > > > tom@test-1:/test-> ll -h BIG* > > > -rw-rw-r-- 1 tom claimlynx 92M Oct 17 14:14 BIGGER_PILE_1.zip > > > > > > Thanks! > > > > > > > > > You might find this thread from today interesting. > > http://lists.freebsd.org/pipermail/freebsd-fs/2012-October/015441.html > > > Yes, although I can't explain why Alexey's problem went away > when he went from 9.0->9.1 for his NFS server, it would be > interesting if Thomas could try the same thing? > > About the only thing different between the old and new NFS > clients is the default rsize/wsize. However, if Thomas tried > rsize=32768,wsize=32768 for the default (new) NFS client, then > that would be ruled out. To be honest, the new client uses code > cloned from the old one for all the caching etc (which is where > the clients are "smart"). They use different RPC parsing code, > since the new one does NFSv4 as well, but that code is pretty > straightforward, so I can't think why it would result in a > factor of 3 in performance. > > If Thomas were to capture a packet trace of the above test > for two clients and emailed them to me, I could take a look > and see if I can see what is going on. (For Alexey's case, > it was a whole bunch of Read RPCs without replies, but that > was a Linux client, of course. It also had a significant # of > TCP layer retransmits and out of order TCP segments in it.) > > It would be nice to figure this out, since I was thinking > that the old client might go away for 10.0 (can't if these > issues still exist). > > rick > > > > > Ronald. > > _______________________________________________ > > > > > -- > Thomas Johnson > ClaimLynx, Inc. 
From owner-freebsd-fs@FreeBSD.ORG Wed Oct 24 08:06:46 2012 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1E6B1D6 for ; Wed, 24 Oct 2012 08:06:46 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au [211.29.132.190]) by mx1.freebsd.org (Postfix) with ESMTP id BF5F18FC14 for ; Wed, 24 Oct 2012 08:06:44 +0000 (UTC) Received: from c122-106-175-26.carlnfd1.nsw.optusnet.com.au (c122-106-175-26.carlnfd1.nsw.optusnet.com.au [122.106.175.26]) by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q9O86TOQ025384 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 24 Oct 2012 19:06:31 +1100 Date: Wed, 24 Oct 2012 19:06:29 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Rick Macklem Subject: Re: Poor throughput using new NFS client (9.0) vs. old (8.2/9.0) In-Reply-To: <86699361.2739800.1351035439228.JavaMail.root@erie.cs.uoguelph.ca> Message-ID: <20121024180148.L978@besplex.bde.org> References: <86699361.2739800.1351035439228.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-Cloudmark-Score: 0 X-Optus-Cloudmark-Analysis: v=2.0 cv=H/GDWJki c=1 sm=1 a=IWGeZYvDLtgA:10 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=UB2gkalWg_0A:10 a=wiGRUvtbjSz_psDBSAYA:9 a=CjuIK1q_8ugA:10 a=bxQHXO5Py4tHmhUgaywp5w==:117 Cc: freebsd-fs@FreeBSD.org, Ronald Klop X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Oct 2012 08:06:46 -0000 On Tue, 23 Oct 2012, Rick Macklem wrote: > Thomas Johnson wrote: >> I built a test image based on 9.1-rc2, per your suggestion Rick. The >> results are below. I was not able to exactly reproduce the workload in >> my original message, so I have also included results for the new (very >> similar) workload on my 9.0 client image as well. >> ... >> root@test:/-> mount | grep test >> server:/array/test on /test (nfs) >> root@test:/test-> zip BIGGER_PILE.zip BIG_PILE_53* >> adding: BIG_PILE_5306.zip (stored 0%) >> adding: BIG_PILE_5378.zip (stored 0%) >> adding: BIG_PILE_5386.zip (stored 0%) >> root@test:/test-> ll -h BIGGER_PILE.zip >> -rw-rw-r-- 1 root claimlynx 5.5M Oct 23 14:05 BIGGER_PILE.zip >> root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null >> 0.664u 1.693s 0:30.21 7.7% 296+3084k 0+2926io 0pf+0w >> 0.726u 0.989s 0:08.04 21.1% 230+2667k 0+2956io 0pf+0w >> 0.829u 1.268s 0:11.89 17.4% 304+3037k 0+2961io 0pf+0w >> 0.807u 0.902s 0:08.02 21.1% 233+2676k 0+2947io 0pf+0w >> 0.753u 1.354s 0:12.73 16.4% 279+2879k 0+2947io 0pf+0w >> root@test:/test-> ll -h BIGGER_PILE.zip >> -rw-rw-r-- 1 root claimlynx 89M Oct 23 14:03 BIGGER_PILE.zip >> >> [context moved]: >> root@test:/test-> mount | grep test >> server:/array/test on /test (oldnfs) >> root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null >> 0.645u 1.435s 0:08.05 25.7% 295+3044k 0+5299io 0pf+0w >> 0.783u 0.993s 0:06.48 27.3% 225+2499k 0+5320io 0pf+0w >> 0.787u 1.000s 0:06.28 28.3% 246+2884k 0+5317io 0pf+0w >> 0.707u 1.392s 0:07.94 26.3% 266+2743k 0+5313io 0pf+0w >> 0.709u 1.056s 0:06.08 28.7% 246+2814k 0+5318io 0pf+0w >> > Although the runs take much longer (I have no idea why and hopefully > I can spot something in the packet traces), it shows about half the > I/O ops. 
The variance is also much larger. oldnfs takes about 27% of the CPU[s] in all cases, while newnfs takes between 7.7% and 21.1%, with the difference being mainly due to the extra time taken by newnfs. It looks like newnfs is stalling and doing nothing much in the extra time. > This suggests that it is running at the 64K rsize, wsize > instead of the 32K used by the old client. Even 32K is too large for me, but newfs for ffs now defaults to the same broken 32K. The comment in sys/param.h still says that the normal size is 8K, and the buffer cache is still tuned for this size Sizes larger than 8K are supported up to 16K, at a cost of wasting up to half of the buffer cache for sizes of 8K (or 31/32 of the buffer cache for the minimum size of 512 bytes). Ones larger than 16K can cause severe buffer cache kva fragmentation. However, I haven't seen the expected large performance losses from the fragmentation for more than 10 years, except possibly with block sizes of 64K and with mixtures of file systems with very different block sizes. Even with oldnfs, I saw mysterious dependencies on the block size, and almost understood them at one point. Smaller block sizes tend to reduce stalls, but when they are too small there are larger sources of lack of performance. When stalls occured, I was able to see them easily for large files (~1GB) by watching network throughput using netstat -I 1. On my low-end hardware, nfs could saturate the link to not quite achieve the disk i/o speed of 45-55 MB/S when sending a single large file. It got within 5% of that when it didn't stall. When a stall occured, the network traffic dropped to almost none for a second or more, and the worst results were when it stalled for several seconds instead of only 1. Some stalls were caused by the server's caches filling up. Then the sender must stall since there is nowhere to put its data. Any stall reduces throughput, so nfs should try not to write so fast that stalls occur. But stalls should only reduce the throughput by a small percentage, with low variance. To get the above large variance from stalls, there must be a problem restarting promptly after a stall. > Just to confirm. Did you run a test using the new nfs client > with rsize=32768,wsize=32768 mount options, so the I/O size is > the same as with the old client? I also tested with udp. udp tends to be faster iff there are no lost packets, and my LAN rarely loses packets. With very old nfs clients, there are different bugs affecting udp and tcp that give very confusing differences for different packet sizes. Another detail that I never understood is that rsize != wsize generally works worse that rsize == wsize, even in the direction that you think you are optimizing by increasing of decreasing the size for. >>>> We >>>> tend >>>> to deal with a huge number of tiny files (several KB in size). The >>>> NFS >>>> server has been running 9.0 for some time (prior to the client >>>> upgrade) >>>> without any issue. NFS is served from a zpool, backed by a Dell >>>> MD3000, >>>> populated with 15k SAS disks. Clients and server are connected >>>> with >>>> Gig-E >>>> links. The general hardware configuration has not changed in >>>> nearly >>>> 3 >>>> years. I mainly tested throughput for large files. For small files, I optimize for latency instead of throughput by reducing interrupt moderation as much as possible. My LAN mostly has (ping) latencies of 100 usec when undoaded, but I can get this down to 50-60 by tuning. 
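The per-second polling Bruce describes can be done with netstat's interval mode on the client and iostat on the server; the interface name below is a placeholder:

  # one line per second of input/output packets and bytes; a stall shows
  # up as the byte counters dropping to near zero for a second or more
  netstat -w 1 -I em0
  # the disk side of the same picture, on the server
  iostat -x -w 1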
>>>> As an example of the performance difference, here is some of the >>>> testing >>>> I >>>> did while troubleshooting. Given a directory containing 5671 zip >>>> files, >>>> with an average size of 15KB. I append all files to an existing >>>> zip >>>> file. >>>> Using the newnfs mount, I found that this operation generally >>>> takes >>>> ~30 >>>> seconds (wall time). Switching the mount to oldnfs resulted in the >>>> same >>>> operation taking ~10 seconds. Mixed file sizes exercise the buffer cache fragmentation. I think that if most are < 16K, buffers of size <= 16K are allocated for them (unless nfs always allocates its r or w size). The worst case is if you have all buffers in use, with each having 16K of kva. Then to get a 32K buffer, the system has to free 2 contiguous 16K ones (with the first on a 32K boundary). It has to do extra vm searching and vm remapping operations for this, compared with using 16K buffers throughout -- then the system just frees the LRU buffer and uses it with its kva mapping unchanged. Usually the search succeeds and just takes more CPU, but sometimes it fails and the system has to sleep waiting for kva. The sleep message for this stall is "nbufkv". In other tests, I see the buffer cache stalling for several seconds (with high variance) when writing to slow media like dvds. The buffer cache certainly fills up in this cases, but it seems suboptimal to stall for several seconds waiting for a single buffer (I think it means that all buffers are in use for writing and the disk hardware doesn't report any completions for several seconds). Stalling for a much shorter time more often would be little different for throughput to a dvd, but better for nfs. Summary: I think reducing the block size should fix your problem, but larger block sizes shouldn't work so badly and shouldn't be the default when they do. 
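Expressed as a persistent mount, Bruce's suggestion of a smaller block size would look like the fstab line below; 16K is only an illustrative value, and the right size depends on the workload:

  # /etc/fstab
  server:/array/test  /test  nfs  rw,rsize=16384,wsize=16384  0  0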
Bruce From owner-freebsd-fs@FreeBSD.ORG Wed Oct 24 19:36:10 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4C5345D8 for ; Wed, 24 Oct 2012 19:36:10 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.org (mail.yamagi.org [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id D073F8FC17 for ; Wed, 24 Oct 2012 19:36:09 +0000 (UTC) Received: from happy.home.yamagi.org (hmbg-4d06dee4.pool.mediaWays.net [77.6.222.228]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.org (Postfix) with ESMTPSA id 6CAB41666312; Wed, 24 Oct 2012 21:36:07 +0200 (CEST) Date: Wed, 24 Oct 2012 21:36:02 +0200 From: Yamagi Burmeister To: rmacklem@uoguelph.ca Subject: Re: Can not read from ZFS exported over NFSv4 but write to it Message-Id: <20121024213602.b727c557f0332f28a66f87cc@yamagi.org> In-Reply-To: <1579346453.2736080.1351029315835.JavaMail.root@erie.cs.uoguelph.ca> References: <20121023204623.a1eef4f99b5f786050229b6c@yamagi.org> <1579346453.2736080.1351029315835.JavaMail.root@erie.cs.uoguelph.ca> X-Mailer: Sylpheed 3.2.0 (GTK+ 2.24.6; amd64-portbld-freebsd9.0) Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Wed__24_Oct_2012_21_36_02_+0200_X.MkMC0OSo=8XO+b" Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Oct 2012 19:36:10 -0000 --Signature=_Wed__24_Oct_2012_21_36_02_+0200_X.MkMC0OSo=8XO+b Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hello On Tue, 23 Oct 2012 17:55:15 -0400 (EDT) Rick Macklem wrote: > > # ZFS > > /usr/home/yamagi > > # UFS > > /mnt > > V4: / -sec=3Dsys 192.168.0.13 > >=20 > For ZFS, all volumes down to yamagi must be exported. > You don't show what your ZFS setup is, but you either > need to export "home" and "usr" if those are ZFS volumes. > (The above /etc/exports would be ok, only if /, /usr and > /home are all UFS volumes and /usr/home/yamagi is the root > of a ZFS volume.) For UFS, non-exported volumes can be > traversed by "mount", but for ZFS that is not the case. >=20 > The only way I know of to fix this inconsistency is to > disable the traversal capability for UFS, but that would > be a POLA violation, so the inconsistency (caused by ZFS > checking exports itself instead of leaving to the VFS layer) > remains. >=20 > OR > you can specify the root of V4 in the exported volume. > For example, you could: > # ZFS > /usr/home/yamagi > V4: /usr/home/yamagi -sec=3Dsys 192.168.0.13 >=20 > And then the client mount would be: > a:/ on /mnt > since "/" would be at /usr/home/yamagi. (If you do this, > the /mnt UFS volume wouldn't be mountable via NFSv4.) Okay, I didn't know that. What about adding a small notice to the nfsv4 (4) manpage to put users into the right direction? A correct /etc/exports didn't solve the problem. So I took some tcpdumps, while analyzing them I noticed that packages send by client never arived at the server. After I changed the NIC (I was using a rather cheap age(4) onboard NIC) everything worked okay. Apparently NFSv4 exhibited a bug in the driver that never showed up before. I'm sorry that i've wasted your time. 
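The capture Yamagi describes -- tracing on both ends and looking for requests that leave the client but never reach the server -- looks roughly like this; the interface names and the server address are placeholders, 2049 is the NFS port:

  # on the client (box B), age(4) interface assumed
  tcpdump -i age0 -s 0 -w client.pcap host 192.168.0.1 and port 2049
  # on the server (box A), filtering on the client's address
  tcpdump -i em0 -s 0 -w server.pcap host 192.168.0.13 and port 2049
  # then compare the two files (e.g. in wireshark) and look for RPCs
  # present in client.pcap that never appear in server.pcap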
Thanks again, Yamagi --=20 Homepage: www.yamagi.org XMPP: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB --Signature=_Wed__24_Oct_2012_21_36_02_+0200_X.MkMC0OSo=8XO+b Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlCIQyYACgkQWTjlg++8y8vZ4wCePa+YOukPlFNXzexcRhcDSExW FXUAmwWs0HE4FSOHLjbnGU6BSgqLxVXl =0fl6 -----END PGP SIGNATURE----- --Signature=_Wed__24_Oct_2012_21_36_02_+0200_X.MkMC0OSo=8XO+b-- From owner-freebsd-fs@FreeBSD.ORG Wed Oct 24 23:56:56 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C44F32B9 for ; Wed, 24 Oct 2012 23:56:56 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 7CC858FC14 for ; Wed, 24 Oct 2012 23:56:56 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAOJ+iFCDaFvO/2dsb2JhbABEhhS9A4IeAQEEASMmMAUWDgoCAg0ZAlkGE4d+BqoZknOBIIpBhVqBEwOVc5A5gwuBfQ X-IronPort-AV: E=Sophos;i="4.80,643,1344225600"; d="scan'208";a="185108569" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 24 Oct 2012 19:56:48 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id A4DF1B405C; Wed, 24 Oct 2012 19:56:48 -0400 (EDT) Date: Wed, 24 Oct 2012 19:56:48 -0400 (EDT) From: Rick Macklem To: Yamagi Burmeister Message-ID: <1319632362.2803236.1351123008648.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20121024213602.b727c557f0332f28a66f87cc@yamagi.org> Subject: Re: Can not read from ZFS exported over NFSv4 but write to it MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Oct 2012 23:56:56 -0000 Yamagi Burmeister wrote: > Hello > > On Tue, 23 Oct 2012 17:55:15 -0400 (EDT) > Rick Macklem wrote: > > > > # ZFS > > > /usr/home/yamagi > > > # UFS > > > /mnt > > > V4: / -sec=sys 192.168.0.13 > > > > > For ZFS, all volumes down to yamagi must be exported. > > You don't show what your ZFS setup is, but you either > > need to export "home" and "usr" if those are ZFS volumes. > > (The above /etc/exports would be ok, only if /, /usr and > > /home are all UFS volumes and /usr/home/yamagi is the root > > of a ZFS volume.) For UFS, non-exported volumes can be > > traversed by "mount", but for ZFS that is not the case. > > > > The only way I know of to fix this inconsistency is to > > disable the traversal capability for UFS, but that would > > be a POLA violation, so the inconsistency (caused by ZFS > > checking exports itself instead of leaving to the VFS layer) > > remains. > > > > OR > > you can specify the root of V4 in the exported volume. > > For example, you could: > > # ZFS > > /usr/home/yamagi > > V4: /usr/home/yamagi -sec=sys 192.168.0.13 > > > > And then the client mount would be: > > a:/ on /mnt > > since "/" would be at /usr/home/yamagi. (If you do this, > > the /mnt UFS volume wouldn't be mountable via NFSv4.) > > Okay, I didn't know that. 
What about adding a small notice to the > nfsv4 > (4) manpage to put users into the right direction? > Yep, both nfsv4(4) and exports(5) should be fixed for this. (The current man pages were written for non-ZFS cases, because I didn't realize ZFS would be different;-) > A correct /etc/exports didn't solve the problem. So I took some > tcpdumps, while analyzing them I noticed that packages send by client > never arived at the server. After I changed the NIC (I was using a > rather cheap age(4) onboard NIC) everything worked okay. Apparently > NFSv4 exhibited a bug in the driver that never showed up before. Yep, a majority of NFS issues that I've looked at (in particular ones related to terrible performance) have been a network fabric problem and most often the NIC/NIC driver. One common area of difficulties is TSO, so if you wanted to, you could try age(4) again, but with TSO disabled. > I'm > sorry that i've wasted your time. > Not at all wasted, except that my suggestion didn't help;-) Glad you figured it out and let us know. Have fun with it, rick > Thanks again, > Yamagi > > -- > Homepage: www.yamagi.org > XMPP: yamagi@yamagi.org > GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-fs@FreeBSD.ORG Thu Oct 25 02:03:54 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 45D12A08 for ; Thu, 25 Oct 2012 02:03:54 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pa0-f54.google.com (mail-pa0-f54.google.com [209.85.220.54]) by mx1.freebsd.org (Postfix) with ESMTP id 0DAF68FC0A for ; Thu, 25 Oct 2012 02:03:53 +0000 (UTC) Received: by mail-pa0-f54.google.com with SMTP id bi1so839127pad.13 for ; Wed, 24 Oct 2012 19:03:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=qdCWmp5nga+mY1Vi1DmJlF6jjWTZUhCXldPJ2M/ymEA=; b=KTFqYM9dE+kQ5q+eDeU7DbRSeZ4lXAo/EIwIHrBAR3Kd7IoxZKZqEl2r3noz7BwgoS YbBlUpIH2XnprPGCNqoIKiK2T/k2Ds54P6DD5u3PJGRFFJ73MW3GFm0CeqeD1daZbG9K Q9ofADDQHNzLPqpIzLm6PFqJJeaa1zHkHjvZ6+Vi66mXoLlqOee6+rYN3vMVnSHNZ/6L PdVPSW8q8CamDwvxYY7aRnaczHfqmqjvzNL3QIFlIGjCXtWOxG3FxDaKPnLx/0R72RGk IKRk8ZxlFJKj6Kdy/u1ha/B8NUQTVHcGibnCEkB5xMvVJBgLGE7bHSyAF90ax1U0iOv4 M+BQ== Received: by 10.68.226.167 with SMTP id rt7mr55755186pbc.94.1351130633326; Wed, 24 Oct 2012 19:03:53 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. 
[114.111.62.249]) by mx.google.com with ESMTPS id pf4sm9492879pbc.38.2012.10.24.19.03.49 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 24 Oct 2012 19:03:51 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Thu, 25 Oct 2012 11:03:44 -0700 From: YongHyeon PYUN Date: Thu, 25 Oct 2012 11:03:44 -0700 To: Yamagi Burmeister Subject: Re: Can not read from ZFS exported over NFSv4 but write to it Message-ID: <20121025180344.GC3267@michelle.cdnetworks.com> References: <20121023204623.a1eef4f99b5f786050229b6c@yamagi.org> <1579346453.2736080.1351029315835.JavaMail.root@erie.cs.uoguelph.ca> <20121024213602.b727c557f0332f28a66f87cc@yamagi.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20121024213602.b727c557f0332f28a66f87cc@yamagi.org> User-Agent: Mutt/1.4.2.3i Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Oct 2012 02:03:54 -0000 On Wed, Oct 24, 2012 at 09:36:02PM +0200, Yamagi Burmeister wrote: > Hello > > On Tue, 23 Oct 2012 17:55:15 -0400 (EDT) > Rick Macklem wrote: > > > > # ZFS > > > /usr/home/yamagi > > > # UFS > > > /mnt > > > V4: / -sec=sys 192.168.0.13 > > > > > For ZFS, all volumes down to yamagi must be exported. > > You don't show what your ZFS setup is, but you either > > need to export "home" and "usr" if those are ZFS volumes. > > (The above /etc/exports would be ok, only if /, /usr and > > /home are all UFS volumes and /usr/home/yamagi is the root > > of a ZFS volume.) For UFS, non-exported volumes can be > > traversed by "mount", but for ZFS that is not the case. > > > > The only way I know of to fix this inconsistency is to > > disable the traversal capability for UFS, but that would > > be a POLA violation, so the inconsistency (caused by ZFS > > checking exports itself instead of leaving to the VFS layer) > > remains. > > > > OR > > you can specify the root of V4 in the exported volume. > > For example, you could: > > # ZFS > > /usr/home/yamagi > > V4: /usr/home/yamagi -sec=sys 192.168.0.13 > > > > And then the client mount would be: > > a:/ on /mnt > > since "/" would be at /usr/home/yamagi. (If you do this, > > the /mnt UFS volume wouldn't be mountable via NFSv4.) > > Okay, I didn't know that. What about adding a small notice to the nfsv4 > (4) manpage to put users into the right direction? > > A correct /etc/exports didn't solve the problem. So I took some > tcpdumps, while analyzing them I noticed that packages send by client > never arived at the server. After I changed the NIC (I was using a > rather cheap age(4) onboard NIC) everything worked okay. Apparently > NFSv4 exhibited a bug in the driver that never showed up before. I'm age(4) is cheap and consumer grade controller but it shows good performance on various network loads. It's much better choice than using other cheap controllers. Would you show me dmesg output(age(4) and atphy(4) only)? And try disabling TSO or TX checksum offloading and see whether that makes any difference. I remember age(4) had a 64bit DMA bug but it was fixed long time ago. > sorry that i've wasted your time. 
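Both suggestions above come down to turning off the offload features on the age(4) interface and retrying; a sketch, with the unit number assumed to be 0:

  # one-off test: disable TSO and TX checksum offloading
  ifconfig age0 -tso -txcsum
  # the dmesg lines Pyun asked for
  dmesg | egrep 'age0|atphy0'
  # persistent form: append the flags to the existing ifconfig_age0 line
  # in /etc/rc.conf, e.g.  ifconfig_age0="DHCP -tso -txcsum"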
> > Thanks again, > Yamagi From owner-freebsd-fs@FreeBSD.ORG Thu Oct 25 15:34:13 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3555AFAA for ; Thu, 25 Oct 2012 15:34:13 +0000 (UTC) (envelope-from freebsd@pki2.com) Received: from btw.pki2.com (btw.pki2.com [IPv6:2001:470:a:6fd::2]) by mx1.freebsd.org (Postfix) with ESMTP id EC3628FC08 for ; Thu, 25 Oct 2012 15:34:12 +0000 (UTC) Received: from [127.0.0.1] (localhost [127.0.0.1]) by btw.pki2.com (8.14.5/8.14.5) with ESMTP id q9PFY1MU006537; Thu, 25 Oct 2012 08:34:01 -0700 (PDT) (envelope-from freebsd@pki2.com) Subject: stable/9 + ZFS IS NOT ready, me thinks From: Dennis Glatting To: freebsd-fs@freebsd.org Content-Type: text/plain; charset="ISO-8859-1" Date: Thu, 25 Oct 2012 08:34:01 -0700 Message-ID: <1351179241.12775.34.camel@btw.pki2.com> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-yoursite-MailScanner-Information: Dennis Glatting X-yoursite-MailScanner-ID: q9PFY1MU006537 X-yoursite-MailScanner: Found to be clean X-MailScanner-From: freebsd@pki2.com X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Oct 2012 15:34:13 -0000 At least that is what I suspect. As I have previously mentioned, I have five servers with stable/9 running ZFS. Four are AMD systems (similar but not identical) and the fifth Intel. The AMD systems are the workhorses. The AMDs have a long history of stalling under load. Specifically, the kernel, keyboard, display, and network I/O are still there, but the disks are stalled across all volumes, arrays, and disks (e.g., if I enter a command not on the disks, such as on a memory disk, and statically linked, the command will run, otherwise the command DOES NOT run). Over the last week I changed operating systems on two of these systems. System #1 I downgraded to stable/8. System #3 I installed CentOS 6.3 ZFS-on-Linux (ZoL). These two systems have been running the same job (2d17h on the first and 3d on the second) without trouble. Previously System #1 would have within 48 hours, typically less than 12, and System #3 would spontaneously reboot whenever I tried to send a data set via "zfs send" to it. On System #1 I found one of the OS disks, a hardware RAID1 array, was toast. I found and replaced that disk before I installed 8.3. You can argue the problem with stable/9 was that disk but I don't believe it because I have the SAME problem across all four systems. When a new set of disks arrive I plan to re-introduce stable/9 to that system to see if the faulting returns. Also, smartd says I need to update the firmware in some of my disks, which I plan to do this weekend (below). Under ZoL and 8.3 the systems are more responsive than stable/9. For example, a "ls" of the busy data set returns data MUCH more quickly under ZoL and 8.3. Under stable/9 it sputters out the data. 
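If anyone wants to put rough numbers on that rather than take my word for it, timing a recursive listing of the busy dataset is enough to show the difference (the path below is only an example, substitute whatever dataset is taking the writes):

mc# /usr/bin/time -h ls -lR /disk-1/some-busy-dataset > /dev/null

Same dataset and hardware, only the OS changed.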
Here is the current load on System #1: mc# top last pid: 53918; load averages: 73.73, 73.08, 72.81 up 2+17:58:24 08:16:47 61 processes: 10 running, 51 sleeping CPU: 11.4% user, 46.0% nice, 42.6% system, 0.1% interrupt, 0.0% idle Mem: 702M Active, 1003M Inact, 35G Wired, 160K Cache, 88M Buf, 88G Free ARC: 32G Total, 3594M MRU, 27G MFU, 32M Anon, 581M Header, 562M Other Swap: 233G Total, 233G Free mc# zpool list NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT disk-1 16.2T 6.57T 9.68T 40% 1.33x ONLINE - disk-2 3.62T 3.63G 3.62T 0% 1.00x ONLINE - All of the data is going onto disk-1 which had under 10GB when I started the job. Here is System #3, running the same job but has only 25% of the cores as System #1: [root@rotfl ~]# top top - 08:19:13 up 3 days, 16:13, 7 users, load average: 94.61, 94.57, 100.94 Tasks: 710 total, 10 running, 700 sleeping, 0 stopped, 0 zombie Cpu(s): 13.3%us, 4.4%sy, 82.2%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st Mem: 65951592k total, 39561920k used, 26389672k free, 154372k buffers Swap: 134217720k total, 0k used, 134217720k free, 377996k cached [root@rotfl ~]# zpool list NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT disk-1 16.2T 6.72T 9.53T 41% 1.00x ONLINE - disk-2 1.81T 3.24G 1.81T 0% 1.00x ONLINE - Like System #1, the data is going to disk-1 which also had less than 10GB when started. I am working on getting many TB of data off one of the remaining two stable/9 systems for more experimentation but the system stalls, which makes the process a bit cumbersome. I strongly suspect a contributing factor is the system cron scripts that run at night. Finally, as I have also previously mentioned, I am NOT the only one having this problem. One individual stated that he did update his BIOS, his controller firmware, and disk firmware but that didn't help. I am happy to work with FreeBSD component knowledgeable folks but only one stepped forward. 
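Next time one of the remaining stable/9 boxes stalls I will try to grab kernel stacks for the stuck processes before rebooting, something along the lines of:

mc# procstat -kk -a > /mem/stacks.txt
mc# ps axlww >> /mem/stacks.txt

(the /mem path assumes a memory-backed filesystem so nothing touches the stalled pools, and I have not verified that procstat itself still runs once the disks wedge). If that works I will post the output here.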
From owner-freebsd-fs@FreeBSD.ORG Thu Oct 25 17:17:52 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CC8D2D46 for ; Thu, 25 Oct 2012 17:17:52 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.org (mail.yamagi.org [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 590F28FC08 for ; Thu, 25 Oct 2012 17:17:52 +0000 (UTC) Received: from happy.home.yamagi.org (hmbg-4d06f908.pool.mediaWays.net [77.6.249.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.org (Postfix) with ESMTPSA id A384E1666312; Thu, 25 Oct 2012 19:17:50 +0200 (CEST) Date: Thu, 25 Oct 2012 19:17:45 +0200 From: Yamagi Burmeister To: pyunyh@gmail.com Subject: Re: Can not read from ZFS exported over NFSv4 but write to it Message-Id: <20121025191745.7f6a7582d4401de467d3fe18@yamagi.org> In-Reply-To: <20121025180344.GC3267@michelle.cdnetworks.com> References: <20121023204623.a1eef4f99b5f786050229b6c@yamagi.org> <1579346453.2736080.1351029315835.JavaMail.root@erie.cs.uoguelph.ca> <20121024213602.b727c557f0332f28a66f87cc@yamagi.org> <20121025180344.GC3267@michelle.cdnetworks.com> X-Mailer: Sylpheed 3.2.0 (GTK+ 2.24.6; amd64-portbld-freebsd9.0) Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Thu__25_Oct_2012_19_17_45_+0200_cvMMz+uEE_Kf6=kL" Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Oct 2012 17:17:53 -0000 --Signature=_Thu__25_Oct_2012_19_17_45_+0200_cvMMz+uEE_Kf6=kL Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hello :) On Thu, 25 Oct 2012 11:03:44 -0700 YongHyeon PYUN wrote: > age(4) is cheap and consumer grade controller but it shows > good performance on various network loads. It's much better choice > than using other cheap controllers. >=20 > Would you show me dmesg output(age(4) and atphy(4) only)? > And try disabling TSO or TX checksum offloading and see whether > that makes any difference. > I remember age(4) had a 64bit DMA bug but it was fixed long > time ago. Yeah I was the one who reported it. This is the same machine... If disabled TSO and it seems to work. I've copied ~10GB from the client to the server and the other way round without any problem. The dmesg (with verbose boot enabled) is: age0: mem 0xfeac0000-0xfeafffff irq 18 at device 0.0 on pci2 age0: PCI device revision : 0x00b0 age0: Chip id/revision : 0x9006 age0: 1280 Tx FIFO, 2364 Rx FIFO age0: MSIX count : 0 age0: MSI count : 1 age0: attempting to allocate 1 MSI vectors (1 supported) msi: routing MSI IRQ 256 to local APIC 0 vector 48 age0: using IRQ 256 for MSI age0: Using 1 MSI messages. age0: Read request size : 512 bytes. age0: TLP payload size : 128 bytes. age0: 4GB boundary crossed, switching to 32bit DMA addressing mode. age0: PCI VPD capability not found! miibus0: on age0 atphy0: PHY 0 on miibus0 atphy0: OUI 0x00c82e, model 0x0001, rev. 5 atphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 1000baseT-FDX-master, auto age0: bpf attached age0: Ethernet address: 00:23:54:31:a0:12 pci3: driver added age0: link state changed to DOWN age0: interrupt moderation is 100 us. 
age0: link state changed to UP Ciao, Yamagi --=20 Homepage: www.yamagi.org XMPP: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB --Signature=_Thu__25_Oct_2012_19_17_45_+0200_cvMMz+uEE_Kf6=kL Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlCJdD0ACgkQWTjlg++8y8tx1ACfdN/wQaAegdwxCMISn1/0KAFP YjoAn0+bquUz/FNazSsLKw+e1JeMuUAk =SY0g -----END PGP SIGNATURE----- --Signature=_Thu__25_Oct_2012_19_17_45_+0200_cvMMz+uEE_Kf6=kL-- From owner-freebsd-fs@FreeBSD.ORG Thu Oct 25 19:41:32 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6A82650E for ; Thu, 25 Oct 2012 19:41:32 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 1C3A68FC14 for ; Thu, 25 Oct 2012 19:41:31 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAIyUiVCDaFvO/2dsb2JhbABEhha9KYIeAQEEASMmMAUWDgoCAg0ZAiM2BhOHcgMJBqtJiRYNiVSBIIlaZ4VagRMDlCGBVYsthRCDC4F9 X-IronPort-AV: E=Sophos;i="4.80,650,1344225600"; d="scan'208";a="185231321" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 25 Oct 2012 15:41:30 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 831C9B4069; Thu, 25 Oct 2012 15:41:30 -0400 (EDT) Date: Thu, 25 Oct 2012 15:41:30 -0400 (EDT) From: Rick Macklem To: Yamagi Burmeister Message-ID: <974991789.2863688.1351194090522.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20121025191745.7f6a7582d4401de467d3fe18@yamagi.org> Subject: Re: Can not read from ZFS exported over NFSv4 but write to it MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Oct 2012 19:41:32 -0000 Yamagi Burmeister wrote: > Hello :) > > On Thu, 25 Oct 2012 11:03:44 -0700 > YongHyeon PYUN wrote: > > > age(4) is cheap and consumer grade controller but it shows > > good performance on various network loads. It's much better choice > > than using other cheap controllers. > > > > Would you show me dmesg output(age(4) and atphy(4) only)? > > And try disabling TSO or TX checksum offloading and see whether > > that makes any difference. > > I remember age(4) had a 64bit DMA bug but it was fixed long > > time ago. > > Yeah I was the one who reported it. This is the same machine... > If disabled TSO and it seems to work. I've copied ~10GB from > the client to the server and the other way round without any > problem. > Just in case it might be related, the client will sosend() a segment just a little over 64Kbytes in size for writes. (I vaguely remember a recent thread related to TSO of segments a little over 64Kbytes. I don't know if that issue was specific to one type of network interface.) 
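If you want to test that theory without touching the driver, you could leave TSO enabled but mount with a smaller I/O size so the NFS writes stay well under 64Kbytes, something like:

# mount -t nfs -o nfsv4,rsize=32768,wsize=32768 a:/ /mnt

(using the same a:/ on /mnt layout from the earlier example; adjust for your setup). If the hang goes away with TSO on at 32K but comes back at the default size, that points pretty strongly at the handling of the large segments.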
Good luck with it, rick > The dmesg (with verbose boot enabled) is: > > age0: mem > 0xfeac0000-0xfeafffff irq 18 at device 0.0 on pci2 age0: PCI device > revision : 0x00b0 age0: Chip id/revision : 0x9006 > age0: 1280 Tx FIFO, 2364 Rx FIFO > age0: MSIX count : 0 > age0: MSI count : 1 > age0: attempting to allocate 1 MSI vectors (1 supported) > msi: routing MSI IRQ 256 to local APIC 0 vector 48 > age0: using IRQ 256 for MSI > age0: Using 1 MSI messages. > age0: Read request size : 512 bytes. > age0: TLP payload size : 128 bytes. > age0: 4GB boundary crossed, switching to 32bit DMA addressing mode. > age0: PCI VPD capability not found! > miibus0: on age0 > atphy0: PHY 0 on miibus0 > atphy0: OUI 0x00c82e, model 0x0001, rev. 5 > atphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, > 1000baseT-FDX, 1000baseT-FDX-master, auto age0: bpf attached > age0: Ethernet address: 00:23:54:31:a0:12 > pci3: driver added > age0: link state changed to DOWN > age0: interrupt moderation is 100 us. > age0: link state changed to UP > > Ciao, > Yamagi > > -- > Homepage: www.yamagi.org > XMPP: yamagi@yamagi.org > GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-fs@FreeBSD.ORG Thu Oct 25 20:18:47 2012 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AB961D31; Thu, 25 Oct 2012 20:18:47 +0000 (UTC) (envelope-from crees@FreeBSD.org) Received: from freefall.freebsd.org (freefall.FreeBSD.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id 7C0FB8FC12; Thu, 25 Oct 2012 20:18:47 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9PKIloB000708; Thu, 25 Oct 2012 20:18:47 GMT (envelope-from crees@freefall.freebsd.org) Received: (from crees@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9PKIles000704; Thu, 25 Oct 2012 20:18:47 GMT (envelope-from crees) Date: Thu, 25 Oct 2012 20:18:47 GMT Message-Id: <201210252018.q9PKIles000704@freefall.freebsd.org> To: crees@FreeBSD.org, freebsd-rc@FreeBSD.org, freebsd-fs@FreeBSD.org From: crees@FreeBSD.org Subject: Re: conf/144213: [rc.d] [patch] Disappearing zvols on reboot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Oct 2012 20:18:47 -0000 Synopsis: [rc.d] [patch] Disappearing zvols on reboot Responsible-Changed-From-To: freebsd-rc->freebsd-fs Responsible-Changed-By: crees Responsible-Changed-When: Thu Oct 25 20:18:39 UTC 2012 Responsible-Changed-Why: Do you guys agree with this analysis? 
http://www.freebsd.org/cgi/query-pr.cgi?pr=144213 From owner-freebsd-fs@FreeBSD.ORG Thu Oct 25 21:21:28 2012 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9DE35E9E; Thu, 25 Oct 2012 21:21:28 +0000 (UTC) (envelope-from prvs=164559abdc=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 32B7B8FC0A; Thu, 25 Oct 2012 21:21:24 +0000 (UTC) Received: from r2d2 ([188.220.16.49]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50000831393.msg; Thu, 25 Oct 2012 22:21:15 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Thu, 25 Oct 2012 22:21:15 +0100 (not processed: message from valid local sender) X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=164559abdc=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: <3E6135761A1D4A10A573A86B4E76B8BB@multiplay.co.uk> From: "Steven Hartland" To: , , References: <201210252018.q9PKIles000704@freefall.freebsd.org> Subject: Re: conf/144213: [rc.d] [patch] Disappearing zvols on reboot Date: Thu, 25 Oct 2012 22:21:03 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Oct 2012 21:21:28 -0000 I may be missing something but there is no "zfs volinit" here on 8.3-RELEASE. I've also tested rebooting with a zvol and its present after reboot. Checked HEAD no volinit in zfs_main.c either, so I suspect this is an old issue no longer present. ----- Original Message ----- From: To: ; ; Sent: Thursday, October 25, 2012 9:18 PM Subject: Re: conf/144213: [rc.d] [patch] Disappearing zvols on reboot > Synopsis: [rc.d] [patch] Disappearing zvols on reboot > > Responsible-Changed-From-To: freebsd-rc->freebsd-fs > Responsible-Changed-By: crees > Responsible-Changed-When: Thu Oct 25 20:18:39 UTC 2012 > Responsible-Changed-Why: > Do you guys agree with this analysis? > > http://www.freebsd.org/cgi/query-pr.cgi?pr=144213 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. 
From owner-freebsd-fs@FreeBSD.ORG Thu Oct 25 21:23:27 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4A0F72B9; Thu, 25 Oct 2012 21:23:27 +0000 (UTC) (envelope-from utisoft@gmail.com) Received: from mail-bk0-f54.google.com (mail-bk0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 9F9C28FC16; Thu, 25 Oct 2012 21:23:26 +0000 (UTC) Received: by mail-bk0-f54.google.com with SMTP id jf20so1049333bkc.13 for ; Thu, 25 Oct 2012 14:23:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=IoX8pBcWAJUJqYBSmNYFTD+/nZQUcaiy52iDB63ZjQ0=; b=kyRjquWOOvGkjjuC2k8uCkYgtULRJ7RKyDd4KziOzs2lqAsmGryzhZxh5OKCPJPfiv YIcoyQeHO1zYApkP/iKPHYd5kEu+Jw2iaRF1e9G8Z9uHvUcHo0fVX6mIikeiU43MDAXp 8Lv2nBBX2Hch2y7BvXIj9BBLJZnehrBDjiD00SkXFO2maL0eN5UxWb0Ky+Swzs2lEOhT F8nRPDKTzXvsPVIlTF8HqJ0PwGLQevtWDXd9z2Xr1YmH2XolJFx2KkrK5yn8Yt8p3rBc Z2NTMuaDq2fVDtKOCT/ToQGPBr4y9hH9fj/Xfp5dvuhem4miROChg3bBk13U8O+9oqXP OJJw== Received: by 10.204.156.74 with SMTP id v10mr6635761bkw.39.1351200205806; Thu, 25 Oct 2012 14:23:25 -0700 (PDT) MIME-Version: 1.0 Sender: utisoft@gmail.com Received: by 10.204.50.197 with HTTP; Thu, 25 Oct 2012 14:22:55 -0700 (PDT) In-Reply-To: <3E6135761A1D4A10A573A86B4E76B8BB@multiplay.co.uk> References: <201210252018.q9PKIles000704@freefall.freebsd.org> <3E6135761A1D4A10A573A86B4E76B8BB@multiplay.co.uk> From: Chris Rees Date: Thu, 25 Oct 2012 22:22:55 +0100 X-Google-Sender-Auth: WemdCWyV124dqLcigaJsUwY4Bh4 Message-ID: Subject: Re: conf/144213: [rc.d] [patch] Disappearing zvols on reboot To: Steven Hartland Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org, freebsd-rc@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Oct 2012 21:23:27 -0000 On 25 October 2012 22:21, Steven Hartland wrote: > I may be missing something but there is no "zfs volinit" here on > 8.3-RELEASE. I've also tested rebooting with a zvol and its present > after reboot. > > Checked HEAD no volinit in zfs_main.c either, so I suspect this is > an old issue no longer present. I agree too. I'll close this if no-one else speaks up. Thanks! 
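(For anyone who wants to double-check on their own release before this gets closed, the test is quick enough -- pool and volume names here are just placeholders:

# zfs create -V 1g tank/testvol
# reboot
# ls /dev/zvol/tank/testvol

If the device node is still there after the reboot, the problem described in the PR isn't biting you.)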
Chris From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 01:38:57 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 09C2ED58 for ; Fri, 26 Oct 2012 01:38:57 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-da0-f54.google.com (mail-da0-f54.google.com [209.85.210.54]) by mx1.freebsd.org (Postfix) with ESMTP id C2B6D8FC0C for ; Fri, 26 Oct 2012 01:38:56 +0000 (UTC) Received: by mail-da0-f54.google.com with SMTP id z9so1125078dad.13 for ; Thu, 25 Oct 2012 18:38:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=dLaATWJ20Q/+J86Jz6O7EPB1m4YpK1VdRuZQVBnPfxE=; b=V1NonOilYTj2AQcfGaB+s4/jM/ABdk3YKVkaq87gr3fxs/rjVPMwEtETofQdWUjide rzEUHPRh+wMcrEo5j+KVlXe2Xx5VMjl9R5tG86HW+ivz/oIdn5xYp3AZGWVROw0UjLV6 EcYbMS7UgjCDftXSUYhkHymPeYLdZuHcUYUzqFSvC3RgylGLrQ+V0MdZjYzkKEt0u+y/ Hq8mWDa8lLvARZWiFRACoMBbXNaZ8ujB8DuDRh7mrvkTURiaK31w0La1PuiNF3s4yhej QQTPEjbyKK22t71qQHIXq4IQVbvIghrvO8c0OsXrgDgi9L0AijrpohsMnaNxmSm1UNgT 2Qow== Received: by 10.66.76.98 with SMTP id j2mr57700833paw.65.1351215536075; Thu, 25 Oct 2012 18:38:56 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPS id a10sm95649paz.35.2012.10.25.18.38.52 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 25 Oct 2012 18:38:54 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Fri, 26 Oct 2012 10:38:47 -0700 From: YongHyeon PYUN Date: Fri, 26 Oct 2012 10:38:47 -0700 To: Yamagi Burmeister Subject: Re: Can not read from ZFS exported over NFSv4 but write to it Message-ID: <20121026173847.GA3140@michelle.cdnetworks.com> References: <20121023204623.a1eef4f99b5f786050229b6c@yamagi.org> <1579346453.2736080.1351029315835.JavaMail.root@erie.cs.uoguelph.ca> <20121024213602.b727c557f0332f28a66f87cc@yamagi.org> <20121025180344.GC3267@michelle.cdnetworks.com> <20121025191745.7f6a7582d4401de467d3fe18@yamagi.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="7JfCtLOvnd9MIVvH" Content-Disposition: inline In-Reply-To: <20121025191745.7f6a7582d4401de467d3fe18@yamagi.org> User-Agent: Mutt/1.4.2.3i Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 01:38:57 -0000 --7JfCtLOvnd9MIVvH Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Oct 25, 2012 at 07:17:45PM +0200, Yamagi Burmeister wrote: > Hello :) > > On Thu, 25 Oct 2012 11:03:44 -0700 > YongHyeon PYUN wrote: > > > age(4) is cheap and consumer grade controller but it shows > > good performance on various network loads. It's much better choice > > than using other cheap controllers. > > > > Would you show me dmesg output(age(4) and atphy(4) only)? > > And try disabling TSO or TX checksum offloading and see whether > > that makes any difference. > > I remember age(4) had a 64bit DMA bug but it was fixed long > > time ago. > > Yeah I was the one who reported it. This is the same machine... > If disabled TSO and it seems to work. I've copied ~10GB from > the client to the server and the other way round without any > problem. 
> > The dmesg (with verbose boot enabled) is: > > age0: mem > 0xfeac0000-0xfeafffff irq 18 at device 0.0 on pci2 age0: PCI device > revision : 0x00b0 age0: Chip id/revision : 0x9006 > age0: 1280 Tx FIFO, 2364 Rx FIFO > age0: MSIX count : 0 > age0: MSI count : 1 > age0: attempting to allocate 1 MSI vectors (1 supported) > msi: routing MSI IRQ 256 to local APIC 0 vector 48 > age0: using IRQ 256 for MSI > age0: Using 1 MSI messages. > age0: Read request size : 512 bytes. > age0: TLP payload size : 128 bytes. > age0: 4GB boundary crossed, switching to 32bit DMA addressing mode. > age0: PCI VPD capability not found! > miibus0: on age0 > atphy0: PHY 0 on miibus0 > atphy0: OUI 0x00c82e, model 0x0001, rev. 5 > atphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, > 1000baseT-FDX, 1000baseT-FDX-master, auto age0: bpf attached > age0: Ethernet address: 00:23:54:31:a0:12 > pci3: driver added > age0: link state changed to DOWN > age0: interrupt moderation is 100 us. > age0: link state changed to UP Thanks the info. Would you try attached patch? --7JfCtLOvnd9MIVvH Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="age.tso.diff" Index: sys/dev/age/if_age.c =================================================================== --- sys/dev/age/if_age.c (revision 242114) +++ sys/dev/age/if_age.c (working copy) @@ -1562,8 +1562,12 @@ age_encap(struct age_softc *sc, struct mbuf **m_he *m_head = NULL; return (ENOBUFS); } - ip = (struct ip *)(mtod(m, char *) + ip_off); tcp = (struct tcphdr *)(mtod(m, char *) + poff); + m = m_pullup(m, poff + (tcp->th_off << 2)); + if (m == NULL) { + *m_head = NULL; + return (ENOBUFS); + } /* * L1 requires IP/TCP header size and offset as * well as TCP pseudo checksum which complicates @@ -1578,14 +1582,11 @@ age_encap(struct age_softc *sc, struct mbuf **m_he * Reset IP checksum and recompute TCP pseudo * checksum as NDIS specification said. 
*/ + ip = (struct ip *)(mtod(m, char *) + ip_off); + tcp = (struct tcphdr *)(mtod(m, char *) + poff); ip->ip_sum = 0; - if (poff + (tcp->th_off << 2) == m->m_pkthdr.len) - tcp->th_sum = in_pseudo(ip->ip_src.s_addr, - ip->ip_dst.s_addr, - htons((tcp->th_off << 2) + IPPROTO_TCP)); - else - tcp->th_sum = in_pseudo(ip->ip_src.s_addr, - ip->ip_dst.s_addr, htons(IPPROTO_TCP)); + tcp->th_sum = in_pseudo(ip->ip_src.s_addr, + ip->ip_dst.s_addr, htons(IPPROTO_TCP)); } *m_head = m; } --7JfCtLOvnd9MIVvH-- From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 04:15:00 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4100BB72 for ; Fri, 26 Oct 2012 04:15:00 +0000 (UTC) (envelope-from amvandemore@gmail.com) Received: from mail-ie0-f182.google.com (mail-ie0-f182.google.com [209.85.223.182]) by mx1.freebsd.org (Postfix) with ESMTP id 0004D8FC0A for ; Fri, 26 Oct 2012 04:14:59 +0000 (UTC) Received: by mail-ie0-f182.google.com with SMTP id k10so4231723iea.13 for ; Thu, 25 Oct 2012 21:14:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=WIk5lVb7qXGhs2H22javQyci3dtKdSljH3jZpHdHqMw=; b=aUJCx1pOirkgVfys/8f7clYnYofDiU94WCo7mklm9D5c9OSFjBdGi03aBozc0B6U4C DDGu8rpimau0rmbqM95CV8Ym7EYEOX+d55IWTMsXHRVPb5RudkfaIUe6PnnoB2j33DfI UDsIJGNMHccSE4Pm1zprYOeJHfwBVs5FkTthGH46RHDgfhkfOXfjbPe65dO4vr5vB5/9 ymgfULqOGa8DMA1kr4JK7cvrwqHNis4ppcCBQxZIWpEECCOM/IYakUuZpzZGM6eX687S +3D3WB5n+zzhGdoLbDPN0sNzGUX+teNteL6Q22hwwJ51g95gEPlyX+BdZKsMQyDLMyRi +O3A== MIME-Version: 1.0 Received: by 10.42.18.193 with SMTP id y1mr18446145ica.0.1351224898891; Thu, 25 Oct 2012 21:14:58 -0700 (PDT) Received: by 10.64.165.2 with HTTP; Thu, 25 Oct 2012 21:14:58 -0700 (PDT) In-Reply-To: <974991789.2863688.1351194090522.JavaMail.root@erie.cs.uoguelph.ca> References: <20121025191745.7f6a7582d4401de467d3fe18@yamagi.org> <974991789.2863688.1351194090522.JavaMail.root@erie.cs.uoguelph.ca> Date: Thu, 25 Oct 2012 23:14:58 -0500 Message-ID: Subject: Re: Can not read from ZFS exported over NFSv4 but write to it From: Adam Vande More To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 04:15:00 -0000 On Thu, Oct 25, 2012 at 2:41 PM, Rick Macklem wrote: > > Just in case it might be related, the client will sosend() a > segment just a little over 64Kbytes in size for writes. (I vaguely > remember a recent thread related to TSO of segments a little over > 64Kbytes. I don't know if that issue was specific to one type of > network interface.) 
> For reference, I believe this is what you were referring to: http://lists.freebsd.org/pipermail/freebsd-net/2012-October/033660.html -- Adam Vande More From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 06:07:40 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 88D12D76 for ; Fri, 26 Oct 2012 06:07:40 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.org (mail.yamagi.org [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 1727B8FC0A for ; Fri, 26 Oct 2012 06:07:39 +0000 (UTC) Received: from happy.home.yamagi.org (hmbg-4d06f908.pool.mediaWays.net [77.6.249.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.org (Postfix) with ESMTPSA id 87A891666312; Fri, 26 Oct 2012 08:07:37 +0200 (CEST) Date: Fri, 26 Oct 2012 08:07:27 +0200 From: Yamagi Burmeister To: pyunyh@gmail.com Subject: Re: Can not read from ZFS exported over NFSv4 but write to it Message-Id: <20121026080727.af2c29a4eb00b6ef0308eeb2@yamagi.org> In-Reply-To: <20121026173847.GA3140@michelle.cdnetworks.com> References: <20121023204623.a1eef4f99b5f786050229b6c@yamagi.org> <1579346453.2736080.1351029315835.JavaMail.root@erie.cs.uoguelph.ca> <20121024213602.b727c557f0332f28a66f87cc@yamagi.org> <20121025180344.GC3267@michelle.cdnetworks.com> <20121025191745.7f6a7582d4401de467d3fe18@yamagi.org> <20121026173847.GA3140@michelle.cdnetworks.com> X-Mailer: Sylpheed 3.2.0 (GTK+ 2.24.6; amd64-portbld-freebsd9.0) Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Fri__26_Oct_2012_08_07_27_+0200_z_YSY_vj_PzkziSC" Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 06:07:40 -0000 --Signature=_Fri__26_Oct_2012_08_07_27_+0200_z_YSY_vj_PzkziSC Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, 26 Oct 2012 10:38:47 -0700 YongHyeon PYUN wrote: =20 > Thanks the info. > Would you try attached patch? Still the same problem. With TSO enabled, the NFS4 mount stalls when writing to it. Without it works as aspected. 
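(In case it matters: TSO on/off here means toggling it on the server side with something like

# ifconfig age0 -tso
# ifconfig age0 tso

and remounting the export on the client between runs; nothing else changes.)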
--=20 Homepage: www.yamagi.org XMPP: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB --Signature=_Fri__26_Oct_2012_08_07_27_+0200_z_YSY_vj_PzkziSC Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlCKKKgACgkQWTjlg++8y8sPmwCgpgRQ4ehcfB/CnEN0yB6pi+02 DS4An0CTf6KSD7KAiLbZe0NzIxTym5Ca =YFnj -----END PGP SIGNATURE----- --Signature=_Fri__26_Oct_2012_08_07_27_+0200_z_YSY_vj_PzkziSC-- From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 06:15:17 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C0FF497 for ; Fri, 26 Oct 2012 06:15:17 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-da0-f54.google.com (mail-da0-f54.google.com [209.85.210.54]) by mx1.freebsd.org (Postfix) with ESMTP id 854DC8FC0A for ; Fri, 26 Oct 2012 06:15:17 +0000 (UTC) Received: by mail-da0-f54.google.com with SMTP id z9so1222378dad.13 for ; Thu, 25 Oct 2012 23:15:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=a/ca8cLoWWft2KRCW1wIAyvITSLw18jfiK/+G/Aa/i8=; b=b/ZvgTDgy0MN3XpGr6Aebf7gP0CpMRsUPZiSgfAv3dMQJqfrLCMjfceVmV2akO2iAX p8MeANdgiqyiiw9J70k01FvrNsnmJcKdvKL7gURBRztD8lQfd/KcbnWGvj3GfJCjUjXa 3Y2Yz+LGrSaLZTsxrdteX0aF3HvFdF7u6/vUEfFPaTHGZbRUGsEeagOoCJwrSY5uwAXN JB2+F2MxUa2tIDwIN4QjYPISFw2G2xiEthBDZUGZ/+4Zp/7kCKeZPT/oqgICXY5Gz58T vP40FJ0+anh7rOymJBHdwlRbDYHKBogWVdLBzJkVV+i9Ikwj0q/Cb4lUr+eT92G85bPi UeNA== Received: by 10.66.74.40 with SMTP id q8mr59375378pav.29.1351232117150; Thu, 25 Oct 2012 23:15:17 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPS id n7sm481527pav.26.2012.10.25.23.15.14 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 25 Oct 2012 23:15:16 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Fri, 26 Oct 2012 15:15:08 -0700 From: YongHyeon PYUN Date: Fri, 26 Oct 2012 15:15:08 -0700 To: Yamagi Burmeister Subject: Re: Can not read from ZFS exported over NFSv4 but write to it Message-ID: <20121026221508.GA1463@michelle.cdnetworks.com> References: <20121023204623.a1eef4f99b5f786050229b6c@yamagi.org> <1579346453.2736080.1351029315835.JavaMail.root@erie.cs.uoguelph.ca> <20121024213602.b727c557f0332f28a66f87cc@yamagi.org> <20121025180344.GC3267@michelle.cdnetworks.com> <20121025191745.7f6a7582d4401de467d3fe18@yamagi.org> <20121026173847.GA3140@michelle.cdnetworks.com> <20121026080727.af2c29a4eb00b6ef0308eeb2@yamagi.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20121026080727.af2c29a4eb00b6ef0308eeb2@yamagi.org> User-Agent: Mutt/1.4.2.3i Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 06:15:17 -0000 On Fri, Oct 26, 2012 at 08:07:27AM +0200, Yamagi Burmeister wrote: > On Fri, 26 Oct 2012 10:38:47 -0700 > YongHyeon PYUN wrote: > > > Thanks the info. > > Would you try attached patch? > > Still the same problem. With TSO enabled, the NFS4 mount > stalls when writing to it. Without it works as aspected. I have no longer access to age(4) controller so it's hard to verify the issue on my box. 
Can you post an URL for captured packets on both sender(age(4)) and receiver side with tcpdump? > > -- > Homepage: www.yamagi.org > XMPP: yamagi@yamagi.org > GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 06:28:47 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8AF7577A for ; Fri, 26 Oct 2012 06:28:47 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.org (mail.yamagi.org [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 3F37E8FC08 for ; Fri, 26 Oct 2012 06:28:47 +0000 (UTC) Received: from happy.home.yamagi.org (hmbg-4d06f908.pool.mediaWays.net [77.6.249.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.org (Postfix) with ESMTPSA id 02D8B1666312; Fri, 26 Oct 2012 08:28:45 +0200 (CEST) Date: Fri, 26 Oct 2012 08:28:40 +0200 From: Yamagi Burmeister To: pyunyh@gmail.com Subject: Re: Can not read from ZFS exported over NFSv4 but write to it Message-Id: <20121026082840.0e909c4ea1c694ac99ef39d9@yamagi.org> In-Reply-To: <20121026221508.GA1463@michelle.cdnetworks.com> References: <20121023204623.a1eef4f99b5f786050229b6c@yamagi.org> <1579346453.2736080.1351029315835.JavaMail.root@erie.cs.uoguelph.ca> <20121024213602.b727c557f0332f28a66f87cc@yamagi.org> <20121025180344.GC3267@michelle.cdnetworks.com> <20121025191745.7f6a7582d4401de467d3fe18@yamagi.org> <20121026173847.GA3140@michelle.cdnetworks.com> <20121026080727.af2c29a4eb00b6ef0308eeb2@yamagi.org> <20121026221508.GA1463@michelle.cdnetworks.com> X-Mailer: Sylpheed 3.2.0 (GTK+ 2.24.6; amd64-portbld-freebsd9.0) Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Fri__26_Oct_2012_08_28_40_+0200_MIt+XRAKbFxDAVJI" Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 06:28:47 -0000 --Signature=_Fri__26_Oct_2012_08_28_40_+0200_MIt+XRAKbFxDAVJI Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, 26 Oct 2012 15:15:08 -0700 YongHyeon PYUN wrote: > On Fri, Oct 26, 2012 at 08:07:27AM +0200, Yamagi Burmeister wrote: > > On Fri, 26 Oct 2012 10:38:47 -0700 > > YongHyeon PYUN wrote: > > =20 > > > Thanks the info. > > > Would you try attached patch? > >=20 > > Still the same problem. With TSO enabled, the NFS4 mount > > stalls when writing to it. Without it works as aspected. >=20 > I have no longer access to age(4) controller so it's hard to verify > the issue on my box. > Can you post an URL for captured packets on both sender(age(4)) and > receiver side with tcpdump? http://deponie.yamagi.org/freebsd/misc/age0_pcap.tar.xz sender_age0.pcap is the age0 device on the NFS4 server.=20 receiver_em0.pcap is a em0 NIC on the NFS4 client. Those files were created by: 1. Mount the NFS4 export on the client 2. Try to copy a file onto it. 3. 
Forcefully unmount the NFS4 export on the client --=20 Homepage: www.yamagi.org XMPP: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB --Signature=_Fri__26_Oct_2012_08_28_40_+0200_MIt+XRAKbFxDAVJI Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlCKLZ0ACgkQWTjlg++8y8sNwACg1t4Fsc+WmO/N1UtaXD1e+Kut /1MAoINIF7HT0L+Hfdv+jt+b79XN2rAz =mfVW -----END PGP SIGNATURE----- --Signature=_Fri__26_Oct_2012_08_28_40_+0200_MIt+XRAKbFxDAVJI-- From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 09:00:14 2012 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6F8A4FB9; Fri, 26 Oct 2012 09:00:14 +0000 (UTC) (envelope-from prvs=1646183d11=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id A82068FC1C; Fri, 26 Oct 2012 09:00:12 +0000 (UTC) Received: from r2d2 ([188.220.16.49]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50000835132.msg; Fri, 26 Oct 2012 10:00:11 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Fri, 26 Oct 2012 10:00:11 +0100 (not processed: message from valid local sender) X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=1646183d11=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: <6B3FDA029094479C8FD44EF4B435CBF4@multiplay.co.uk> From: "Steven Hartland" To: "Andriy Gapon" References: <505DF1A3.1020809@FreeBSD.org> <80F518854AE34A759D9441AE1A60D2DC@multiplay.co.uk> <506C4D4F.2090909@FreeBSD.org> Subject: Re: zfs zvol: set geom mediasize right at creation time Date: Fri, 26 Oct 2012 10:00:03 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs@FreeBSD.org, freebsd-geom@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 09:00:14 -0000 ----- Original Message ----- From: "Andriy Gapon" > on 23/09/2012 04:04 Steven Hartland said the following: >> Do you know what the effect of the volblocksize change >> will have on a volume who's disk block size changes? >> e.g. via a quirk for a 4k disk being added > > I am not sure that I got your question... > My patch doesn't affect neither volblocksize value nor > disk block size [geom property]. > It changes only stripe size geom property. > >> I ask as we've testing a patch here which changes ashift to >> be based on stripesize instead of sectorsize but in its >> current form it has some odd side effects on pools which >> are boot pools. >> >> Said patch is attached for reference. > > I think that the patch makes sense and would be curious to > learn more about the side-effects. Right finally tracked the problem down. The issue was that opening a pool which has multiple vdev's vdev_open is called where vd->vdev_asize != 0. This means the vd->vdev_ashift calculation is done on existing devices despite the comment:- /* * This is the first-ever open, so use the computed values. * For testing purposes, a higher ashift can be requested. */ Due to the alignment requirement check below this causes the zpool open to fail. 
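For anyone following along who hasn't seen the patch, the core idea (paraphrased from memory, not the exact diff -- the PR below has the real thing) is simply to feed the geom stripe size into the ashift calculation in vdev_geom, roughly:

	*ashift = highbit(MAX(pp->stripesize, pp->sectorsize)) - 1;

instead of deriving it from sectorsize alone, so a 4k quirked disk ends up with ashift=12 rather than 9. The trouble described above is what happens when that calculation gets re-run against an already labelled vdev.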
I've now fixed this by moving the use dashift into the volume label code which is only run on creation. PR with updated patch submitted:- http://www.freebsd.org/cgi/query-pr.cgi?pr=173115 Regards Steve ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 13:30:01 2012 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 98AAE9C2 for ; Fri, 26 Oct 2012 13:30:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.FreeBSD.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id 7FDBD8FC0C for ; Fri, 26 Oct 2012 13:30:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9QDU1pA005957 for ; Fri, 26 Oct 2012 13:30:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9QDU1uP005954; Fri, 26 Oct 2012 13:30:01 GMT (envelope-from gnats) Date: Fri, 26 Oct 2012 13:30:01 GMT Message-Id: <201210261330.q9QDU1uP005954@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: Martin Birgmeier Subject: Re: kern/136865: [nfs] [patch] NFS exports atomic and on-the-fly atomic updates X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Martin Birgmeier List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 13:30:01 -0000 The following reply was made to PR kern/136865; it has been noted by GNATS. From: Martin Birgmeier To: Andrey Simonenko , bug-followup@FreeBSD.org Cc: Subject: Re: kern/136865: [nfs] [patch] NFS exports atomic and on-the-fly atomic updates Date: Fri, 26 Oct 2012 15:15:56 +0200 Hi Andrey, Today I started applying your changes and did the following: 1. downloaded nfse-20121025.tar.bz2 from sourceforge 2. read INSTALL-all 3. checked out release/8.2.0 from FreeBSD SVN 4. applied src/cddl.diff ==> this failed 5. checked out head from FreeBSD SVN 6. applied src/cddl diff ==> this failed as well I have imported all nfse patch files from sourceforge in a local mercurial repo to be able to easier follow what is changing. There I see that cddl.diff was updated for the last time on May 17. Could you help me with the following questions: - Is INSTALL-all still relevant, and if yes, for which cases? - What for is cddl.diff? - I am heavily using zfs. Which patches from your patchset do I need to get nfse to fully support zfs? Lastly, I believe it might be more helpful to combine INSTALL-all and INSTALL-kern into a single file INSTALL and in that file clearly point out the differences between the two methods (what does one method give you, what the other, what do I need to do for the first method, what for the other). Regards, Martin On 10/22/12 16:37, Andrey Simonenko wrote: > On Sat, Oct 20, 2012 at 11:43:17AM +0200, Martin Birgmeier wrote: >> Andrey, >> >> I'd really like to use this. 
However, I need to use it with FreeBSD 7.4, >> 8.2, and 9.0 (and 9.1 in the near future); I tried to backport your >> changes, but this turned out to be too difficult for me. >> > sys.diff for 8.2 in the attachment. From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 15:49:27 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 67769A0F; Fri, 26 Oct 2012 15:49:27 +0000 (UTC) (envelope-from freebsd@penx.com) Received: from btw.pki2.com (btw.pki2.com [IPv6:2001:470:a:6fd::2]) by mx1.freebsd.org (Postfix) with ESMTP id 239588FC18; Fri, 26 Oct 2012 15:49:27 +0000 (UTC) Received: from [127.0.0.1] (localhost [127.0.0.1]) by btw.pki2.com (8.14.5/8.14.5) with ESMTP id q9QFnK2a075093; Fri, 26 Oct 2012 08:49:20 -0700 (PDT) (envelope-from freebsd@penx.com) Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2)) From: Dennis Glatting To: John In-Reply-To: <20121023015546.GA60182@FreeBSD.org> References: <50825598.3070505@FreeBSD.org> <1350744349.88577.10.camel@btw.pki2.com> <1350765093.86715.69.camel@btw.pki2.com> <508322EC.4080700@FreeBSD.org> <1350778257.86715.106.camel@btw.pki2.com> <5084F6D5.5080400@digsys.bg> <1350948545.86715.147.camel@btw.pki2.com> <20121023015546.GA60182@FreeBSD.org> Content-Type: text/plain; charset="us-ascii" Date: Fri, 26 Oct 2012 08:49:20 -0700 Message-ID: <1351266560.49566.12.camel@btw.pki2.com> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-yoursite-MailScanner-Information: Dennis Glatting X-yoursite-MailScanner-ID: q9QFnK2a075093 X-yoursite-MailScanner: Found to be clean X-MailScanner-From: freebsd@penx.com Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 15:49:27 -0000 On Tue, 2012-10-23 at 01:55 +0000, John wrote: > ----- Dennis Glatting's Original Message ----- > > On Mon, 2012-10-22 at 09:31 -0700, Freddie Cash wrote: > > > On Mon, Oct 22, 2012 at 6:47 AM, Freddie Cash wrote: > > > > I'll double-check when I get to work, but I'm pretty sure it's 10.something. > > > > > > mpt(4) on alpha has firmware 1.5.20.0. > > > > > > mps(4) on beta has firmware 09.00.00.00, driver 14.00.00.01-fbsd. > > > > > > mps(4) on omega has firmware 10.00.02.00, driver 14.00.00.01-fbsd. > > > > > > Hope that helps. > > > > > > > Because one of the RAID1 OS disks failed (System #1), I replaced both > > disks and downgraded to stable/8. Two hours ago I submitted a job. > > > > I noticed on boot smartd issued warnings about disk firmware, which I'll > > update this coming weekend, unless the system hangs before then. > > > > I first want to see if that system will also hang under 8.3. I have > > noticed a looping "ls" of the target ZFS directory is MUCH snappier > > under 8.3 than 9.x. > > > > My CentOS 6.3 ZFS-on-Linux system (System #3) is crunching along (24 > > hours now). This system under stable/9 would previously spontaneously > > reboot whenever I sent a ZFS data set too it. > > > > System #2 is hung (stable/9). > > Hi Folks, > > I just caught up on this thread and thought I toss out some info. > > I have a number of systems running 9-stable (with some local patches), > none running 8. 
> > The basic architecture is: http://people.freebsd.org/~jwd/zfsnfsserver.jpg > > LSI SAS 9201-16e 6G/s 16-Port SATA+SAS Host Bus Adapter > > All cards are up-to-date on firmware: > > mps0: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd > mps1: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd > mps2: Firmware: 14.00.00.00, Driver: 14.00.00.01-fbsd > > All drives a geom multipath configured. > > Currently, these systems are used almost exclusively for iSCSI. > > I have seen no lockups that I can track down to the driver. I have seen > one lockup which I did post about (received no feedback) where I believe > an active I/O from istgt is interupted by an ABRT from the client which > causes a lock-up. This one is hard to replicate and on the do-do list. > > It is worth noting that a few drives were replaced early on > due to various I/O problems and one with what might be considered a > lockup. As has been noted elsewhere, watching gstat can be informative. > Also make sure cables are firmly plugged in.. Seems obvious, I know.. > > I did recently commit a small patch to current to handle a case > where if the system has greater than 255 disks, the 255th disk > is hidden/masked by the mps initiator id that is statically coded into > the driver. > > I think it might be good to document a bit better the type of > mount and test job/test stream running when/if you see a lockup. > I am not currently using NFS so there is an entire code-path I > am not exercising. > > Servers are 12 processor, 96GB Ram. The highest cpu load I've > seen on the systems is about 800%. > > All networking is 10G via Chelsio cards - configured to > use isr maxthread 6 with a defaultqlimit of 4096. I have seen > no problems in this area. > > Hope this helps a bit. Happy to answer questions. > I realized this morning that I neglected to ask a question: How big are your files? Mine are anywhere up to 12T/ea. From one of my servers: bd3# ls -lh total 7400750995 drwxr-xr-x 3 root wheel 12B Oct 26 08:14 ./ drwxr-xr-x 7 root wheel 7B Aug 14 10:50 ../ drwxr-xr-x 2 root wheel 2B Oct 25 21:37 Kore/ -rw-r--r-- 1 root wheel 12T Sep 8 10:24 Merged.0.txt -rw-r--r-- 1 root wheel 1.1T Jul 18 07:30 Merged.2.cleansed.print.txt.gz -rw-r--r-- 1 root wheel 1.2T Jul 18 04:13 Merged.3.cleansed.print.txt.gz -rw-r--r-- 1 root wheel 985G Sep 7 17:25 Merged.KoreLogic.1.txt.bz2 -rw-r--r-- 1 root wheel 1.1T Sep 16 00:02 Merged.KoreLogic.3.txt.bz2 -rw-r--r-- 1 root wheel 670G Jul 27 10:01 Merged.outpost9.cleansed.print.txt.bz2 -rw-r--r-- 1 root wheel 639G Aug 30 06:47 Merged.packet.storm.1.print.cleansed.txt.bz2 -rw-r--r-- 1 root wheel 733G Jul 21 03:49 Merged.wordlist.0.cleansed.print.txt.bz2 Trying to work with the 12T file eventually hangs that system. > Cheers, > John > > ps: With all that's been said above, it's worth noting that a correctly > configured client makes a huge difference. 
> > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 21:30:48 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 165627EC for ; Fri, 26 Oct 2012 21:30:48 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id C285E8FC08 for ; Fri, 26 Oct 2012 21:30:47 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAOL/ilCDaFvO/2dsb2JhbABEhhi9RoIeAQEFIyYwGw4KAgINGQIjNgYThhCBZAMPC6lpiQANiVSBIIlqZ4VbgRMDlB6BVYEXiheFEIMLgUgXHg X-IronPort-AV: E=Sophos;i="4.80,657,1344225600"; d="scan'208";a="185386444" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 26 Oct 2012 17:30:40 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id EB7EBB4041; Fri, 26 Oct 2012 17:30:40 -0400 (EDT) Date: Fri, 26 Oct 2012 17:30:40 -0400 (EDT) From: Rick Macklem To: Yamagi Burmeister Message-ID: <1276463871.2926827.1351287040945.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20121026082840.0e909c4ea1c694ac99ef39d9@yamagi.org> Subject: Re: Can not read from ZFS exported over NFSv4 but write to it MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 21:30:48 -0000 Yamagi Burmeister wrote: > On Fri, 26 Oct 2012 15:15:08 -0700 > YongHyeon PYUN wrote: > > > On Fri, Oct 26, 2012 at 08:07:27AM +0200, Yamagi Burmeister wrote: > > > On Fri, 26 Oct 2012 10:38:47 -0700 > > > YongHyeon PYUN wrote: > > > > > > > Thanks the info. > > > > Would you try attached patch? > > > > > > Still the same problem. With TSO enabled, the NFS4 mount > > > stalls when writing to it. Without it works as aspected. > > > > I have no longer access to age(4) controller so it's hard to verify > > the issue on my box. > > Can you post an URL for captured packets on both sender(age(4)) and > > receiver side with tcpdump? > > http://deponie.yamagi.org/freebsd/misc/age0_pcap.tar.xz > > sender_age0.pcap is the age0 device on the NFS4 server. > receiver_em0.pcap is a em0 NIC on the NFS4 client. > > Those files were created by: > 1. Mount the NFS4 export on the client > 2. Try to copy a file onto it. > 3. Forcefully unmount the NFS4 export on the client > I took a look at the packet traces and all that seems to be on them are a couple of OPEN attempts which reply NFS4ERR_GRACE. This is normal. After starting an NFSv4 server, it remains in a Grace period (where it will only handle lock state recovery operations) for greater than a lease duration. After the lease duration (2 minutes for a FreeBSD server), it will allow other operations like non-reclaim OPENs. 
I'd suggest you wait at least 2 minutes after doing the mount, before you try to do a file copy (or wait several minutes after starting the first file copy before assuming it is hung). Once Grace is over, the server will no longer reply NFS4ERR_GRACE until it is rebooted. If you capture packets again, but wait 5 minutes before doing the "umount -f", we should be able to see what is going on. rick > -- > Homepage: www.yamagi.org > XMPP: yamagi@yamagi.org > GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-fs@FreeBSD.ORG Fri Oct 26 22:47:55 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9013D752 for ; Fri, 26 Oct 2012 22:47:55 +0000 (UTC) (envelope-from tom@claimlynx.com) Received: from na3sys009aog137.obsmtp.com (na3sys009aog137.obsmtp.com [74.125.149.18]) by mx1.freebsd.org (Postfix) with SMTP id E87DF8FC08 for ; Fri, 26 Oct 2012 22:47:54 +0000 (UTC) Received: from mail-yh0-f72.google.com ([209.85.213.72]) (using TLSv1) by na3sys009aob137.postini.com ([74.125.148.12]) with SMTP ID DSNKUIsTFKAJrqphhTehYwJ9gVutSqS59Pf6@postini.com; Fri, 26 Oct 2012 15:47:55 PDT Received: by mail-yh0-f72.google.com with SMTP id q46so5911060yhf.7 for ; Fri, 26 Oct 2012 15:47:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:x-gm-message-state; bh=OoPLAmyor9BTVs+yWuF1FgWvpUP0UdcpTxUPIZ+5Rcc=; b=RpDgpKpnB1GdyOSz4uH6bbWaNfR3ELgVl3aXljKucvY1RQh0ZYpk02Ryc7s4RRfZ2+ eftJ60QdeQ4vp0jE7xC/RPWV6oproXhSyRRLv4j05ZZwmfuglnQ4SztXNYSWViRSF6HO lEV2VZdLu5qC6C+U6RQ46WEA0QnmoZdyoPInKjTOmfx0QyMcVELsCjJCnduWg3m3FE5l SjnrRrKU5+x62QcuWcOdgAo93aqMXnt1K+b1CJji83m2kZxMXUMA0RlTEg8Vs4yYd3Ks 3djxcOGap0rH7TIwoX4OWORWH0lAwCFE597+Rf4ptWyzzpgiru65+uTmG8GNvvVepn/m mYVw== Received: by 10.52.26.133 with SMTP id l5mr31658855vdg.132.1351291667957; Fri, 26 Oct 2012 15:47:47 -0700 (PDT) MIME-Version: 1.0 Received: by 10.52.26.133 with SMTP id l5mr31658840vdg.132.1351291667675; Fri, 26 Oct 2012 15:47:47 -0700 (PDT) Received: by 10.58.28.138 with HTTP; Fri, 26 Oct 2012 15:47:47 -0700 (PDT) In-Reply-To: <86699361.2739800.1351035439228.JavaMail.root@erie.cs.uoguelph.ca> References: <86699361.2739800.1351035439228.JavaMail.root@erie.cs.uoguelph.ca> Date: Fri, 26 Oct 2012 17:47:47 -0500 Message-ID: Subject: Re: Poor throughput using new NFS client (9.0) vs. old (8.2/9.0) From: Thomas Johnson To: Rick Macklem X-Gm-Message-State: ALoCoQkzxDBxX+N8o2inXQJs4pkoKqgMlRO+BUxfbBMcKm/ezqTNr8fve0f+ebul3DUr8wZUlwX+mkIF6jM7KTlCRCW7iXyfFYvSbS1lS8Zkd0Ma/VEWPHohTkZ4aCiecQApW6G0cscq5Pc3OI11M4MXOIQCCAkutA== Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Oct 2012 22:47:55 -0000 You are exactly correct. I went back to the logs, apparently when we tried changing the newnfs wsize/rsize parameters we _changed_ them to 64k (derp). Further benchmarking indicates that with newnfs, we see the best performance at 16k and 32k; 8k also performs quite well. 9.0 vs. 9.1 seems very close as well, though it is difficult to draw conclusions from a busy production system. Good to know that this is a case of PEBKAC, rather than an actual problem. 
Thanks to everyone for the assistance! On Tue, Oct 23, 2012 at 6:37 PM, Rick Macklem wrote: > Thomas Johnson wrote: > > I built a test image based on 9.1-rc2, per your suggestion Rick. The > > results are below. I was not able to exactly reproduce the workload in > > my original message, so I have also included results for the new (very > > similar) workload on my 9.0 client image as well. > > > > To summarize, 9.1-rc2 using newnfs seems to perform better than > > 9.0-p4, but oldnfs appears to still be significantly faster in both > > cases. > > > > I will get packet traces to Rick, but I want to get new results to the > > list. > > > > -Tom > > > > root@test:/test-> uname -a > > FreeBSD test.claimlynx.com 9.1-RC2 FreeBSD 9.1-RC2 #1: Fri Oct 19 > > 08:27:12 CDT 2012 > > root@builder.claimlynx.com:/usr/obj/usr/src/sys/GENERIC amd64 > > > > > > root@test:/-> mount | grep test > > server:/array/test on /test (nfs) > > root@test:/test-> zip BIGGER_PILE.zip BIG_PILE_53* > > adding: BIG_PILE_5306.zip (stored 0%) > > adding: BIG_PILE_5378.zip (stored 0%) > > adding: BIG_PILE_5386.zip (stored 0%) > > root@test:/test-> ll -h BIGGER_PILE.zip > > -rw-rw-r-- 1 root claimlynx 5.5M Oct 23 14:05 BIGGER_PILE.zip > > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null > > 0.664u 1.693s 0:30.21 7.7% 296+3084k 0+2926io 0pf+0w > > 0.726u 0.989s 0:08.04 21.1% 230+2667k 0+2956io 0pf+0w > > 0.829u 1.268s 0:11.89 17.4% 304+3037k 0+2961io 0pf+0w > > 0.807u 0.902s 0:08.02 21.1% 233+2676k 0+2947io 0pf+0w > > 0.753u 1.354s 0:12.73 16.4% 279+2879k 0+2947io 0pf+0w > > root@test:/test-> ll -h BIGGER_PILE.zip > > -rw-rw-r-- 1 root claimlynx 89M Oct 23 14:03 BIGGER_PILE.zip > > > Although the runs take much longer (I have no idea why and hopefully > I can spot something in the packet traces), it shows about half the > I/O ops. This suggests that it is running at the 64K rsize, wsize > instead of the 32K used by the old client. > > Just to confirm. Did you run a test using the new nfs client > with rsize=32768,wsize=32768 mount options, so the I/O size is > the same as with the old client? 
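For reference, a minimal sketch of the comparison Rick is asking about, reusing the server:/array/test export and the workload from above (only the mount options differ from the new client's defaults):

   # remount with the new client forced to the old client's 32K I/O size
   umount /test
   mount -t nfs -o rsize=32768,wsize=32768 server:/array/test /test
   mount | grep test
   # rerun the identical workload so the timings are comparable
   time zip BIGGER_PILE.zip 53*.zip > /dev/null

If the io counts reported by time then land near the oldnfs figures (around 5300 rather than 2950), that is a quick sanity check that the mount really is doing 32K transfers.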
> > rick > > > > > root@test:/test-> mount | grep test > > server:/array/test on /test (oldnfs) > > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null > > 0.645u 1.435s 0:08.05 25.7% 295+3044k 0+5299io 0pf+0w > > 0.783u 0.993s 0:06.48 27.3% 225+2499k 0+5320io 0pf+0w > > 0.787u 1.000s 0:06.28 28.3% 246+2884k 0+5317io 0pf+0w > > 0.707u 1.392s 0:07.94 26.3% 266+2743k 0+5313io 0pf+0w > > 0.709u 1.056s 0:06.08 28.7% 246+2814k 0+5318io 0pf+0w > > > > > > > > root@test:/home/tom-> uname -a > > FreeBSD test.claimlynx.com 9.0-RELEASE-p4 FreeBSD 9.0-RELEASE-p4 #0: > > Tue Sep 18 11:51:11 CDT 2012 > > root@builder.claimlynx.com:/usr/obj/usr/src/sys/GENERIC amd64 > > > > > > root@test:/test-> mount | grep test > > server:/array/test on /test (nfs) > > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null > > 0.721u 1.819s 0:31.13 8.1% 284+2886k 0+2932io 0pf+0w > > 0.725u 1.386s 0:12.84 16.3% 247+2631k 0+2957io 0pf+0w > > 0.675u 1.392s 0:13.94 14.7% 300+3005k 0+2928io 0pf+0w > > 0.705u 1.206s 0:10.72 17.7% 278+2874k 0+2973io 0pf+0w > > 0.727u 1.200s 0:18.28 10.5% 274+2872k 0+2947io 0pf+0w > > > > > > root@test:/-> umount /test > > root@test:/-> mount -t oldnfs server:/array/test /test > > root@test:/-> mount | grep test > > server:/array/test on /test (oldnfs) > > root@test:/test-> time zip BIGGER_PILE.zip 53*.zip > /dev/null > > 0.694u 1.820s 0:10.82 23.1% 271+2964k 0+5320io 0pf+0w > > 0.726u 1.293s 0:06.37 31.5% 303+2998k 0+5322io 0pf+0w > > 0.717u 1.248s 0:06.08 32.0% 246+2607k 0+5354io 0pf+0w > > 0.733u 1.230s 0:06.17 31.7% 256+2536k 0+5311io 0pf+0w > > 0.549u 1.581s 0:08.02 26.4% 302+3116k 0+5321io 0pf+0w > > > > > > On Thu, Oct 18, 2012 at 5:11 PM, Rick Macklem < rmacklem@uoguelph.ca > > > wrote: > > > > > > > > > > Ronald Klop wrote: > > > On Thu, 18 Oct 2012 18:16:16 +0200, Thomas Johnson < > > > tom@claimlynx.com > > > > wrote: > > > > > > > We recently upgraded a number of hosts from FreeBSD 8.2 to 9.0. > > > > Almost > > > > immediately, we received reports from users of poor performance. > > > > The > > > > upgraded hosts are PXE-booted, with an NFS-mounted root. > > > > Additionally, > > > > they > > > > mount a number of other NFS shares, which is where our users work > > > > from. > > > > After a week of tweaking rsize/wsize/readahead parameters (per > > > > guidance), > > > > it finally occurred to me that 9.0 defaults to the new NFS client > > > > and > > > > server. I remounted the user shares using the oldnfs file type, > > > > and > > > > users > > > > reported that performance returned to its expected level. > > > > > > > > This is obviously a workaround, rather than a solution. We would > > > > prefer > > > > to > > > > get our hosts using the newnfs client, since presumably oldnfs > > > > will > > > > be > > > > deprecated at some point in the future. Is there some change that > > > > we > > > > should > > > > have made to our NFS configuration with the upgrade to 9.0, or is > > > > it > > > > possible that our workload is exposing some deficiency with > > > > newnfs? > > > > We > > > > tend > > > > to deal with a huge number of tiny files (several KB in size). The > > > > NFS > > > > server has been running 9.0 for some time (prior to the client > > > > upgrade) > > > > without any issue. NFS is served from a zpool, backed by a Dell > > > > MD3000, > > > > populated with 15k SAS disks. Clients and server are connected > > > > with > > > > Gig-E > > > > links. The general hardware configuration has not changed in > > > > nearly > > > > 3 > > > > years. 
> > > > > > > > As an example of the performance difference, here is some of the > > > > testing > > > > I > > > > did while troubleshooting. Given a directory containing 5671 zip > > > > files, > > > > with an average size of 15KB. I append all files to an existing > > > > zip > > > > file. > > > > Using the newnfs mount, I found that this operation generally > > > > takes > > > > ~30 > > > > seconds (wall time). Switching the mount to oldnfs resulted in the > > > > same > > > > operation taking ~10 seconds. > > > > > > > > tom@test-1:/test-> ls 53*zip | wc -l > > > > 5671 > > > > tom@test-1:/test-> ll -h BIG* > > > > -rw-rw-r-- 1 tom claimlynx 8.9M Oct 17 14:06 BIGGER_PILE_1.zip > > > > tom@test-1:/test-> time zip BIGGER_PILE_1.zip 53*.zip > > > > 0.646u 0.826s 0:51.01 2.8% 199+2227k 0+2769io 0pf+0w > > > > ...reset and repeat... > > > > 0.501u 0.629s 0:30.49 3.6% 208+2319k 0+2772io 0pf+0w > > > > ...reset and repeat... > > > > 0.601u 0.522s 0:32.37 3.4% 220+2406k 0+2771io 0pf+0w > > > > > > > > tom@test-1:/-> cd / > > > > tom@test-1:/-> sudo umount /test > > > > tom@test-1:/-> sudo mount -t oldnfs -o rw server:/array/test /test > > > > tom@test-1:/-> mount | grep test > > > > server:/array/test on /test (oldnfs) > > > > tom@test-1:/-> cd /test > > > > ...reset and repeat... > > > > 0.470u 0.903s 0:13.09 10.4% 203+2229k 0+5107io 0pf+0w > > > > ...reset and repeat... > > > > 0.547u 0.640s 0:08.65 13.6% 231+2493k 0+5086io 0pf+0w > > > > tom@test-1:/test-> ll -h BIG* > > > > -rw-rw-r-- 1 tom claimlynx 92M Oct 17 14:14 BIGGER_PILE_1.zip > > > > > > > > Thanks! > > > > > > > > > > > > > You might find this thread from today interesting. > > > http://lists.freebsd.org/pipermail/freebsd-fs/2012-October/015441.html > > > > > Yes, although I can't explain why Alexey's problem went away > > when he went from 9.0->9.1 for his NFS server, it would be > > interesting if Thomas could try the same thing? > > > > About the only thing different between the old and new NFS > > clients is the default rsize/wsize. However, if Thomas tried > > rsize=32768,wsize=32768 for the default (new) NFS client, then > > that would be ruled out. To be honest, the new client uses code > > cloned from the old one for all the caching etc (which is where > > the clients are "smart"). They use different RPC parsing code, > > since the new one does NFSv4 as well, but that code is pretty > > straightforward, so I can't think why it would result in a > > factor of 3 in performance. > > > > If Thomas were to capture a packet trace of the above test > > for two clients and emailed them to me, I could take a look > > and see if I can see what is going on. (For Alexey's case, > > it was a whole bunch of Read RPCs without replies, but that > > was a Linux client, of course. It also had a significant # of > > TCP layer retransmits and out of order TCP segments in it.) > > > > It would be nice to figure this out, since I was thinking > > that the old client might go away for 10.0 (can't if these > > issues still exist). 
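Not a substitute for the packet traces Rick asks for, but a quick way to look for the TCP retransmit / out-of-order pattern he mentions is to diff the client's TCP counters across one test run (a sketch, assuming the workload above and the standard FreeBSD netstat output):

   # snapshot TCP statistics, run the slow test, snapshot again
   netstat -s -p tcp > /tmp/tcp_before
   time zip BIGGER_PILE.zip 53*.zip > /dev/null
   netstat -s -p tcp > /tmp/tcp_after
   diff /tmp/tcp_before /tmp/tcp_after | grep -i -e retrans -e 'out-of-order'

A large jump in retransmitted or out-of-order packets during the newnfs run would point at the network layer rather than the NFS code.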
> From owner-freebsd-fs@FreeBSD.ORG Sat Oct 27 01:47:27 2012 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BD29CCAA; Sat, 27 Oct 2012 01:47:27 +0000 (UTC) (envelope-from eadler@FreeBSD.org) Received: from freefall.freebsd.org (freefall.FreeBSD.org [8.8.178.135]) by mx1.freebsd.org (Postfix) with ESMTP id 893DB8FC0A; Sat, 27 Oct 2012 01:47:27 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q9R1lRnF075439; Sat, 27 Oct 2012 01:47:27 GMT (envelope-from eadler@freefall.freebsd.org) Received: (from eadler@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q9R1lRaW075435; Sat, 27 Oct 2012 01:47:27 GMT (envelope-from eadler) Date: Sat, 27 Oct 2012 01:47:27 GMT Message-Id: <201210270147.q9R1lRaW075435@freefall.freebsd.org> To: eadler@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: eadler@FreeBSD.org Subject: Re: kern/173136: (unionfs) mounting above the NFS read-only share panic X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Oct 2012 01:47:28 -0000 Synopsis: (unionfs) mounting above the NFS read-only share panic Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: eadler Responsible-Changed-When: Sat Oct 27 01:47:09 UTC 2012 Responsible-Changed-Why: correct synopsis and assign http://www.freebsd.org/cgi/query-pr.cgi?pr=173136 From owner-freebsd-fs@FreeBSD.ORG Sat Oct 27 22:03:22 2012 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 90FA2B98 for ; Sat, 27 Oct 2012 22:03:22 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id EDCF78FC0C for ; Sat, 27 Oct 2012 22:03:21 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAMNZjFCDaFvO/2dsb2JhbABEhhi9ToIeAQEBAwEBAiAEUhsOCgICDRkCKi8GExuHZQYLqUOSAoEgilUKChOFI4ETA5JCgQWCLYEajyiDC4E/Ah4e X-IronPort-AV: E=Sophos;i="4.80,663,1344225600"; d="scan'208";a="185462312" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 27 Oct 2012 18:03:15 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 3A372B4084; Sat, 27 Oct 2012 18:03:15 -0400 (EDT) Date: Sat, 27 Oct 2012 18:03:15 -0400 (EDT) From: Rick Macklem To: Thomas Johnson Message-ID: <1490669451.2942006.1351375395117.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: Subject: Re: Poor throughput using new NFS client (9.0) vs. old (8.2/9.0) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Oct 2012 22:03:22 -0000 Thomas Johnson wrote: > You are exactly correct. 
I went back to the logs, apparently when we > tried changing the newnfs wsize/rsize parameters we _changed_ them to > 64k (derp). Further benchmarking indicates that with newnfs, we see > the best performance at 16k and 32k; 8k also performs quite well. 9.0 > vs. 9.1 seems very close as well, though it is difficult to draw > conclusions from a busy production system. Good to know that this is a > case of PEBKAC, rather than an actual problem. Thanks to everyone for > the assistance! > Ok, so with "rsize=32768,wsize=32768", the performance is about the same as "oldnfs"? (If so, I can breathe a sigh of relief, since that would indicate no fundamental problem with "newnfs".) I'll admit I'm not as convinced as bde@ that 64K won't perform about as well as 32K for most/many sites. (I see slightly better perf. for 64K than 32K.) Since you were seeing dramatically poorer (factor of 3) performance for 64K, I suspect something in your network fabric couldn't handle the large TCP segments. (At the top of the list of suspects is TSO support, from my limited experience.) If you can disable TSO on the network interfaces for the client and server, it might be worth trying that and seeing if a 64K mount works well then. (I understand that this might not be practical for a production system.) Anyhow, good to hear the problem is resolved for you, rick
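In case it helps anyone else hitting this, a minimal sketch of that experiment (interface names are placeholders; em(4) is assumed here, and the setting does not survive a reboot unless -tso is also added to the corresponding ifconfig_<nic> line in /etc/rc.conf):

   # on both the client and the server
   ifconfig em0 -tso
   # back on the client: remount with the new client's 64K defaults and retest
   umount /test
   mount -t nfs server:/array/test /test
   time zip BIGGER_PILE.zip 53*.zip > /dev/null

If 64K performs about as well as 32K once TSO is off, that points at TSO handling in the NIC/driver rather than at the NFS client.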