From: Dave Pooser
To: freebsd-questions@freebsd.org
Date: Wed, 11 May 2011 23:51:20 -0500
Subject: Best SATA/SAS controller for ZFS on FreeBSD 8.2 RELEASE?

My hardware: Dell 1950 with dual quad-core X5450 processors, 16GB RAM, boot
drive connected to a SAS 6/iR controller (mpt0), pair of external ACARD 9010
RAMDisks (da3 & da4) connected to an LSI SAS3801E controller (mpt1). The
RAMdisks are configured in a ZFS mirror (Backbone) in hopes of both high IOPS
and data integrity. Main purpose of the machine is to run a small (<4GB)
PostgreSQL database.

My problem: Twice in the last 3 weeks I have seen more and more errors from
the mpt1 driver until it decides that it's lost the drives and Postgres hangs.
I try a "shutdown -h", which it can't complete, and eventually hold down the
power button to shut the machine off. When I boot it, it comes up fine, scrubs
complete in seconds with zero errors found, and all is grand... until the next
time.

I'm hesitant to blame the RAMdisks, because (1) I've got some of them working
fine for me with other OSes and (2) "zpool scrub" consistently shows no
errors. I've read some comments on the Net suggesting that the MPT driver in
FreeBSD is sub-optimal, so that's one area I want to check: is there another
controller that would be better? Most of my ZFS experience has been in
OpenSolaris, where LSI cards are pretty much the standard, but FreeBSD is not
OpenSolaris....

Logfiles below:

May 11 17:58:46 backbone kernel: mpt1: attempting to abort req 0xffffff800068b790:25990 function 0
May 11 17:58:46 backbone kernel: mpt1: mpt_cam_event: 0x16
May 11 17:58:46 backbone kernel: mpt1: mpt_cam_event: 0x16
May 11 17:58:47 backbone kernel: mpt1: abort of req 0xffffff800068b790:25990 completed
May 11 17:58:47 backbone kernel: mpt1: attempting to abort req 0xffffff800068b790:25990 function 0
May 11 17:58:47 backbone kernel: mpt1: mpt_cam_event: 0x16
May 11 17:58:47 backbone kernel: mpt1: mpt_cam_event: 0x16
May 11 17:58:47 backbone kernel: mpt1: abort of req 0xffffff800068b790:25990 completed
May 11 17:58:47 backbone kernel: mpt1: attempting to abort req 0xffffff800068b790:25990 function 0
May 11 17:58:48 backbone kernel: mpt1: abort of req 0xffffff800068b790:25990 completed
May 11 17:58:48 backbone kernel: mpt1: attempting to abort req 0xffffff800068b790:25990 function 0
May 11 17:58:48 backbone kernel: mpt1: abort of req 0xffffff800068b790:25990 completed

Eventually it tires of those entries and segues into:

May 11 17:59:24 backbone kernel: mpt1: mpt_cam_event: 0x16
May 11 17:59:24 backbone last message repeated 2 times
May 11 17:59:24 backbone kernel: (da3:mpt1:0:2:0): SYNCHRONIZE CACHE(10). CDB: 35 0 0 0 0 0 0 0 0 0
May 11 17:59:24 backbone kernel: (da3:mpt1:0:2:0): CAM status: SCSI Status Error
May 11 17:59:24 backbone kernel: (da3:mpt1:0:2:0): SCSI status: Check Condition
May 11 17:59:24 backbone kernel: (da3:mpt1:0:2:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
May 11 17:59:24 backbone kernel: (da4:mpt1:0:4:0): WRITE(10). CDB: 2a 0 0 40 53 39 0 0 18 0
May 11 17:59:24 backbone kernel: (da4:mpt1:0:4:0): CAM status: SCSI Status Error
May 11 17:59:24 backbone kernel: (da4:mpt1:0:4:0): SCSI status: Check Condition
May 11 17:59:24 backbone kernel: (da4:mpt1:0:4:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)

And then it starts complaining about vdev I/O failures:

May 11 17:59:58 backbone root: ZFS: vdev I/O failure, zpool=Backbone path=/dev/da3 offset=270336 size=8192 error=6
May 11 17:59:58 backbone kernel: (da3:mpt1:0:2:0): lost device
May 11 17:59:58 backbone kernel: (da3:mpt1:0:2:0): Invalidating pack
May 11 17:59:58 backbone last message repeated 3 times
May 11 17:59:58 backbone kernel: (da4:mpt1:0:4:0): lost device
May 11 17:59:58 backbone kernel: (da4:mpt1:0:4:0): Invalidating pack
May 11 17:59:58 backbone last message repeated 3 times
May 11 17:59:58 backbone kernel: (da3:mpt1:0:2:0): Synchronize cache failed, status == 0xa, scsi status == 0x0
May 11 17:59:58 backbone kernel: (da3:mpt1:0:2:0): removing device entry
May 11 17:59:58 backbone kernel: (da4:mpt1:0:4:0): Synchronize cache failed, status == 0xa, scsi status == 0x0
May 11 17:59:58 backbone kernel:
May 11 17:59:58 backbone kernel: (da4:mpt1:0:4:0): removing device entry
May 11 17:59:58 backbone root: ZFS: vdev I/O failure, zpool=Backbone path=/dev/da3 offset=8589156352 size=8192 error=6
May 11 17:59:58 backbone root: ZFS: vdev I/O failure, zpool=Backbone path=/dev/da3 offset=8589418496 size=8192 error=6
May 11 17:59:58 backbone root: ZFS: vdev I/O failure, zpool=Backbone path=/dev/da4 offset=270336 size=8192 error=6
May 11 17:59:58 backbone root: ZFS: vdev I/O failure, zpool=Backbone path=/dev/da4 offset=8589156352 size=8192 error=6
May 11 17:59:58 backbone root: ZFS: vdev I/O failure, zpool=Backbone path=/dev/da4 offset=8589418496 size=8192 error=6
May 11 17:59:58 backbone root: ZFS: zpool I/O failure, zpool=Backbone error=6
May 11 17:59:58 backbone last message repeated 15 times
May 11 17:59:58 backbone root: ZFS: zpool I/O failure, zpool=Backbone error=28
May 11 17:59:58 backbone last message repeated 15 times
May 11 17:59:58 backbone root: ZFS: vdev I/O failure, zpool=Backbone path= offset= size= error=
May 11 18:00:05 backbone kernel: mpt1: mpt_cam_event: 0x16
May 11 18:00:05 backbone kernel: mpt1: mpt_cam_event: 0x12
May 11 18:00:05 backbone kernel: mpt1: mpt_cam_event: 0x16
May 11 18:00:08 backbone kernel: mpt1: mpt_cam_event: 0x16
May 11 18:00:08 backbone kernel: mpt1: mpt_cam_event: 0x12
May 11 18:00:08 backbone kernel: mpt1: mpt_cam_event: 0x16

-- 
Dave Pooser
Cat-Herder-in-Chief, Pooserville.com
"If you think bringing a wet noodle to an Amish rake fight makes you
'better armed' for it, then you have a funny understanding of how the
universe works." -- Al Iverson