From: Chris Hastie
Date: Mon, 11 Aug 2008 22:51:42 +0100
To: freebsd-questions@freebsd.org
Subject: Monitoring raid health with mpt
Message-ID: <48A0B46E.6000504@oak-wood.co.uk>

I have a Dell PowerEdge 860 with a SAS 5iR RAID controller, running FreeBSD 6.2. The controller is configured for RAID 1. It is recognised as mpt0 and the array is seen as a SCSI device, da0. All seems to be working fine, but is there any way to tell if one of the disks fails?

Lots of searching has suggested that most people reckon 'no', but some reckon that

sysctl -a | grep nonoptimal_volumes

should come up with something useful. I've had a poke around in the source, which is probably pointless since my knowledge of C is next to zilch. But it looks like a number of sysctl OIDs are defined in mpt_raid.c: vol_member_wce, vol_queue_depth, vol_resync_rate and nonoptimal_volumes. I see none of these, just a couple from mpt.c:

paddington# sysctl dev.mpt.0
dev.mpt.0.%desc: LSILogic SAS/SATA Adapter
dev.mpt.0.%driver: mpt
dev.mpt.0.%location: slot=8 function=0
dev.mpt.0.%pnpinfo: vendor=0x1000 device=0x0054 subvendor=0x1028 subdevice=0x1f09 class=0x010000
dev.mpt.0.%parent: pci2
dev.mpt.0.debug: 3
dev.mpt.0.role: 1

Should I expect to see some other values? Will the nonoptimal_volumes value turn up if a drive fails? Or will I see some messages in syslog?

Anything that will give me some notice of a failed drive would help - the machine is colocated, so keeping an eye open for flashing LEDs isn't really an option :(

This is the relevant bit of dmesg:

mpt0: port 0xec00-0xecff mem 0xfe9fc000-0xfe9fffff,0xfe9e0000-0xfe9effff irq 16 at device 8.0 on pci2
mpt0: [GIANT-LOCKED]
mpt0: MPI Version=1.5.13.0
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
mpt0: mpt_cam_event: 0x12
mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
mpt0: mpt_cam_event: 0x12
mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
mpt0: mpt_cam_event: 0xb
mpt0: Unhandled Event Notify Frame. Event 0xb (ACK not required).
da0 at mpt0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 300.000MB/s transfers, Tagged Queueing Enabled
da0: 237464MB (486326272 512 byte sectors: 255H 63S/T 30272C)
Trying to mount root from ufs:/dev/da0s1a

-- 
Chris Hastie
Find tree care advice at http://www.tree-care.info/
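
P.S. For what it's worth, if nonoptimal_volumes ever does turn up, a periodic check might look roughly like the sketch below. It rests on assumptions that nothing above confirms: that the OID appears under dev.mpt.0, and that a value above zero means a degraded volume. The function name check_mpt and the status strings are made up for illustration. It reads sysctl-style "name: value" lines on stdin, so it can be tried against pasted output without the controller present.

```sh
#!/bin/sh
# Sketch only. check_mpt is a hypothetical helper, not part of FreeBSD.
# Reads "sysctl dev.mpt.0"-style output on stdin.
# Returns 0 = no non-optimal volumes, 1 = at least one, 2 = OID not found.
check_mpt() {
    # Take the value after "nonoptimal_volumes:"; empty if the OID is absent.
    count=$(awk -F': *' '/\.nonoptimal_volumes:/ { print $2; exit }')

    # Treat a missing or non-numeric value as "unknown" rather than "ok".
    case "$count" in
        ''|*[!0-9]*)
            echo "UNKNOWN: nonoptimal_volumes not reported"
            return 2
            ;;
    esac

    if [ "$count" -gt 0 ]; then
        echo "WARNING: $count non-optimal volume(s)"
        return 1
    fi
    echo "OK: all volumes optimal"
    return 0
}

# Possible cron usage on the real machine (assumption: cron mails any
# output to the job's owner). Left commented out so the sketch can be
# sourced and tested anywhere:
# status=$(sysctl dev.mpt.0 | check_mpt) || echo "$status"
```

With the sysctl output quoted above, which has no nonoptimal_volumes line, this reports UNKNOWN and returns 2 instead of claiming that all is well.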