Date: Fri, 19 Feb 2016 00:08:55 -0500 (EST)
From: Terry Kennedy
Subject: 10.3-BETA2 regression in MPT
To: freebsd-stable@freebsd.org
Message-id: <01PWUKZNJW2C0002VH@glaver.org>

I have some systems which I plan to upgrade from 8.4 to 10.3 once 10.3
is released. In the meantime, I'm testing 10.3-BETA2 and have found what
appears to be a regression in the MPT driver.

The system is a Dell PowerEdge R300 with a Dell SAS6 controller:

mpt0@pci0:5:0:0: class=0x010000 card=0x1f0e1028 chip=0x00581000 rev=0x08 hdr=0x00
    vendor     = 'LSI Logic / Symbios Logic'
    device     = 'SAS1068E PCI-Express Fusion-MPT SAS'
    class      = mass storage
    subclass   = SCSI

Both the system BIOS and the SAS6 firmware are at the latest revisions
from Dell (which haven't changed in years).

On the 8.4 system, "grep mpt /var/run/dmesg.boot" reports:

mpt0: port 0xec00-0xecff mem 0xdfcec000-0xdfceffff,0xdfcf0000-0xdfcfffff irq 16 at device 0.0 on pci5
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.18.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 1 Active Volume (2 Max)
mpt0: 2 Hidden Drive Members (14 Max)
mpt0:vol0(mpt0:0:0): Settings ( Hot-Plug-Spares High-Priority-ReSync )
mpt0:vol0(mpt0:0:0): Using Spare Pool: 0
mpt0:vol0(mpt0:0:0): 2 Members:
(mpt0:1:9:0): Primary Online
(mpt0:1:1:0): Secondary Online
mpt0:vol0(mpt0:0:0): RAID-1 - Optimal
mpt0:vol0(mpt0:0:0): Status ( Enabled )
(mpt0:vol0:1): Physical (mpt0:0:1:0), Pass-thru (mpt0:1:0:0)
(mpt0:vol0:1): Online
(mpt0:vol0:0): Physical (mpt0:0:9:0), Pass-thru (mpt0:1:1:0)
(mpt0:vol0:0): Online
(probe0:mpt0:0:0:0): REPORT LUNS. CDB: a0 00 00 00 00 00 00 00 00 10 00 00
(probe0:mpt0:0:0:0): CAM status: SCSI Status Error
(probe0:mpt0:0:0:0): SCSI status: Check Condition
(probe0:mpt0:0:0:0): SCSI sense: ILLEGAL REQUEST info?:39000000 asc:0,0 (No additional sense information)
ses0 at mpt0 bus 0 scbus0 target 8 lun 0
pass2 at mpt0 bus 1 scbus1 target 0 lun 0
da0 at mpt0 bus 0 scbus0 target 0 lun 0

[I'm not sure what that ILLEGAL REQUEST is about.]

On the same system, running 10.3-BETA2 r295785, I see:

mpt0: port 0xec00-0xecff mem 0xdfcec000-0xdfceffff,0xdfcf0000-0xdfcfffff irq 16 at device 0.0 on pci5
mpt0: MPI Version=1.5.18.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 1 Active Volume (2 Max)
mpt0: 2 Hidden Drive Members (14 Max)
mpt0:vol0(mpt0:0:0): Settings ( Hot-Plug-Spares High-Priority-ReSync )
mpt0:vol0(mpt0:0:0): Using Spare Pool:
mpt0:vol0(mpt0:0:0): 2 Members:
(mpt0:1:9:0): Primary Online
(mpt0:1:1:0): Secondary Online
mpt0:vol0(mpt0:0:0): RAID-1 - Optimal
mpt0:vol0(mpt0:0:0): Status ( Enabled )
(mpt0:vol0:1): Physical (mpt0:0:1:0), Pass-thru (mpt0:1:0:0)
(mpt0:vol0:1): Online
(mpt0:vol0:0): Physical (mpt0:0:9:0), Pass-thru (mpt0:1:1:0)
(mpt0:vol0:0): Online
(probe64:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe64:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe64:mpt0:1:1:0): Retrying command
(probe64:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe64:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe64:mpt0:1:1:0): Retrying command
(probe0:mpt0:0:0:0): REPORT LUNS. CDB: a0 00 00 00 00 00 00 00 00 10 00 00
(probe0:mpt0:0:0:0): CAM status: SCSI Status Error
(probe0:mpt0:0:0:0): SCSI status: Check Condition
(probe0:mpt0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:ffffffff,ffffffff (Reserved ASC/ASCQ pair)
(probe0:mpt0:0:0:0): Error 22, Unretryable error
(probe64:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe64:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe64:mpt0:1:1:0): Retrying command
(probe64:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe64:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe64:mpt0:1:1:0): Retrying command
(probe64:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe64:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe64:mpt0:1:1:0): Error 5, Retries exhausted
(probe1:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe1:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe1:mpt0:1:1:0): Retrying command
(probe1:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe1:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe1:mpt0:1:1:0): Retrying command
(probe1:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe1:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe1:mpt0:1:1:0): Retrying command
(probe1:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe1:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe1:mpt0:1:1:0): Retrying command
(probe1:mpt0:1:1:0): INQUIRY. CDB: 12 00 00 00 24 00
(probe1:mpt0:1:1:0): CAM status: Unrecoverable Host Bus Adapter Error
(probe1:mpt0:1:1:0): Error 5, Retries exhausted
da0 at mpt0 bus 0 scbus0 target 0 lun 0
ses0 at mpt0 bus 0 scbus0 target 8 lun 0
pass2 at mpt0 bus 1 scbus1 target 0 lun 0

I can try to narrow down when this regression was introduced, but I
figured I'd report it in case somebody has an "ah-hah" moment from
seeing it.

Also, there has always been an issue with passthru on these controllers -
as you can see above, there are 2 physical disks attached to the
controller, used as a mirror volume. But only one of the members appears
as a passN device, which means that the other one can't be monitored
with smartmontools. If I'm remembering correctly, a volume with more
than 2 drives creates a passN device for all but one of the drives.

        Terry Kennedy             http://www.glaver.org
        New York, NY USA