Date: Mon, 12 Mar 2012 13:03:23 -0400 (EDT)
From: Jonathan Stewart <jonathan@kc8onw.net>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: kern/165982: MPT instability drive resets and losses on FreeBSD 9-stable r232224
Message-ID: <201203121703.q2CH3NeF002640@storage.kc8onw.net>
Resent-Message-ID: <201203121710.q2CHAAch011891@freefall.freebsd.org>

>Number: 165982
>Category: kern
>Synopsis: MPT instability drive resets and losses on FreeBSD 9-stable r232224
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Mon Mar 12 17:10:10 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator: Jonathan Stewart
>Release: FreeBSD 9.0-STABLE amd64
>Organization:
>Environment:
System: FreeBSD storage.kc8onw.net 9.0-STABLE FreeBSD 9.0-STABLE #9 r232224: Thu Mar 1 14:07:11 EST 2012 root@storage.kc8onw.net:/usr/obj/usr/src/sys/STORAGE amd64

hostb0@pci0:0:0:0: class=0x060000 card=0x062415d9 chip=0x01088086 rev=0x09 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Xeon E3-1200 Processor Family DRAM Controller'
    class = bridge
    subclass = HOST-PCI
pcib1@pci0:0:1:0: class=0x060400 card=0x062415d9 chip=0x01018086 rev=0x09 hdr=0x01
    vendor = 'Intel Corporation'
    device = 'Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port'
    class = bridge
    subclass = PCI-PCI
em0@pci0:0:25:0: class=0x020000 card=0x150215d9 chip=0x15028086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82579LM Gigabit Network Connection'
    class = network
    subclass = ethernet
ehci0@pci0:0:26:0: class=0x0c0320 card=0x062415d9 chip=0x1c2d8086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '6 Series/C200 Series Chipset Family USB Enhanced Host Controller'
    class = serial bus
    subclass = USB
pcib2@pci0:0:28:0: class=0x060400 card=0x062415d9 chip=0x1c108086 rev=0xb5 hdr=0x01
    vendor = 'Intel Corporation'
    device = '6 Series/C200 Series Chipset Family PCI Express Root Port 1'
    class = bridge
    subclass = PCI-PCI
pcib3@pci0:0:28:4: class=0x060400 card=0x062415d9 chip=0x1c188086 rev=0xb5 hdr=0x01
    vendor = 'Intel Corporation'
    device = '6 Series/C200 Series Chipset Family PCI Express Root Port 5'
    class = bridge
    subclass = PCI-PCI
ehci1@pci0:0:29:0: class=0x0c0320 card=0x062415d9 chip=0x1c268086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '6 Series/C200 Series Chipset Family USB Enhanced Host Controller'
    class = serial bus
    subclass = USB
pcib4@pci0:0:30:0: class=0x060401 card=0x062415d9 chip=0x244e8086 rev=0xa5 hdr=0x01
    vendor = 'Intel Corporation'
    device = '82801 PCI Bridge'
    class = bridge
    subclass = PCI-PCI
isab0@pci0:0:31:0: class=0x060100 card=0x062415d9 chip=0x1c548086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'C204 Chipset Family LPC Controller'
    class = bridge
    subclass = PCI-ISA
ahci0@pci0:0:31:2: class=0x010601 card=0x062415d9 chip=0x1c028086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller'
    class = mass storage
    subclass = SATA
none0@pci0:0:31:3: class=0x0c0500 card=0x062415d9 chip=0x1c228086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = '6 Series/C200 Series Chipset Family SMBus Controller'
    class = serial bus
    subclass = SMBus
mpt0@pci0:1:0:0: class=0x010000 card=0x31401000 chip=0x00581000 rev=0x08 hdr=0x00
    vendor = 'LSI Logic / Symbios Logic'
    device = 'SAS1068E PCI-Express Fusion-MPT SAS'
    class = mass storage
    subclass = SCSI
em1@pci0:2:0:0: class=0x020000 card=0x10838086 chip=0x10b98086 rev=0x06 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82572EI Gigabit Ethernet Controller (Copper)'
    class = network
    subclass = ethernet
em2@pci0:3:0:0: class=0x020000 card=0x000015d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82574L Gigabit Network Connection'
    class = network
    subclass = ethernet
vgapci0@pci0:4:3:0: class=0x030000 card=0x062415d9 chip=0x0532102b rev=0x0a hdr=0x00
    vendor = 'Matrox Graphics, Inc.'
    device = 'MGA G200eW WPCM450'
    class = display
    subclass = VGA

>Description:
I upgraded to 9-stable and around the same time had drive failures. Now, when doing heavy I/O to drives attached to the mpt controller, I get errors such as

(da3:mpt0:0:14:0): WRITE(10). CDB: 2a 0 42 0 45 f0 0 0 8 0
(da3:mpt0:0:14:0): CAM status: SCSI Status Error
(da3:mpt0:0:14:0): SCSI status: Check Condition
(da3:mpt0:0:14:0): SCSI sense: MEDIUM ERROR asc:14,1 (Record not found)

and

(da7:mpt0:0:13:0): SCSI status error
(da7:mpt0:0:13:0): WRITE(10). CDB: 2a 0 2a 0 a6 10 0 0 28 0
(da7:mpt0:0:13:0): CAM status: SCSI Status Error
(da7:mpt0:0:13:0): SCSI status: Check Condition
(da7:mpt0:0:13:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)
(da7:mpt0:0:13:0): Retrying command (per sense data)

and

(da7:mpt0:0:13:0): SCSI status error
(da7:mpt0:0:13:0): READ(10). CDB: 28 0 29 e d7 0 0 0 38 0
(da7:mpt0:0:13:0): CAM status: SCSI Status Error
(da7:mpt0:0:13:0): SCSI status: Check Condition
(da7:mpt0:0:13:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da7:mpt0:0:13:0): Retrying command (per sense data)
(da7:mpt0:0:13:0): CAM status 0x18
(da7:mpt0:0:13:0): Retrying command
(da7:mpt0:0:13:0): CAM status 0x18

and

(da7:mpt0:0:13:0): SCSI status error
(da7:mpt0:0:13:0): WRITE(10). CDB: 2a 0 5c 0 2c b0 0 0 8 0
(da7:mpt0:0:13:0): CAM status: SCSI Status Error
(da7:mpt0:0:13:0): SCSI status: Check Condition
(da7:mpt0:0:13:0): SCSI sense: ABORTED COMMAND asc:0,0 (No additional sense information)
(da7:mpt0:0:13:0): Error 5, Retries exhausted
mpt0: request 0xffffff8001a68060:30428 timed out for ccb 0xfffffe00072d6800 (req->ccb 0xfffffe00072d6800)
mpt0: request 0xffffff8001a73cd0:30429 timed out for ccb 0xfffffe0007453800 (req->ccb 0xfffffe0007453800)
mpt0: attempting to abort req 0xffffff8001a68060:30428 function 0
mpt0: request 0xffffff8001a682a0:30430 timed out for ccb 0xfffffe001202c800 (req->ccb 0xfffffe001202c800)
mpt0: request 0xffffff8001a745d0:30431 timed out for ccb 0xfffffe0007408000 (req->ccb 0xfffffe0007408000)
mpt0: request 0xffffff8001a72da0:30432 timed out for ccb 0xfffffe0006842800 (req->ccb 0xfffffe0006842800)
mpt0: request 0xffffff8001a73a00:30433 timed out for ccb 0xfffffe0007fcd000 (req->ccb 0xfffffe0007fcd000)
mpt0: request 0xffffff8001a66710:30434 timed out for ccb 0xfffffe000740a000 (req->ccb 0xfffffe000740a000)
mpt0: mpt_wait_req(1) timed out
mpt0: mpt_recover_commands: abort timed-out. Resetting controller
mpt0: mpt_cam_event: 0x80
mpt0: Unhandled Event Notify Frame. Event 0xffffff80 (ACK not required).
mpt0: completing timedout/aborted req 0xffffff8001a68060:30428
mpt0: completing timedout/aborted req 0xffffff8001a73cd0:30429
mpt0: completing timedout/aborted req 0xffffff8001a682a0:30430
mpt0: completing timedout/aborted req 0xffffff8001a745d0:30431
mpt0: completing timedout/aborted req 0xffffff8001a72da0:30432
mpt0: completing timedout/aborted req 0xffffff8001a73a00:30433
mpt0: completing timedout/aborted req 0xffffff8001a66710:30434
(da7:mpt0:0:13:0): Bus Reset issued
(da7:mpt0:0:13:0): Retrying command

and finally

(da7:mpt0:0:13:0): SCSI status error
(da7:mpt0:0:13:0): READ(10). CDB: 28 0 29 e d6 f8 0 0 8 0
(da7:mpt0:0:13:0): CAM status: SCSI Status Error
(da7:mpt0:0:13:0): SCSI status: Check Condition
(da7:mpt0:0:13:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
(da7:mpt0:0:13:0): Info: 0x290ed6f8
(da7:mpt0:0:13:0): Error 5, Unretryable error
mpt0: request 0xffffff8001a6bff0:30904 timed out for ccb 0xfffffe0007453800 (req->ccb 0xfffffe0007453800)
mpt0: attempting to abort req 0xffffff8001a6bff0:30904 function 0
mpt0: mpt_send_handshake_cmd: db ignored
mpt0: soft reset failed: device not running
mpt0: WARNING - Failed hard reset! Trying to initialize anyway.
mpt0: mpt_cam_event: 0xff
mpt0: Unhandled Event Notify Frame. Event 0xffffffff (ACK not required).
mpt0: completing timedout/aborted req 0xffffff8001a6bff0:30904
(da0:mpt0:0:1:0): Bus Reset issued
(da0:mpt0:0:1:0): Retrying command

Eventually the system gets into a state where no disk I/O happens at all, multiple drives are lost, and I have to reset it.

>How-To-Repeat:
Place a heavy I/O load on an MPT controller with SATA drives.

>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
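
For anyone reading the CDB lines in the logs above, here is a minimal sketch of how they can be decoded. It assumes the standard 10-byte READ/WRITE layout (opcode in byte 0, big-endian LBA in bytes 2-5, transfer length in blocks in bytes 7-8). `decode_cdb10` is a hypothetical helper written for illustration; it is not part of FreeBSD or the mpt driver.

```python
import re

# Only the two 10-byte opcodes that appear in the report are handled.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}

# Matches the run of space-separated hex bytes printed after "CDB:".
CDB_RE = re.compile(r"CDB:\s*((?:\b[0-9a-fA-F]{1,2}\b\s*)+)")


def decode_cdb10(line):
    """Return (command, lba, blocks) for a 10-byte READ/WRITE CDB log line.

    Returns None when the line has no CDB or the CDB is not one of the
    two commands handled here.
    """
    match = CDB_RE.search(line)
    if match is None:
        return None
    cdb = [int(tok, 16) for tok in match.group(1).split()]
    if len(cdb) != 10 or cdb[0] not in OPCODES:
        return None
    lba = int.from_bytes(bytes(cdb[2:6]), "big")     # bytes 2..5
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")  # bytes 7..8
    return OPCODES[cdb[0]], lba, blocks


if __name__ == "__main__":
    sample = "(da7:mpt0:0:13:0): READ(10). CDB: 28 0 29 e d6 f8 0 0 8 0"
    command, lba, blocks = decode_cdb10(sample)
    print(f"{command} lba=0x{lba:08x} blocks={blocks}")
```

Run against the last READ(10) line, the decoded LBA is 0x290ed6f8, which matches the `Info: 0x290ed6f8` value reported with the MEDIUM ERROR on da7.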