Date:      Tue, 22 Sep 2020 06:39:18 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 229745] ahcich: CAM status: Command timeout
Message-ID:  <bug-229745-227-kNNPUrV8Zw@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-229745-227@https.bugs.freebsd.org/bugzilla/>
References:  <bug-229745-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229745

Nicolas Richeton <nicolas.richeton@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nicolas.richeton@gmail.com

--- Comment #55 from Nicolas Richeton <nicolas.richeton@gmail.com> ---
Hello,

On FreeBSD 11.3-RELEASE-p11 (FreeNAS):

I have the same issue with an ASMedia ASM1062 AHCI SATA controller (2 SATA
ports; PCIe x1 card in a PCIe 2.0 slot).
2 drives are connected, configured as a ZFS mirror pool. I ran into this issue
when switching from 2 HDDs to 2 SSDs.

It seems related to the volume of data flowing through the controller:

- When 2 HDDs (SATA2) are used: everything is fine.

- When 2 SSDs (SATA3) are used: they are detected correctly, but when I start a
copy or a ZFS scrub, I get a lot of:

Sep 18 16:54:21 nas (ada1:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 b0 70 f1 cc 40 02 00 00 00 00 00
Sep 18 16:54:21 nas (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
Sep 18 16:54:21 nas (ada1:ahcich1:0:0:0): Retrying command
Sep 18 16:54:51 nas ahcich1: Timeout on slot 5 port 0
Sep 18 16:54:51 nas ahcich1: is 00000000 cs 00000000 ss 00000020 rs 00000020 tfd 40 serr 00000000 cmd 0004c517

Sep 18 16:54:51 nas (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 30 d0 6d e0 40 4f 00 00 00 00 00
Sep 18 16:54:51 nas (ada1:ahcich1:0:0:0): CAM status: Command timeout
Sep 18 16:54:51 nas (ada1:ahcich1:0:0:0): Retrying command
Sep 18 16:55:21 nas ahcich1: Timeout on slot 14 port 0
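
To read the register dump that follows a timeout more easily, here is a small Python sketch that expands the hexadecimal masks into slot numbers. It assumes that the `cs`, `ss` and `rs` fields are bitmasks with one bit per command slot, which is consistent with "Timeout on slot 5" appearing next to `ss 00000020` in the lines above. The function names are made up for illustration and are not part of any FreeBSD tool.

```python
import re

# Matches the "cs ... ss ... rs ..." part of an ahcich register dump line.
# Assumption: each field is a 32-bit mask with one bit per command slot.
REG_DUMP = re.compile(
    r"cs (?P<cs>[0-9a-f]{8}) ss (?P<ss>[0-9a-f]{8}) rs (?P<rs>[0-9a-f]{8})"
)


def slots_from_mask(mask):
    """Return the slot numbers whose bits are set in a 32-bit mask."""
    return [bit for bit in range(32) if mask & (1 << bit)]


def decode_register_dump(line):
    """Map each of cs/ss/rs to the list of slots set in that field."""
    match = REG_DUMP.search(line)
    if match is None:
        raise ValueError("not a register dump line")
    return {
        name: slots_from_mask(int(value, 16))
        for name, value in match.groupdict().items()
    }


line = (
    "Sep 18 16:54:51 nas ahcich1: is 00000000 cs 00000000 "
    "ss 00000020 rs 00000020 tfd 40 serr 00000000 cmd 0004c517"
)
print(decode_register_dump(line))
```

For the line above this reports slot 5 in both `ss` and `rs`, matching the slot named in the timeout message.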

And sometimes:
Sep 13 14:10:06 nas (aprobe0:ahcich1:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
Sep 13 14:10:06 nas (aprobe0:ahcich1:0:0:0): CAM status: Command timeout
Sep 13 14:10:06 nas (aprobe0:ahcich1:0:0:0): Retrying command

In extreme cases, I lose the drive, the pool goes degraded, and I have to
reboot to bring the disk back.

There are more messages when the I/O speed is high: a copy over the network
gives some messages, while a scrub gives a lot of messages and its progress
pauses (and this can result in the loss of one disk).
It can also affect the other drive (ada0), but there are more errors on the
second one (ada1).
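
To compare how often each drive is hit, the CAM status lines can be counted per device. The following Python sketch does that; it assumes the log line layout shown in this comment (`(device:channel:...): CAM status: <text>`), and the function name is hypothetical.

```python
import re
from collections import Counter

# Assumption: lines look like
#   "... (ada1:ahcich1:0:0:0): CAM status: Command timeout"
CAM_STATUS = re.compile(
    r"\((?P<periph>\w+):(?P<chan>ahcich\d+):[\d:]+\): CAM status: (?P<status>.+)$"
)


def tally_cam_status(lines):
    """Count CAM status messages per (device, channel, status text)."""
    counts = Counter()
    for line in lines:
        match = CAM_STATUS.search(line.rstrip())
        if match:
            key = (match.group("periph"), match.group("chan"), match.group("status"))
            counts[key] += 1
    return counts


sample = [
    "Sep 18 16:54:21 nas (ada1:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error",
    "Sep 18 16:54:21 nas (ada1:ahcich1:0:0:0): Retrying command",
    "Sep 18 16:54:51 nas (ada1:ahcich1:0:0:0): CAM status: Command timeout",
    "Sep 13 14:10:06 nas (aprobe0:ahcich1:0:0:0): CAM status: Command timeout",
]

for key, count in sorted(tally_cam_status(sample).items()):
    print(key, count)
```

The same function can be fed the lines of the system log instead of the `sample` list to get totals for ada0 and ada1 over a whole scrub.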

I changed the SSDs => same issue with SAMSUNG 860 EVO and Crucial MX500.
I changed the cables => same issue.
The BIOS is up to date (HP MicroServer Gen8).

- When I plug one of the SSDs into a SATA2-only port on the motherboard (so
1 SSD on SATA2 and 1 SSD on SATA3 on the ASMedia controller) => everything is
fine when I do a scrub, probably because ZFS waits for the slower drive => the
data flow is smaller.

- When I do a ZFS replace: HDD + HDD->SSD (3 drives connected during the
replace, 2 on the ASMedia controller and 1 on the motherboard): there are no
errors (the HDDs limit the speed). The issues start when the pool is SSD-only,
on SATA3.

Config:
Sep 13 15:11:29 nas ahci0: <ASMedia ASM1062 AHCI SATA controller> port
0x5000-0x5007,0x5008-0x500b,0x5010-0x5017,0x5018-0x501b,0x5020-0x503f mem
0xfbff0000-0xfbff01ff irq 16 at device 0.0 on pci1
Sep 13 15:11:29 nas ahci0: AHCI v1.20 with 2 6Gbps ports, Port Multiplier
supported
Sep 13 15:11:29 nas ahci0: quirks=0xc00000<NOCCS,NOAUX>
Sep 13 15:11:29 nas ahcich0: <AHCI channel> at channel 0 on ahci0
Sep 13 15:11:29 nas ahcich1: <AHCI channel> at channel 1 on ahci0
Sep 13 15:11:29 nas ahci1: <Intel Cougar Point (RAID) AHCI SATA controller>
port 0x1000-0x1007,0x1008-0x100b,0x1010-0x1017,0x1018-0x101b,0x1020-0x103f mem
0xfacd0000-0xfacd07ff irq 17 at device 31.2 on pci0
Sep 13 15:11:29 nas ahci1: AHCI v1.30 with 6 6Gbps ports, Port Multiplier
supported
Sep 13 15:11:29 nas ahcich2: <AHCI channel> at channel 0 on ahci1
Sep 13 15:11:29 nas ahcich3: <AHCI channel> at channel 1 on ahci1
Sep 13 15:11:29 nas ahcich4: <AHCI channel> at channel 2 on ahci1
Sep 13 15:11:29 nas ahcich5: <AHCI channel> at channel 3 on ahci1
Sep 13 15:11:29 nas ahcich6: <AHCI channel> at channel 4 on ahci1
Sep 13 15:11:29 nas ahcich7: <AHCI channel> at channel 5 on ahci1
Sep 13 15:11:29 nas ahciem0: <AHCI enclosure management bridge> on ahci1

-- 
You are receiving this mail because:
You are the assignee for the bug.


