Date:      Wed, 20 Dec 2017 11:09:52 -0500
From:      Ken Merry <ken@freebsd.org>
To:        Rebecca Cran <rebecca@bluestop.org>
Cc:        Alan Somers <asomers@freebsd.org>, FreeBSD-scsi <freebsd-scsi@freebsd.org>
Subject:   Re: FreeBSD 11 not sending repeated TURs until good status returned?
Message-ID:  <9209CA38-750D-4966-911F-342092309DDF@freebsd.org>
In-Reply-To: <30772d0e-2456-a90e-54a2-7575987b25b4@bluestop.org>
References:  <87ad0e0a-0183-d123-d7f3-2735c8cf854e@bluestop.org> <CAOtMX2jkxQ__2VYMsrj111Wza67ONQrrMPSmMPKFgugP%2BoYd-g@mail.gmail.com> <30772d0e-2456-a90e-54a2-7575987b25b4@bluestop.org>


> On Dec 19, 2017, at 7:53 PM, Rebecca Cran <rebecca@bluestop.org> wrote:
>
> On 12/19/2017 05:29 PM, Alan Somers wrote:
>
>> What's the problem exactly?  Does FreeBSD poll the device or not?
>> Does FreeBSD give up too soon, or poll with the wrong command, or
>> what?  And if you don't mind me asking, what sort of drive is this
>> that takes so long to come ready?
>
> FreeBSD thinks the device is ready before it really is, and ends up
> issuing read commands that fail, resulting in the device being removed.
> The drive is a SAS SSD, and I don't know why it takes longer than most
> to become ready.
>

I have seen this behavior on some HGST SSDs.  I haven’t had a chance to
fully chase it down.

The polling code is in there and is active in this case.  You can tell
because of this message:

>
>   Polling device for readiness


It will send a TUR every half second for a minute to wait for the device
to become ready, and then retry the read if the TURs succeeded.  I
*think* (I’d have to look more closely) it’ll retry the READ four more
times, and will go through the 1 minute TUR sequence each time.
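
To make the retry policy described above concrete, here is a rough
simulation of it in Python.  This is only a sketch of the behavior as
described in this message, not the CAM code itself: the 0.5 second
interval and 1 minute window come from the description, the count of
four READ retries is the tentative figure given above, and FakeDevice,
poll_until_ready and read_with_polling are hypothetical names made up
for the illustration.  Time is counted rather than slept, and giving up
when a polling window expires without a successful TUR is an assumption.

```python
from typing import Callable, Tuple

TUR_INTERVAL_S = 0.5   # one TUR every half second (from the message)
TUR_WINDOW_S = 60.0    # poll for up to a minute (from the message)
READ_RETRIES = 4       # tentative figure from the message, not verified


def poll_until_ready(send_tur: Callable[[], bool]) -> Tuple[bool, float]:
    """Send TURs at a fixed interval; return (ready, simulated seconds)."""
    max_turs = int(TUR_WINDOW_S / TUR_INTERVAL_S)
    for n in range(1, max_turs + 1):
        if send_tur():
            return True, n * TUR_INTERVAL_S
    return False, TUR_WINDOW_S


def read_with_polling(send_read: Callable[[], bool],
                      send_tur: Callable[[], bool]) -> Tuple[bool, float]:
    """Try a READ; on failure, poll with TURs and retry the READ."""
    elapsed = 0.0
    if send_read():
        return True, elapsed
    for _ in range(READ_RETRIES):
        ready, spent = poll_until_ready(send_tur)
        elapsed += spent
        if not ready:
            # Assumption: no good TUR within the window means give up.
            return False, elapsed
        if send_read():
            return True, elapsed
    return False, elapsed


class FakeDevice:
    """Hypothetical device that fails a fixed number of TURs, then is ready."""

    def __init__(self, turs_until_ready: int, read_works: bool = True):
        self.remaining = turs_until_ready
        self.read_works = read_works

    def tur(self) -> bool:
        if self.remaining > 0:
            self.remaining -= 1
            return False          # NOT READY
        return True               # GOOD status

    def read(self) -> bool:
        return self.read_works and self.remaining == 0


if __name__ == "__main__":
    dev = FakeDevice(turs_until_ready=3)
    print(read_with_polling(dev.read, dev.tur))
```

Passing FakeDevice(0, read_works=False) models the situation reported in
this thread, where the TUR returns good status but the READ still fails:
in this model the retries are used up after only four short polling
rounds, because every TUR succeeds immediately.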

But the mpssas_prepare_remove message indicates that this disk (or
another one) is getting removed by the controller.

IMO, the sense data probably means the SSD is doing something wrong.
An SSD should become ready before it turns on the SAS port.  The
initiator is going to try sending commands as soon as the port comes
active.  And if an SSD can’t come ready in a minute (spinning drives
take ~10 seconds to spin up), something is wrong.

We’ll probably need full logs to get a better idea of what is going on.

Ken
-- 
Ken Merry
ken@FreeBSD.ORG



