Date:      Wed, 13 Apr 2022 05:14:57 +0000 (UTC)
From:      mahesh mv <maheshm_v@yahoo.com>
To:        Chris <bsd-lists@bsdforge.com>
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: xhci USB transaction error and subsequent recovery mechanism on Freebsd stable/12
Message-ID:  <1601830847.251013.1649826897803@mail.yahoo.com>
In-Reply-To: <5fefe57150f9efab867a775722b9d71b@bsdforge.com>
References:  <1524993805.98701.1649776236883.ref@mail.yahoo.com> <1524993805.98701.1649776236883@mail.yahoo.com> <5fefe57150f9efab867a775722b9d71b@bsdforge.com>

Hi,

Thank you for the inputs. The drive is formatted as GPT, with ESP and UFS
partitions.

Thanks,
Mahesh

On Wednesday, 13 April 2022, 02:57:45 GMT+5:30, Chris <bsd-lists@bsdforge.com> wrote:

On 2022-04-12 08:10, mahesh mv wrote:
> Hi all,
>
> We need your help with an urgent issue that we are observing on FreeBSD
> stable/12. The DATA0/DATA1 packets are out of sync with respect to the
> endpoint (EP), and the system experiences READ(10) errors. Most of the
> time the READ(10) error recovers within a couple of retries, but in a few
> cases we have observed that the read retries are exhausted and the system
> moves to an unusable state (continuous g_vfs_done() errors). We are using
> Junos, but the xhci driver and related code are all pristine stable/12
> drivers with no Juniper-specific changes.
>
> This issue was never observed with Linux kernel 5.4.2 on the same hardware.
>
> Errors seen on the console:
>
> (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 28 cf 28 00 00 40 00
> (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 28 cf 28 00 00 40 00
> (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
>
> FreeBSD/arm (Amnesiac) (ttyu0)
>
> login:
>
> I can share the USB traces taken at the USB device if required.
>
> Thanks,
> Mahesh
I just replaced a drive 2 days ago that exhibited the same behavior. I
haven't yet checked the replaced drive for the cause, but what I chose to
do was as follows.

Get a new (known dependable) drive. Add it to the system and dump the data
on the failing disk to the new drive. At least you'll have a safe copy of it.

You didn't say how the drive(s) are formatted/laid out. Are you using
UFS/GPT or ZFS? How you proceed after making a safe copy will depend on how
you manage your disks.

UFS/GPT: simply remove the failing disk, and change the entry in fstab(5)
to point to the new disk.

ZFS: it should be enough to simply replace the failing disk with one at
least the size of the failing one and resilver.

HTH

--Chris
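
For reference, the READ(10) CDB printed in the console output quoted above can be decoded by hand to see which blocks the failing command targets. The following is only a minimal Python sketch, assuming the standard SCSI READ(10) layout (opcode 0x28 in byte 0, a big-endian logical block address in bytes 2-5, and a big-endian transfer length in bytes 7-8). The function name decode_read10 is hypothetical and is not part of FreeBSD or CAM.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Assumes the standard 10-byte READ(10) layout; this helper is
    illustrative only and is not part of any FreeBSD tool.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("READ(10) CDB must be exactly 10 bytes")
    if cdb[0] != 0x28:
        raise ValueError("opcode is not READ(10) (0x28)")
    return {
        "opcode": cdb[0],
        # Bytes 2..5: logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7..8: transfer length in logical blocks, big-endian.
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


# CDB copied from the console output in this thread.
info = decode_read10("28 00 00 28 cf 28 00 00 40 00")
print(f"lba={info['lba']} blocks={info['blocks']}")
```

Both retries in the log carry the same CDB, so under this decoding they appear to be the same request (64 blocks starting at LBA 2674472) being reissued rather than different reads failing.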

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1601830847.251013.1649826897803>