Date: Wed, 13 Apr 2022 05:14:57 +0000 (UTC)
From: mahesh mv <maheshm_v@yahoo.com>
To: Chris <bsd-lists@bsdforge.com>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: xhci USB transaction error and subsequent recovery mechanism on Freebsd stable/12
Message-ID: <1601830847.251013.1649826897803@mail.yahoo.com>
In-Reply-To: <5fefe57150f9efab867a775722b9d71b@bsdforge.com>
References: <1524993805.98701.1649776236883.ref@mail.yahoo.com> <1524993805.98701.1649776236883@mail.yahoo.com> <5fefe57150f9efab867a775722b9d71b@bsdforge.com>
[-- Attachment #1 --]
Hi,
Thank you for the input. The drive is formatted as GPT with ESP and UFS partitions.
Thanks,
Mahesh
On Wednesday, 13 April 2022, 02:57:45 GMT+5:30, Chris <bsd-lists@bsdforge.com> wrote:
On 2022-04-12 08:10, mahesh mv wrote:
> Hi all,
>
>
>
> We need your help with an urgent issue that we are observing on FreeBSD
> stable/12. The DATA0/DATA1 packets go out of sync with respect to the
> endpoint (EP), and the system then reports READ(10) errors. The READ(10)
> error recovers within a couple of retries most of the time, but in a few
> cases we have observed that the read retries are exhausted and the system
> moves to an unusable state (continuous g_vfs_done() errors). We are using
> Junos, but the xhci driver and related code are all pristine stable/12
> drivers with no Juniper-specific changes.
> This issue was never observed with Linux kernel 5.4.2 on the same HW.
> Errors seen on the console:
>
>
>
> (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 28 cf 28 00 00 40 00
>
> (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
>
> (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
>
> (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 28 cf 28 00 00 40 00
>
> (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
>
> (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
>
> FreeBSD/arm (Amnesiac) (ttyu0)
>
> login:
>
> I can share the USB traces taken at the USB device if required.
> Thanks,
> Mahesh
I just replaced a drive 2 days ago that exhibited the same behavior. I haven't
yet checked the replaced drive for the cause, but what I chose to do was as
follows. Get a new (known dependable) drive, add it to the system, and dump the
data on the failing disk to the new drive. At least you'll have a safe copy of it.
You didn't say how the drive(s) are formatted/laid out. Are you using UFS/GPT
or ZFS? How you proceed after making a safe copy will depend on how you manage
your disks.
UFS/GPT: simply remove the failing disk, and change the entry in fstab(5) to
point to the new disk.
ZFS: it should be enough to simply replace the failing disk with one at least
the size of the failing one and resilver.
HTH
--Chris
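
For anyone reading the quoted console errors: the CDB bytes can be decoded by
hand to see which blocks the failing READ(10) was asking for. Below is a
minimal sketch in Python, assuming the standard 10-byte READ(10) layout
(opcode 0x28, 4-byte big-endian LBA, 2-byte big-endian transfer length). The
function name decode_read10 and the 512-byte block size are assumptions made
for illustration; the thread does not state the drive's sector size.

```python
import struct

# Assumed block size; the thread does not state the drive's sector size.
ASSUMED_BLOCK_SIZE = 512


def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as a hex string such as '28 00 ...'."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("a READ(10) CDB is exactly 10 bytes long")
    # Layout: opcode, flags, LBA (4 bytes), group number,
    # transfer length (2 bytes), control. All multi-byte fields are big-endian.
    opcode, flags, lba, group, blocks, control = struct.unpack(">BBIBHB", cdb)
    if opcode != 0x28:
        raise ValueError("opcode is not READ(10) (0x28)")
    return {
        "lba": lba,
        "blocks": blocks,
        "bytes_assuming_512": blocks * ASSUMED_BLOCK_SIZE,
    }


# The CDB from the console output quoted above.
result = decode_read10("28 00 00 28 cf 28 00 00 40 00")
print(result)
```

For the CDB in the log this gives LBA 2674472 and a transfer length of 64
blocks, which would be 32 KiB if the blocks are 512 bytes. Both logged retries
carry the same CDB, so they target the same block range.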
