FreeBSD Mail Archives

Date:      Wed, 13 Apr 2022 05:14:57 +0000 (UTC)
From:      mahesh mv <maheshm_v@yahoo.com>
To:        Chris <bsd-lists@bsdforge.com>
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: xhci USB transaction error and subsequent recovery mechanism on Freebsd stable/12
Message-ID:  <1601830847.251013.1649826897803@mail.yahoo.com>
In-Reply-To: <5fefe57150f9efab867a775722b9d71b@bsdforge.com>
References:  <1524993805.98701.1649776236883.ref@mail.yahoo.com> <1524993805.98701.1649776236883@mail.yahoo.com> <5fefe57150f9efab867a775722b9d71b@bsdforge.com>


[-- Attachment #1 --]
Hi,

Thank you for the inputs. The drive is formatted as GPT with ESP and UFS partitions.

Thanks,
Mahesh

    On Wednesday, 13 April 2022, 02:57:45 GMT+5:30, Chris <bsd-lists@bsdforge.com> wrote:  
 
 On 2022-04-12 08:10, mahesh mv wrote:
> Hi all,
>
> We need your help with an urgent issue that we are observing on
> FreeBSD stable/12. DATA0/DATA1 are out of sync with respect to the EP,
> and the system experiences READ(10) errors. The READ(10) error recovers
> within a couple of retries most of the time, but in a few cases we have
> observed that the read retries get exhausted and the system moves to an
> unusable state (continuous g_vfs_done() errors). We are using Junos, but
> the xhci driver etc. are all pristine stable/12 drivers with no
> Juniper-specific changes.
>
> This issue was never observed with Linux kernel 5.4.2 on the same HW.
>
> Errors seen on console:
>
> (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 28 cf 28 00 00 40 00
> (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da0:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 28 cf 28 00 00 40 00
> (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da0:umass-sim0:0:0:0): Retrying command, 2 more tries remain
>
> FreeBSD/arm (Amnesiac) (ttyu0)
>
> login:
>
> I can share the USB traces taken at the USB device if required.
>
> Thanks,
> Mahesh

I just replaced a drive 2 days ago that exhibited the same behavior. I
haven't yet checked the replaced drive for the cause. But what I chose to
do was as follows.

Get a new (known dependable) drive. Add it to the system and dump the data
on the failing disk to the new drive. At least you'll have a safe copy of it.

You didn't say how the drive(s) are formatted/laid out. Are you using
UFS/GPT or ZFS? How you proceed after making a safe copy will depend on how
you manage your disks.

UFS/GPT: simply remove the failing disk, and change the entry in fstab(5)
to point to the new disk.

ZFS: it should be enough to simply replace the failing disk with one at
least the size of the failing one and resilver.

HTH

--Chris
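
For anyone reading the console output quoted above: the CDB bytes on the READ(10) lines can be decoded by hand using the standard SCSI READ(10) layout (opcode in byte 0, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8). The sketch below is only an illustration of that decoding; the function name decode_read10 is made up for this example and is not part of CAM or any FreeBSD tool.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Hypothetical helper for illustration only.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    return {
        "opcode": cdb[0],                          # 0x28 = READ(10)
        "lba": int.from_bytes(cdb[2:6], "big"),    # starting logical block
        "blocks": int.from_bytes(cdb[7:9], "big"), # transfer length in blocks
        "control": cdb[9],
    }


# CDB taken verbatim from the console log in the quoted message.
fields = decode_read10("28 00 00 28 cf 28 00 00 40 00")
print(hex(fields["lba"]), fields["blocks"])
```

Both retries in the log carry the same CDB, so the same 64-block read starting at LBA 0x28cf28 is being retried each time.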
  

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1601830847.251013.1649826897803>
