Date:      Tue, 25 Jun 2024 04:04:06 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 279978] After commit 25375b1415, any errors in device connected to ahci etc. results in Unretryable error
Message-ID:  <bug-279978-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=279978

            Bug ID: 279978
           Summary: After commit 25375b1415, any errors in device
                    connected to ahci etc. results in Unretryable error
           Product: Base System
           Version: 14.1-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: aono@cc.osaka-kyoiku.ac.jp

I have a half-broken HDD (ada2, connected to ahci1) in a FreeBSD-14.1 (p0)
server in my office.

> kernel: CPU: Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz (2672.84-MHz K8-class CPU)
> kernel:   Origin="GenuineIntel"  Id=0x106a4  Family=0x6  Model=0x1a  Stepping=4
> kernel: ahci1: <Intel ICH10 AHCI SATA controller> port 0x7c00-0x7c07,0x7880-0x7883,0x7800-0x7807,0x7480-0x7483,0x7400-0x741f mem 0xf7ffc000-0xf7ffc7ff irq 20 at device 31.2 on pci0
> kernel: ahci1: AHCI v1.20 with 6 3Gbps ports, Port Multiplier supported
> kernel: ahcich4: <AHCI channel> at channel 2 on ahci1
> kernel: ahciem0: <AHCI enclosure management bridge> on ahci1
> kernel: ses0 at ahciem0 bus 0 scbus9 target 0 lun 0
> kernel: ses0: <AHCI SGPIO Enclosure 2.00 0001> SEMB S-E-S 2.00 device
> kernel: ses0: SEMB SES Device
> kernel: ses0: ada2,pass2 in 'Slot 02', SATA Slot: scbus5 target 0
> kernel: ada2 at ahcich4 bus 0 scbus5 target 0 lun 0
> kernel: ada2: <WDC WD60EFRX-68L0BN1 82.00A82> ACS-2 ATA SATA 3.x device
> kernel: ada2: Serial Number WD-WX41DA5LVRR4
> kernel: ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> kernel: ada2: Command Queueing enabled
> kernel: ada2: 5723166MB (11721045168 512 byte sectors)
> kernel: ada2: quirks=0x1<4K>

When I read or write a bad sector using dd (with 'sysctl
kern.geom.debugflags=16' set), an 'Unretryable error' occurs and ada2
becomes inaccessible until I run 'camcontrol reset ada2'.

> kernel: (ada2:ahcich4:0:0:0): READ_FPDMA_QUEUED. ACB: 60 01 68 da 57 40 b3 00 00 00 00 00
> kernel: (ada2:ahcich4:0:0:0): CAM status: Auto-Sense Retrieval Failed
> kernel: (ada2:ahcich4:0:0:0): Error 5, Unretryable error

On FreeBSD-13.x, this error was retryable. (The following entries are
from older logs, so the sector and ACB differ.)

> kernel: (ada2:ahcich4:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e8 1b df 40 1f 01 00 08 00 00
> kernel: (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> kernel: (ada2:ahcich4:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> kernel: (ada2:ahcich4:0:0:0): RES: 41 40 b0 1c df 00 1f 01 00 00 00
> kernel: (ada2:ahcich4:0:0:0): Retrying command, 3 more tries remain
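
For readers unfamiliar with the register values in the 13.x log, the
following user-space sketch decodes a status or error byte into bit
names. It is only an illustration, not the kernel's own code: the
table and function names (status_names, error_names, decode_bits) are
made up here, and only the DRDY, ERR and UNC labels are confirmed by
the log line above. The remaining labels are assumed from the usual
ATA register layout.

```c
#include <stdint.h>
#include <string.h>

/*
 * Bit names, most significant bit first. Only DRDY, ERR and UNC are
 * confirmed by the log above; the other labels are assumptions.
 */
static const char *const status_names[8] = {
    "BSY", "DRDY", "DF", "SERV", "DRQ", "CORR", "IDX", "ERR"
};
static const char *const error_names[8] = {
    "ICRC", "UNC", "MC", "IDNF", "MCR", "ABRT", "NM", "ILI"
};

/*
 * Return the space-separated names of the bits set in reg.
 * The result lives in a static buffer that the next call overwrites
 * (good enough for a demo, not reentrant).
 */
static const char *
decode_bits(uint8_t reg, const char *const names[8])
{
    static char buf[64];

    buf[0] = '\0';
    for (int bit = 7; bit >= 0; bit--) {
        if ((reg & (1u << bit)) == 0)
            continue;
        if (buf[0] != '\0')
            strcat(buf, " ");
        strcat(buf, names[7 - bit]);
    }
    return (buf);
}
```

With these tables, decode_bits(0x41, status_names) gives "DRDY ERR"
and decode_bits(0x40, error_names) gives "UNC", matching the log:
the drive reported an uncorrectable data error.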

Commit 25375b1415 made the following change (shown for
/sys/dev/ahci/ahci.c only; it probably also affects siis and mvs):

diff --git a/sys/dev/ahci/ahci.c b/sys/dev/ahci/ahci.c
index 12e6ee8102da..d62a043eb2ab 100644
--- a/sys/dev/ahci/ahci.c
+++ b/sys/dev/ahci/ahci.c
@@ -2178,7 +2178,8 @@ completeall:
                ahci_reset(ch);
                return;
        }
-       ccb->ccb_h = ch->hold[i]->ccb_h;        /* Reuse old header. */
+       xpt_setup_ccb(&ccb->ccb_h, ch->hold[i]->ccb_h.path,
+           ch->hold[i]->ccb_h.pinfo.priority);
        if (ccb->ccb_h.func_code == XPT_ATA_IO) {
                /* READ LOG */
                ccb->ccb_h.recovery_type = RECOVERY_READ_LOG;

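The commit message says 'only field I see used from all the header is
target_id.' But func_code is needed by the 'if' statement on the very
next line. Because the old header is no longer copied, func_code always
has the same value (probably 0), so the 'if' condition (XPT_ATA_IO in
the code above) never matches and we always issue REQUEST SENSE in the
'else' block. This is problematic for ATA devices such as ada2, and it
would explain the 'Auto-Sense Retrieval Failed' status in the 14.1 log
above.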
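
The effect can be reproduced in a small user-space model. This is a
sketch under stated assumptions, not the kernel code: the mock_*
types, the MOCK_* constants and their values, and both helper
functions are invented for illustration. It also assumes the newly
allocated CCB starts out zeroed, which is why func_code ends up as 0.

```c
#include <string.h>

/* Mock function codes; the values are illustrative, not the kernel's. */
enum mock_func_code { MOCK_XPT_NOOP = 0, MOCK_XPT_SCSI_IO, MOCK_XPT_ATA_IO };
enum mock_recovery { MOCK_READ_LOG, MOCK_REQUEST_SENSE };

/* Mock CCB header holding only the fields this sketch needs. */
struct mock_ccb_hdr {
    enum mock_func_code func_code;
    int target_id;
    int priority;
};

/* Mirrors the branch in the diff: READ LOG for ATA, else REQUEST SENSE. */
static enum mock_recovery
pick_recovery(const struct mock_ccb_hdr *h)
{
    return (h->func_code == MOCK_XPT_ATA_IO ?
        MOCK_READ_LOG : MOCK_REQUEST_SENSE);
}

/* Before the commit: the whole held header is copied. */
static enum mock_recovery
recovery_with_header_copy(enum mock_func_code held_func)
{
    struct mock_ccb_hdr held = { held_func, 0, 1 };
    struct mock_ccb_hdr fresh = held;

    return (pick_recovery(&fresh));
}

/*
 * After the commit, as modeled here: a zeroed header receives only
 * the target and priority, so func_code stays 0 (assumption: the new
 * CCB starts zeroed).
 */
static enum mock_recovery
recovery_with_setup_only(enum mock_func_code held_func)
{
    struct mock_ccb_hdr held = { held_func, 0, 1 };
    struct mock_ccb_hdr fresh;

    memset(&fresh, 0, sizeof(fresh));
    fresh.target_id = held.target_id;
    fresh.priority = held.priority;
    return (pick_recovery(&fresh));
}
```

In this model, recovery_with_header_copy(MOCK_XPT_ATA_IO) selects
MOCK_READ_LOG, while recovery_with_setup_only(MOCK_XPT_ATA_IO)
selects MOCK_REQUEST_SENSE even though the held command was ATA.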

Copying more of the CCB header (at least func_code), or changing the
'if' condition (e.g. 'if (ch->hold[i]->ccb_h.func_code == XPT_ATA_IO) { ...'),
would solve this issue. I tried adding a call to xpt_merge_ccb() after
xpt_setup_ccb(); a kernel built with this change boots and seems to
work fine, but I'm not sure this is the right fix.
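
The two options can be compared in a user-space model. Again this is
only a sketch, not a proposed patch: the mock_* types, MOCK_*
constants and helper names are invented here, the new CCB is assumed
to start zeroed, and the explicit func_code assignment merely stands
in for "copying more of the CCB header". It does not model what
xpt_merge_ccb() actually copies.

```c
#include <string.h>

/* Mock function codes; the values are illustrative, not the kernel's. */
enum mock_func_code { MOCK_XPT_NOOP = 0, MOCK_XPT_SCSI_IO, MOCK_XPT_ATA_IO };
enum mock_recovery { MOCK_READ_LOG, MOCK_REQUEST_SENSE };

/* Mock CCB header holding only the fields this sketch needs. */
struct mock_ccb_hdr {
    enum mock_func_code func_code;
    int target_id;
    int priority;
};

/* Models the post-commit setup: func_code is left at 0. */
static void
mock_setup_ccb(struct mock_ccb_hdr *fresh, const struct mock_ccb_hdr *held)
{
    memset(fresh, 0, sizeof(*fresh));
    fresh->target_id = held->target_id;
    fresh->priority = held->priority;
}

/* Option 1: keep the setup as is, but branch on the held header. */
static enum mock_recovery
recovery_checking_held(enum mock_func_code held_func)
{
    struct mock_ccb_hdr held = { held_func, 0, 1 };
    struct mock_ccb_hdr fresh;

    mock_setup_ccb(&fresh, &held);
    return (held.func_code == MOCK_XPT_ATA_IO ?
        MOCK_READ_LOG : MOCK_REQUEST_SENSE);
}

/* Option 2: carry func_code across, then branch on the new header. */
static enum mock_recovery
recovery_copying_func_code(enum mock_func_code held_func)
{
    struct mock_ccb_hdr held = { held_func, 0, 1 };
    struct mock_ccb_hdr fresh;

    mock_setup_ccb(&fresh, &held);
    fresh.func_code = held.func_code;
    return (fresh.func_code == MOCK_XPT_ATA_IO ?
        MOCK_READ_LOG : MOCK_REQUEST_SENSE);
}
```

Both helpers select MOCK_READ_LOG for a held MOCK_XPT_ATA_IO command
and MOCK_REQUEST_SENSE for MOCK_XPT_SCSI_IO, so in this model either
approach restores the pre-commit choice of recovery command.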

-- 
You are receiving this mail because:
You are the assignee for the bug.


