Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 22 Aug 2014 12:56:36 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        Neel Natu <neelnatu@gmail.com>
Cc:        "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, Warner Losh <imp@freebsd.org>
Subject:   Re: svn commit: r270249 - head/sys/cam/ata
Message-ID:  <B2070E5E-749B-461F-8F2D-F3E23129F535@bsdimp.com>
In-Reply-To: <CAFgRE9HhqK77EXbs2HvW=fye6Vs_d=yrydho8F0mAf0AFnHE%2Bg@mail.gmail.com>
References:  <201408202258.s7KMwDh3073409@svn.freebsd.org> <CAFgRE9GvOMp4EmryGVJvPdZiUcKU0cZ3aajTfrEu8TJkAk2d-g@mail.gmail.com> <0DAF2357-4BBA-4D5B-8F17-D61845BACDA5@bsdimp.com> <CAFgRE9HA29K3NFwRoTRWf-UHq-NWSt=jGN0hX=aAwrdV5Qbqkg@mail.gmail.com> <118A680A-E4E4-4FEF-9C9C-44771F89A2D7@bsdimp.com> <CAFgRE9HhqK77EXbs2HvW=fye6Vs_d=yrydho8F0mAf0AFnHE%2Bg@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help

--Apple-Mail=_484F65C8-572B-4C8D-8AC3-164FA0965F65
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=windows-1252


On Aug 22, 2014, at 12:07 PM, Neel Natu <neelnatu@gmail.com> wrote:

> Hi Warner,
>=20
> On Fri, Aug 22, 2014 at 6:13 AM, Warner Losh <imp@bsdimp.com> wrote:
>>=20
>> On Aug 21, 2014, at 11:58 PM, Neel Natu <neelnatu@gmail.com> wrote:
>>=20
>>> Hi Warner,
>>>=20
>>> On Thu, Aug 21, 2014 at 10:34 PM, Warner Losh <imp@bsdimp.com> =
wrote:
>>>>=20
>>>> On Aug 21, 2014, at 10:31 PM, Neel Natu <neelnatu@gmail.com> wrote:
>>>>=20
>>>>> Hi Warner,
>>>>>=20
>>>>> On Wed, Aug 20, 2014 at 3:58 PM, Warner Losh <imp@freebsd.org> =
wrote:
>>>>>> Author: imp
>>>>>> Date: Wed Aug 20 22:58:12 2014
>>>>>> New Revision: 270249
>>>>>> URL: http://svnweb.freebsd.org/changeset/base/270249
>>>>>>=20
>>>>>> Log:
>>>>>> Turns out that IDENTIFY DEVICE and IDENTIFY PACKET DEVICE return =
data
>>>>>> that's only mostly similar. Specifically word 78 bits are defined =
for
>>>>>> IDENTIFY DEVICE as
>>>>>>      5 Supports Hardware Feature Control
>>>>>> while a IDENTIFY PACKET DEVICE defines them as
>>>>>>      5 Asynchronous notification supported
>>>>>> Therefore, only pay attention to bit 5 when we're talking to =
ATAPI
>>>>>> devices (we don't use the hardware feature control at this time).
>>>>>> Ignore it for ATA devices. Remove kludge that papered over this =
issue
>>>>>> for Samsung SATA SSDs, since Micron drives also have the bit set =
and
>>>>>> the error was caused by this bad interpretation of the spec =
(which is
>>>>>> quite easy to do, since bits aren't normally overlapping like =
this).
>>>>>>=20
>>>>>> Modified:
>>>>>> head/sys/cam/ata/ata_xpt.c
>>>>>>=20
>>>>>> Modified: head/sys/cam/ata/ata_xpt.c
>>>>>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D
>>>>>> --- head/sys/cam/ata/ata_xpt.c  Wed Aug 20 22:39:26 2014        =
(r270248)
>>>>>> +++ head/sys/cam/ata/ata_xpt.c  Wed Aug 20 22:58:12 2014        =
(r270249)
>>>>>> @@ -458,12 +458,18 @@ negotiate:
>>>>>>                  0, 0x02);
>>>>>>              break;
>>>>>>      case PROBE_SETAN:
>>>>>> -               /* Remember what transport thinks about AEN. */
>>>>>> -               if (softc->caps & CTS_SATA_CAPS_H_AN)
>>>>>> +               /*
>>>>>> +                * Only ATAPI defines this bit to mean AEN, but =
remember
>>>>>> +                * what transport thinks about AEN.
>>>>>> +                */
>>>>>> +               if ((softc->caps & CTS_SATA_CAPS_H_AN) &&
>>>>>> +                   periph->path->device->protocol =3D=3D =
PROTO_ATAPI)
>>>>>>                      path->device->inq_flags |=3D SID_AEN;
>>>>>>              else
>>>>>>                      path->device->inq_flags &=3D ~SID_AEN;
>>>>>>              xpt_async(AC_GETDEV_CHANGED, path, NULL);
>>>>>> +               if (periph->path->device->protocol !=3D =
PROTO_ATAPI)
>>>>>> +                       break;
>>>>>>              cam_fill_ataio(ataio,
>>>>>>                  1,
>>>>>>                  probedone,
>>>>>> @@ -750,14 +756,6 @@ out:
>>>>>>                      goto noerror;
>>>>>>=20
>>>>>>              /*
>>>>>> -                * Some Samsung SSDs report supported =
Asynchronous Notification,
>>>>>> -                * but return ABORT on attempt to enable it.
>>>>>> -                */
>>>>>> -               } else if (softc->action =3D=3D PROBE_SETAN &&
>>>>>> -                   status =3D=3D CAM_ATA_STATUS_ERROR) {
>>>>>> -                       goto noerror;
>>>>>> -
>>>>>> -               /*
>>>>>>               * SES and SAF-TE SEPs have different IDENTIFY =
commands,
>>>>>>               * but SATA specification doesn't tell how to =
identify them.
>>>>>>               * Until better way found, just try another if first =
fail.
>>>>>>=20
>>>>>=20
>>>>> This change causes a panic for me on boot. Here is the boot log:
>>>>>=20
>>>>> ahci0: <Intel Patsburg AHCI SATA controller> port
>>>>> =
0xf050-0xf057,0xf040-0xf043,0xf030-0xf037,0xf020-0xf023,0xf000-0xf01f
>>>>> mem 0xfbb21000-0xfbb217ff irq 18 at device 31.2 on pci0
>>>>> ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not =
supported
>>>>> ahcich0: <AHCI channel> at channel 0 on ahci0
>>>>> ahcich1: <AHCI channel> at channel 1 on ahci0
>>>>> ahcich2: <AHCI channel> at channel 2 on ahci0
>>>>> ahcich3: <AHCI channel> at channel 3 on ahci0
>>>>> ahcich4: <AHCI channel> at channel 4 on ahci0
>>>>> ahcich5: <AHCI channel> at channel 5 on ahci0
>>>>> ahciem0: <AHCI enclosure management bridge> on ahci0
>>>>> ...
>>>>> xpt_action_default: CCB type 0xdeadc0de not supported
>>>>> ...
>>>>> run_interrupt_driven_hooks: still waiting after 60 seconds for =
xpt_config
>>>>> run_interrupt_driven_hooks: still waiting after 120 seconds for =
xpt_config
>>>>> run_interrupt_driven_hooks: still waiting after 180 seconds for =
xpt_config
>>>>> run_interrupt_driven_hooks: still waiting after 240 seconds for =
xpt_config
>>>>> run_interrupt_driven_hooks: still waiting after 300 seconds for =
xpt_config
>>>>> panic: run_interrupt_driven_config_hooks: waited too long
>>>>> cpuid =3D 0
>>>>> KDB: stack backtrace:
>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame =
0xffffffff81d92920
>>>>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffffff81d929d0
>>>>> vpanic() at vpanic+0x189/frame 0xffffffff81d92a50
>>>>> kassert_panic() at kassert_panic+0x139/frame 0xffffffff81d92ac0
>>>>> boot_run_interrupt_driven_config_hooks() at
>>>>> boot_run_interrupt_driven_config_hooks+0x111/frame =
0xffffffff81d92b50
>>>>> mi_startup()fffff81d92b70
>>>>> btext() at btext+0x2c
>>>>> KDB: enter: panic
>>>>> [ thread pid 0 tid 100000 ]
>>>>> Stopped at      kdb_enter+0x3e: movq    $0,kdb_why
>>>>> db>
>>>>>=20
>>>>> The peripheral in question is a SATA attached CDROM:
>>>>>=20
>>>>> % camcontrol devlist
>>>>> <INTEL SSDSC2CW240A3 400i>         at scbus0 target 0 lun 0 =
(pass0,ada0)
>>>>> <ATAPI iHAS524   C LL23>           at scbus2 target 0 lun 0 =
(cd0,pass1)
>>>>> <WDC WD1000CHTZ-04JCPV0 04.06A00>  at scbus3 target 0 lun 0 =
(pass2,ada1)
>>>>> <Corsair Neutron GTX SSD M306>     at scbus4 target 0 lun 0 =
(pass3,ada2)
>>>>> <AHCI SGPIO Enclosure 1.00 0001>   at scbus6 target 0 lun 0 =
(ses0,pass4)
>>>>>=20
>>>>> pass1 at ahcich2 bus 0 scbus2 target 0 lun 0
>>>>> pass1: <ATAPI iHAS524   C LL23> Removable CD-ROM SCSI-0 device
>>>>> pass1: Serial Number 3524472 2N8225501140
>>>>> pass1: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO =
8192bytes)
>>>>>=20
>>>>> The following patch fixes the panic.
>>>>>=20
>>>>> Index: sys/cam/ata/ata_xpt.c
>>>>> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>>>>> --- sys/cam/ata/ata_xpt.c       (revision 270249)
>>>>> +++ sys/cam/ata/ata_xpt.c       (working copy)
>>>>> @@ -468,7 +468,8 @@
>>>>>              else
>>>>>                      path->device->inq_flags &=3D ~SID_AEN;
>>>>>              xpt_async(AC_GETDEV_CHANGED, path, NULL);
>>>>> -               if (periph->path->device->protocol !=3D =
PROTO_ATAPI)
>>>>> +               if (periph->path->device->protocol !=3D =
PROTO_ATAPI &&
>>>>> +                   periph->path->device->protocol !=3D =
PROTO_SCSI)
>>>>>                      break;
>>>>>              cam_fill_ataio(ataio,
>>>>>                  1,
>>>>=20
>>>> I think the more proper test is =3D=3D PROTO_ATA elsewhere, since =
that=92s what
>>>> distinguishes the ATA_IDENTIFY from the ATAPI_IDENTIFY.
>>>>=20
>>>>> However, there seem to be a couple of issues with the original =
patch:
>>>>>=20
>>>>> 1. The 'periph->path->device->protocol' is not initialized to
>>>>> PROTO_ATAPI anywhere in the tree so the not-equal-to test is  a =
no-op.
>>>>=20
>>>> We test here to determine which identify command to send:
>>>>=20
>>>>               if (periph->path->device->protocol =3D=3D PROTO_ATA)
>>>>                       ata_28bit_cmd(ataio, ATA_ATA_IDENTIFY, 0, 0, =
0);
>>>>               else
>>>>                       ata_28bit_cmd(ataio, ATA_ATAPI_IDENTIFY, 0, =
0, 0);
>>>>=20
>>>> and that is working to send the right command.
>>>>=20
>>>=20
>>> Yes, but PROTO_ATA !=3D PROTO_ATAPI :-)
>>>=20
>>> Since we never initialize 'periph->path->device->protocol' to
>>> 'PROTO_ATAPI' in -current:
>>=20
>> But this code appears to:
>>=20
>>        case PROBE_RESET:
>>        {
>>                int sign =3D (done_ccb->ataio.res.lba_high << 8) +
>>                    done_ccb->ataio.res.lba_mid;
>>                CAM_DEBUG(path, CAM_DEBUG_PROBE,
>>                    ("SIGNATURE: %04x\n", sign));
>>                if (sign =3D=3D 0x0000 &&
>>                    done_ccb->ccb_h.target_id !=3D 15) {
>>                        path->device->protocol =3D PROTO_ATA;
>>                        PROBE_SET_ACTION(softc, PROBE_IDENTIFY);
>>                } else if (sign =3D=3D 0x9669 &&
>>                    done_ccb->ccb_h.target_id =3D=3D 15) {
>>                        /* Report SIM that PM is present. */
>>                        bzero(&cts, sizeof(cts));
>>                        xpt_setup_ccb(&cts.ccb_h, path, =
CAM_PRIORITY_NONE);
>>                        cts.ccb_h.func_code =3D XPT_SET_TRAN_SETTINGS;
>>                        cts.type =3D CTS_TYPE_CURRENT_SETTINGS;
>>                        cts.xport_specific.sata.pm_present =3D 1;
>>                        cts.xport_specific.sata.valid =3D =
CTS_SATA_VALID_PM;
>>                        xpt_action((union ccb *)&cts);
>>                        path->device->protocol =3D PROTO_SATAPM;
>>                        PROBE_SET_ACTION(softc, PROBE_PM_PID);
>>                } else if (sign =3D=3D 0xc33c &&
>>                    done_ccb->ccb_h.target_id !=3D 15) {
>>                        path->device->protocol =3D PROTO_SEMB;
>>                        PROBE_SET_ACTION(softc, PROBE_IDENTIFY_SES);
>>                } else if (sign =3D=3D 0xeb14 &&
>>                    done_ccb->ccb_h.target_id !=3D 15) {
>>                        path->device->protocol =3D PROTO_SCSI;
>>                        PROBE_SET_ACTION(softc, PROBE_IDENTIFY);
>>                } else {
>>                        if (done_ccb->ccb_h.target_id !=3D 15) {
>>                                xpt_print(path,
>>                                    "Unexpected signature 0x%04x\n", =
sign);
>>                        }
>>                        goto device_fail;
>>                }
>>=20
>> what am I missing?
>>=20
>=20
> In the snippet above 'protocol' is set to one of PROTO_ATA,
> PROTO_SATAPM, PROTO_SEMB or PROTO_SCSI - none of which is PROTO_ATAPI
> :-)
>=20
> $ find sys -type f -exec grep -nH -w PROTO_ATAPI  {} \;
> sys/cam/scsi/scsi_pass.c:355: if (cgd->protocol =3D=3D PROTO_SCSI ||
> cgd->protocol =3D=3D PROTO_ATAPI)
> sys/cam/cam_ccb.h:249: PROTO_ATAPI, /* AT Attachment Packetized =
Interface */

Uggg.  OK. I=92ll look...


> best
> Neel
>=20
>>> if (protocol !=3D PROTO_ATAPI) equates to if (1)
>>> if (protocol =3D=3D PROTO_ATAPI) equates to if (0)
>>>=20
>>> I was trying to say that any code that compares 'protocol' to
>>> PROTO_ATAPI probably deserves a second look (e.g., the original =
patch
>>> that triggered this panic).
>>=20
>> Yes, but I think you=92re analysis was incorrect on this point :)
>>=20
>>>>> 2. It seems not right to break out of switch in 'probestart()' =
without
>>>>> providing a way for 'probedone()' to be called. I believe that =
this
>>>>> stops the state machine from making forward progress and results =
in
>>>>> 'xpt_config()' not completing.
>>>>=20
>>>> That=92s a problem, you=92re right. Let me rework.
>>>>=20
>>>>> If you need more information to debug this some more or test a =
proper
>>>>> fix then I am happy to help.
>>>>=20
>>>> Please try the one included here. I think it will address things. =
I=92ve tried it on one system, and am trying it on others in parallel to =
sending this.
>>>>=20
>>>=20
>>> Yup, works fine. Thanks for the quick fix!
>>=20
>> Will push it in. Thanks.
>>=20
>> Warner


--Apple-Mail=_484F65C8-572B-4C8D-8AC3-164FA0965F65
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
	filename=signature.asc
Content-Type: application/pgp-signature;
	name=signature.asc
Content-Description: Message signed with OpenPGP using GPGMail

-----BEGIN PGP SIGNATURE-----
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJT95JlAAoJEGwc0Sh9sBEAgn4P/Rhg/LsA2xDxdA5O/Ej3j0oJ
F5bQHo4cB+Hw3uUbT/tbu3BuUqnqiwRoMW2/MIxsY0EB/pVeHqFmwrNidVuWmUAd
Drsam7ndFNhpchcMXjgTtjXnvk4LdI88dTD+2A4pGY0OPWMxNTFONUGrh8S/cPIN
DFD4ise8jnv+/qj0P/Hi8yHOJshgu0UOXVIg6LLYXiVgOBOUzH2Dip1FVDT4j/Dj
ruJjQ1OnRhgRo5W7u1bfpR3RpI4BH3FoYxdt0GBgepwpUjU5lKpXPO4J2oFWbIGO
EVdy8lxBNbZGCNCQdrH/5E4TfgUXC7E+L9SROS+eno+SX+Jkoy7LeUoRt7DYiW4L
59gbXC0BM74mIU13KDegorIPxZLN4tryp933EihrtUGYq0tkeb+ZnHFdHCl1gyJ5
lBnwLwKz1362yJ2h/EEk+e+kcJe/g5OoaW5trSxtNcmV2v/2dmeTysiK+2FVf03m
mElhrqJO/YZ71+pEH66DA5jMaVh4O9ef60EuYpXCoMgg4Od1jD2tYsdpGhUE0nzf
gWnFiDR/3pTMKkmv9SQtSl0ra5L0HI5/YZM9gMn1oEVGxHZ9Kw0zIpMg4qTPLlrt
n7e+Udyb2iYNyezC8tfJaIIjnYoCQLB6bwxISQ4NKVP48J1pfLOYfZMRtsSkczGt
NfMnL9ifmwAGan1fCG2A
=Licb
-----END PGP SIGNATURE-----

--Apple-Mail=_484F65C8-572B-4C8D-8AC3-164FA0965F65--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?B2070E5E-749B-461F-8F2D-F3E23129F535>