Date: Fri, 22 Aug 2014 07:13:38 -0600 From: Warner Losh <imp@bsdimp.com> To: Neel Natu <neelnatu@gmail.com> Cc: "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, Warner Losh <imp@freebsd.org> Subject: Re: svn commit: r270249 - head/sys/cam/ata Message-ID: <118A680A-E4E4-4FEF-9C9C-44771F89A2D7@bsdimp.com> In-Reply-To: <CAFgRE9HA29K3NFwRoTRWf-UHq-NWSt=jGN0hX=aAwrdV5Qbqkg@mail.gmail.com> References: <201408202258.s7KMwDh3073409@svn.freebsd.org> <CAFgRE9GvOMp4EmryGVJvPdZiUcKU0cZ3aajTfrEu8TJkAk2d-g@mail.gmail.com> <0DAF2357-4BBA-4D5B-8F17-D61845BACDA5@bsdimp.com> <CAFgRE9HA29K3NFwRoTRWf-UHq-NWSt=jGN0hX=aAwrdV5Qbqkg@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
--Apple-Mail=_D756C39A-C97E-4CE5-B1D4-FC15F18D796C Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=windows-1252 On Aug 21, 2014, at 11:58 PM, Neel Natu <neelnatu@gmail.com> wrote: > Hi Warner, >=20 > On Thu, Aug 21, 2014 at 10:34 PM, Warner Losh <imp@bsdimp.com> wrote: >>=20 >> On Aug 21, 2014, at 10:31 PM, Neel Natu <neelnatu@gmail.com> wrote: >>=20 >>> Hi Warner, >>>=20 >>> On Wed, Aug 20, 2014 at 3:58 PM, Warner Losh <imp@freebsd.org> = wrote: >>>> Author: imp >>>> Date: Wed Aug 20 22:58:12 2014 >>>> New Revision: 270249 >>>> URL: http://svnweb.freebsd.org/changeset/base/270249 >>>>=20 >>>> Log: >>>> Turns out that IDENTIFY DEVICE and IDENTIFY PACKET DEVICE return = data >>>> that's only mostly similar. Specifically word 78 bits are defined = for >>>> IDENTIFY DEVICE as >>>> 5 Supports Hardware Feature Control >>>> while a IDENTIFY PACKET DEVICE defines them as >>>> 5 Asynchronous notification supported >>>> Therefore, only pay attention to bit 5 when we're talking to ATAPI >>>> devices (we don't use the hardware feature control at this time). >>>> Ignore it for ATA devices. Remove kludge that papered over this = issue >>>> for Samsung SATA SSDs, since Micron drives also have the bit set = and >>>> the error was caused by this bad interpretation of the spec (which = is >>>> quite easy to do, since bits aren't normally overlapping like = this). >>>>=20 >>>> Modified: >>>> head/sys/cam/ata/ata_xpt.c >>>>=20 >>>> Modified: head/sys/cam/ata/ata_xpt.c >>>> = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D >>>> --- head/sys/cam/ata/ata_xpt.c Wed Aug 20 22:39:26 2014 = (r270248) >>>> +++ head/sys/cam/ata/ata_xpt.c Wed Aug 20 22:58:12 2014 = (r270249) >>>> @@ -458,12 +458,18 @@ negotiate: >>>> 0, 0x02); >>>> break; >>>> case PROBE_SETAN: >>>> - /* Remember what transport thinks about AEN. */ >>>> - if (softc->caps & CTS_SATA_CAPS_H_AN) >>>> + /* >>>> + * Only ATAPI defines this bit to mean AEN, but = remember >>>> + * what transport thinks about AEN. >>>> + */ >>>> + if ((softc->caps & CTS_SATA_CAPS_H_AN) && >>>> + periph->path->device->protocol =3D=3D = PROTO_ATAPI) >>>> path->device->inq_flags |=3D SID_AEN; >>>> else >>>> path->device->inq_flags &=3D ~SID_AEN; >>>> xpt_async(AC_GETDEV_CHANGED, path, NULL); >>>> + if (periph->path->device->protocol !=3D = PROTO_ATAPI) >>>> + break; >>>> cam_fill_ataio(ataio, >>>> 1, >>>> probedone, >>>> @@ -750,14 +756,6 @@ out: >>>> goto noerror; >>>>=20 >>>> /* >>>> - * Some Samsung SSDs report supported Asynchronous = Notification, >>>> - * but return ABORT on attempt to enable it. >>>> - */ >>>> - } else if (softc->action =3D=3D PROBE_SETAN && >>>> - status =3D=3D CAM_ATA_STATUS_ERROR) { >>>> - goto noerror; >>>> - >>>> - /* >>>> * SES and SAF-TE SEPs have different IDENTIFY = commands, >>>> * but SATA specification doesn't tell how to = identify them. >>>> * Until better way found, just try another if first = fail. >>>>=20 >>>=20 >>> This change causes a panic for me on boot. Here is the boot log: >>>=20 >>> ahci0: <Intel Patsburg AHCI SATA controller> port >>> = 0xf050-0xf057,0xf040-0xf043,0xf030-0xf037,0xf020-0xf023,0xf000-0xf01f >>> mem 0xfbb21000-0xfbb217ff irq 18 at device 31.2 on pci0 >>> ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported >>> ahcich0: <AHCI channel> at channel 0 on ahci0 >>> ahcich1: <AHCI channel> at channel 1 on ahci0 >>> ahcich2: <AHCI channel> at channel 2 on ahci0 >>> ahcich3: <AHCI channel> at channel 3 on ahci0 >>> ahcich4: <AHCI channel> at channel 4 on ahci0 >>> ahcich5: <AHCI channel> at channel 5 on ahci0 >>> ahciem0: <AHCI enclosure management bridge> on ahci0 >>> ... >>> xpt_action_default: CCB type 0xdeadc0de not supported >>> ... >>> run_interrupt_driven_hooks: still waiting after 60 seconds for = xpt_config >>> run_interrupt_driven_hooks: still waiting after 120 seconds for = xpt_config >>> run_interrupt_driven_hooks: still waiting after 180 seconds for = xpt_config >>> run_interrupt_driven_hooks: still waiting after 240 seconds for = xpt_config >>> run_interrupt_driven_hooks: still waiting after 300 seconds for = xpt_config >>> panic: run_interrupt_driven_config_hooks: waited too long >>> cpuid =3D 0 >>> KDB: stack backtrace: >>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame = 0xffffffff81d92920 >>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffffff81d929d0 >>> vpanic() at vpanic+0x189/frame 0xffffffff81d92a50 >>> kassert_panic() at kassert_panic+0x139/frame 0xffffffff81d92ac0 >>> boot_run_interrupt_driven_config_hooks() at >>> boot_run_interrupt_driven_config_hooks+0x111/frame = 0xffffffff81d92b50 >>> mi_startup()fffff81d92b70 >>> btext() at btext+0x2c >>> KDB: enter: panic >>> [ thread pid 0 tid 100000 ] >>> Stopped at kdb_enter+0x3e: movq $0,kdb_why >>> db> >>>=20 >>> The peripheral in question is a SATA attached CDROM: >>>=20 >>> % camcontrol devlist >>> <INTEL SSDSC2CW240A3 400i> at scbus0 target 0 lun 0 = (pass0,ada0) >>> <ATAPI iHAS524 C LL23> at scbus2 target 0 lun 0 = (cd0,pass1) >>> <WDC WD1000CHTZ-04JCPV0 04.06A00> at scbus3 target 0 lun 0 = (pass2,ada1) >>> <Corsair Neutron GTX SSD M306> at scbus4 target 0 lun 0 = (pass3,ada2) >>> <AHCI SGPIO Enclosure 1.00 0001> at scbus6 target 0 lun 0 = (ses0,pass4) >>>=20 >>> pass1 at ahcich2 bus 0 scbus2 target 0 lun 0 >>> pass1: <ATAPI iHAS524 C LL23> Removable CD-ROM SCSI-0 device >>> pass1: Serial Number 3524472 2N8225501140 >>> pass1: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO = 8192bytes) >>>=20 >>> The following patch fixes the panic. >>>=20 >>> Index: sys/cam/ata/ata_xpt.c >>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>> --- sys/cam/ata/ata_xpt.c (revision 270249) >>> +++ sys/cam/ata/ata_xpt.c (working copy) >>> @@ -468,7 +468,8 @@ >>> else >>> path->device->inq_flags &=3D ~SID_AEN; >>> xpt_async(AC_GETDEV_CHANGED, path, NULL); >>> - if (periph->path->device->protocol !=3D PROTO_ATAPI) >>> + if (periph->path->device->protocol !=3D PROTO_ATAPI = && >>> + periph->path->device->protocol !=3D PROTO_SCSI) >>> break; >>> cam_fill_ataio(ataio, >>> 1, >>=20 >> I think the more proper test is =3D=3D PROTO_ATA elsewhere, since = that=92s what >> distinguishes the ATA_IDENTIFY from the ATAPI_IDENTIFY. >>=20 >>> However, there seem to be a couple of issues with the original = patch: >>>=20 >>> 1. The 'periph->path->device->protocol' is not initialized to >>> PROTO_ATAPI anywhere in the tree so the not-equal-to test is a = no-op. >>=20 >> We test here to determine which identify command to send: >>=20 >> if (periph->path->device->protocol =3D=3D PROTO_ATA) >> ata_28bit_cmd(ataio, ATA_ATA_IDENTIFY, 0, 0, = 0); >> else >> ata_28bit_cmd(ataio, ATA_ATAPI_IDENTIFY, 0, 0, = 0); >>=20 >> and that is working to send the right command. >>=20 >=20 > Yes, but PROTO_ATA !=3D PROTO_ATAPI :-) >=20 > Since we never initialize 'periph->path->device->protocol' to > 'PROTO_ATAPI' in -current: But this code appears to: case PROBE_RESET: { int sign =3D (done_ccb->ataio.res.lba_high << 8) + done_ccb->ataio.res.lba_mid; CAM_DEBUG(path, CAM_DEBUG_PROBE, ("SIGNATURE: %04x\n", sign)); if (sign =3D=3D 0x0000 && done_ccb->ccb_h.target_id !=3D 15) { path->device->protocol =3D PROTO_ATA; PROBE_SET_ACTION(softc, PROBE_IDENTIFY); } else if (sign =3D=3D 0x9669 && done_ccb->ccb_h.target_id =3D=3D 15) { /* Report SIM that PM is present. */ bzero(&cts, sizeof(cts)); xpt_setup_ccb(&cts.ccb_h, path, = CAM_PRIORITY_NONE); cts.ccb_h.func_code =3D XPT_SET_TRAN_SETTINGS; cts.type =3D CTS_TYPE_CURRENT_SETTINGS; cts.xport_specific.sata.pm_present =3D 1; cts.xport_specific.sata.valid =3D = CTS_SATA_VALID_PM; xpt_action((union ccb *)&cts); path->device->protocol =3D PROTO_SATAPM; PROBE_SET_ACTION(softc, PROBE_PM_PID); } else if (sign =3D=3D 0xc33c && done_ccb->ccb_h.target_id !=3D 15) { path->device->protocol =3D PROTO_SEMB; PROBE_SET_ACTION(softc, PROBE_IDENTIFY_SES); } else if (sign =3D=3D 0xeb14 && done_ccb->ccb_h.target_id !=3D 15) { path->device->protocol =3D PROTO_SCSI; PROBE_SET_ACTION(softc, PROBE_IDENTIFY); } else { if (done_ccb->ccb_h.target_id !=3D 15) { xpt_print(path, "Unexpected signature 0x%04x\n", = sign); } goto device_fail; } =20 what am I missing? > if (protocol !=3D PROTO_ATAPI) equates to if (1) > if (protocol =3D=3D PROTO_ATAPI) equates to if (0) >=20 > I was trying to say that any code that compares 'protocol' to > PROTO_ATAPI probably deserves a second look (e.g., the original patch > that triggered this panic). Yes, but I think you=92re analysis was incorrect on this point :) >>> 2. It seems not right to break out of switch in 'probestart()' = without >>> providing a way for 'probedone()' to be called. I believe that this >>> stops the state machine from making forward progress and results in >>> 'xpt_config()' not completing. >>=20 >> That=92s a problem, you=92re right. Let me rework. >>=20 >>> If you need more information to debug this some more or test a = proper >>> fix then I am happy to help. >>=20 >> Please try the one included here. I think it will address things. = I=92ve tried it on one system, and am trying it on others in parallel to = sending this. >>=20 >=20 > Yup, works fine. Thanks for the quick fix! Will push it in. Thanks. Warner --Apple-Mail=_D756C39A-C97E-4CE5-B1D4-FC15F18D796C Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQIcBAEBCgAGBQJT90ICAAoJEGwc0Sh9sBEA8jcP/0H4en1mQqKXPCEG2qOoIGqv R56TfUe+OFqfXsyyPfmur55gvxJLpap7WBiAOQMj5JNnl3/wSwwpv2kvMqT3kXaU MfejYtOhKdbpR13v5SCL5wnIl9hak9I2SmuF+q9UO8igHu9bUFIK4IzwGL4z7Ebh BvLSYVPmpeP+UBnmi7aCJwS10/GhGwjBUQexfFx1dDyp4l85onB9fXeVu4SyDEzW 2D1eHWVnL8mEHxGFLmTsbuxZ3c6rhWjTvhlaTUhdu+7gPKdxKGkTW2H8e6gzwTeR mfknejkw2KUK7aUYWf0BZc87WYkLHGrj8zD0HV6/y2M7E+2bS4gBnmFvELTYaZaY 0X7mTnUQwFw9T+N1/etUgrKqgUM7T0sWsAWyYBSmXyTfxQQFMWgu4QTx7jlqNM0L d+IEE0WMtFHieEB7pJJ50rrCxE9XVHHOOdQmqEa3GJg1K/W8QQufADhIXJXUF4a5 ayIDqkjjYNXog0sq1OT35akFnn8Ir1lJrAVgOXli2SH4q+H0ESojjjIjo8dxcPCM UkXYiMOElu3gBzJG2zihSUcnranqe45D+RvUl1quuv1EYM9NNEfTHqLsP04+kk1y 6JZyfu8vlY9qShJKodPdPiqWcMiAFn6yvIIeGgbrfRw8S1pqoDmWkG5WFr3qNSSK cG7pVBccVjxj+AzsPTIZ =35y7 -----END PGP SIGNATURE----- --Apple-Mail=_D756C39A-C97E-4CE5-B1D4-FC15F18D796C--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?118A680A-E4E4-4FEF-9C9C-44771F89A2D7>