From owner-svn-src-head@FreeBSD.ORG Fri Aug 22 18:07:57 2014 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1C63692A; Fri, 22 Aug 2014 18:07:57 +0000 (UTC) Received: from mail-qg0-x22a.google.com (mail-qg0-x22a.google.com [IPv6:2607:f8b0:400d:c04::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 9C6B33233; Fri, 22 Aug 2014 18:07:56 +0000 (UTC) Received: by mail-qg0-f42.google.com with SMTP id j5so10854150qga.1 for ; Fri, 22 Aug 2014 11:07:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=8htiSGRWrKRC4xEtwR2p7gjEg1IePvqVrm4S+NYnOLQ=; b=iJM/Sb+mODo6EsqEkkTHU8T7+efKfq4cySz4zpdALC/zgGbCz3LZTZ36QwU9URgaVA /JeXrOMCf9py5JYppaax2OAVc15BcoOdYlUyUeHfrBd13RJVZ6arUoajm3yG/tmbIs94 P1HEhGfv21yhKMNrs1NufrsHaKNCxgv+4Q5F+mAXCzakVJq+nPAOnE4K7dTC50wWcs4z YpRG3uf2q3hG4/cz6m1FJ+MdEgKMAf5RRRj8tZyAY80anjFlnEkwa7jkemDzgidkh+iO tSipHxawnyEGcHpdKja8Qb5rIKYWys4wgz/P/zZrlzlU+SmtFgQ/BfgVhhnLXVf8wjiT st1w== MIME-Version: 1.0 X-Received: by 10.229.97.67 with SMTP id k3mr10444860qcn.1.1408730875393; Fri, 22 Aug 2014 11:07:55 -0700 (PDT) Received: by 10.140.84.80 with HTTP; Fri, 22 Aug 2014 11:07:55 -0700 (PDT) In-Reply-To: <118A680A-E4E4-4FEF-9C9C-44771F89A2D7@bsdimp.com> References: <201408202258.s7KMwDh3073409@svn.freebsd.org> <0DAF2357-4BBA-4D5B-8F17-D61845BACDA5@bsdimp.com> <118A680A-E4E4-4FEF-9C9C-44771F89A2D7@bsdimp.com> Date: Fri, 22 Aug 2014 11:07:55 -0700 Message-ID: Subject: Re: svn commit: r270249 - head/sys/cam/ata From: Neel Natu To: Warner Losh Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: "svn-src-head@freebsd.org" , "svn-src-all@freebsd.org" , "src-committers@freebsd.org" , Warner Losh X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Aug 2014 18:07:57 -0000 Hi Warner, On Fri, Aug 22, 2014 at 6:13 AM, Warner Losh wrote: > > On Aug 21, 2014, at 11:58 PM, Neel Natu wrote: > >> Hi Warner, >> >> On Thu, Aug 21, 2014 at 10:34 PM, Warner Losh wrote: >>> >>> On Aug 21, 2014, at 10:31 PM, Neel Natu wrote: >>> >>>> Hi Warner, >>>> >>>> On Wed, Aug 20, 2014 at 3:58 PM, Warner Losh wrote: >>>>> Author: imp >>>>> Date: Wed Aug 20 22:58:12 2014 >>>>> New Revision: 270249 >>>>> URL: http://svnweb.freebsd.org/changeset/base/270249 >>>>> >>>>> Log: >>>>> Turns out that IDENTIFY DEVICE and IDENTIFY PACKET DEVICE return data >>>>> that's only mostly similar. Specifically word 78 bits are defined for >>>>> IDENTIFY DEVICE as >>>>> 5 Supports Hardware Feature Control >>>>> while a IDENTIFY PACKET DEVICE defines them as >>>>> 5 Asynchronous notification supported >>>>> Therefore, only pay attention to bit 5 when we're talking to ATAPI >>>>> devices (we don't use the hardware feature control at this time). >>>>> Ignore it for ATA devices. Remove kludge that papered over this issue >>>>> for Samsung SATA SSDs, since Micron drives also have the bit set and >>>>> the error was caused by this bad interpretation of the spec (which is >>>>> quite easy to do, since bits aren't normally overlapping like this). >>>>> >>>>> Modified: >>>>> head/sys/cam/ata/ata_xpt.c >>>>> >>>>> Modified: head/sys/cam/ata/ata_xpt.c >>>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D >>>>> --- head/sys/cam/ata/ata_xpt.c Wed Aug 20 22:39:26 2014 (r270= 248) >>>>> +++ head/sys/cam/ata/ata_xpt.c Wed Aug 20 22:58:12 2014 (r270= 249) >>>>> @@ -458,12 +458,18 @@ negotiate: >>>>> 0, 0x02); >>>>> break; >>>>> case PROBE_SETAN: >>>>> - /* Remember what transport thinks about AEN. */ >>>>> - if (softc->caps & CTS_SATA_CAPS_H_AN) >>>>> + /* >>>>> + * Only ATAPI defines this bit to mean AEN, but remem= ber >>>>> + * what transport thinks about AEN. >>>>> + */ >>>>> + if ((softc->caps & CTS_SATA_CAPS_H_AN) && >>>>> + periph->path->device->protocol =3D=3D PROTO_ATAPI= ) >>>>> path->device->inq_flags |=3D SID_AEN; >>>>> else >>>>> path->device->inq_flags &=3D ~SID_AEN; >>>>> xpt_async(AC_GETDEV_CHANGED, path, NULL); >>>>> + if (periph->path->device->protocol !=3D PROTO_ATAPI) >>>>> + break; >>>>> cam_fill_ataio(ataio, >>>>> 1, >>>>> probedone, >>>>> @@ -750,14 +756,6 @@ out: >>>>> goto noerror; >>>>> >>>>> /* >>>>> - * Some Samsung SSDs report supported Asynchronous No= tification, >>>>> - * but return ABORT on attempt to enable it. >>>>> - */ >>>>> - } else if (softc->action =3D=3D PROBE_SETAN && >>>>> - status =3D=3D CAM_ATA_STATUS_ERROR) { >>>>> - goto noerror; >>>>> - >>>>> - /* >>>>> * SES and SAF-TE SEPs have different IDENTIFY commands= , >>>>> * but SATA specification doesn't tell how to identify = them. >>>>> * Until better way found, just try another if first fa= il. >>>>> >>>> >>>> This change causes a panic for me on boot. Here is the boot log: >>>> >>>> ahci0: port >>>> 0xf050-0xf057,0xf040-0xf043,0xf030-0xf037,0xf020-0xf023,0xf000-0xf01f >>>> mem 0xfbb21000-0xfbb217ff irq 18 at device 31.2 on pci0 >>>> ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported >>>> ahcich0: at channel 0 on ahci0 >>>> ahcich1: at channel 1 on ahci0 >>>> ahcich2: at channel 2 on ahci0 >>>> ahcich3: at channel 3 on ahci0 >>>> ahcich4: at channel 4 on ahci0 >>>> ahcich5: at channel 5 on ahci0 >>>> ahciem0: on ahci0 >>>> ... >>>> xpt_action_default: CCB type 0xdeadc0de not supported >>>> ... >>>> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_con= fig >>>> run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_co= nfig >>>> run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_co= nfig >>>> run_interrupt_driven_hooks: still waiting after 240 seconds for xpt_co= nfig >>>> run_interrupt_driven_hooks: still waiting after 300 seconds for xpt_co= nfig >>>> panic: run_interrupt_driven_config_hooks: waited too long >>>> cpuid =3D 0 >>>> KDB: stack backtrace: >>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff= 81d92920 >>>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffffff81d929d0 >>>> vpanic() at vpanic+0x189/frame 0xffffffff81d92a50 >>>> kassert_panic() at kassert_panic+0x139/frame 0xffffffff81d92ac0 >>>> boot_run_interrupt_driven_config_hooks() at >>>> boot_run_interrupt_driven_config_hooks+0x111/frame 0xffffffff81d92b50 >>>> mi_startup()fffff81d92b70 >>>> btext() at btext+0x2c >>>> KDB: enter: panic >>>> [ thread pid 0 tid 100000 ] >>>> Stopped at kdb_enter+0x3e: movq $0,kdb_why >>>> db> >>>> >>>> The peripheral in question is a SATA attached CDROM: >>>> >>>> % camcontrol devlist >>>> at scbus0 target 0 lun 0 (pass0,ada= 0) >>>> at scbus2 target 0 lun 0 (cd0,pass1= ) >>>> at scbus3 target 0 lun 0 (pass2,ada= 1) >>>> at scbus4 target 0 lun 0 (pass3,ada= 2) >>>> at scbus6 target 0 lun 0 (ses0,pass= 4) >>>> >>>> pass1 at ahcich2 bus 0 scbus2 target 0 lun 0 >>>> pass1: Removable CD-ROM SCSI-0 device >>>> pass1: Serial Number 3524472 2N8225501140 >>>> pass1: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192= bytes) >>>> >>>> The following patch fixes the panic. >>>> >>>> Index: sys/cam/ata/ata_xpt.c >>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>>> --- sys/cam/ata/ata_xpt.c (revision 270249) >>>> +++ sys/cam/ata/ata_xpt.c (working copy) >>>> @@ -468,7 +468,8 @@ >>>> else >>>> path->device->inq_flags &=3D ~SID_AEN; >>>> xpt_async(AC_GETDEV_CHANGED, path, NULL); >>>> - if (periph->path->device->protocol !=3D PROTO_ATAPI) >>>> + if (periph->path->device->protocol !=3D PROTO_ATAPI && >>>> + periph->path->device->protocol !=3D PROTO_SCSI) >>>> break; >>>> cam_fill_ataio(ataio, >>>> 1, >>> >>> I think the more proper test is =3D=3D PROTO_ATA elsewhere, since that= =E2=80=99s what >>> distinguishes the ATA_IDENTIFY from the ATAPI_IDENTIFY. >>> >>>> However, there seem to be a couple of issues with the original patch: >>>> >>>> 1. The 'periph->path->device->protocol' is not initialized to >>>> PROTO_ATAPI anywhere in the tree so the not-equal-to test is a no-op. >>> >>> We test here to determine which identify command to send: >>> >>> if (periph->path->device->protocol =3D=3D PROTO_ATA) >>> ata_28bit_cmd(ataio, ATA_ATA_IDENTIFY, 0, 0, 0); >>> else >>> ata_28bit_cmd(ataio, ATA_ATAPI_IDENTIFY, 0, 0, 0= ); >>> >>> and that is working to send the right command. >>> >> >> Yes, but PROTO_ATA !=3D PROTO_ATAPI :-) >> >> Since we never initialize 'periph->path->device->protocol' to >> 'PROTO_ATAPI' in -current: > > But this code appears to: > > case PROBE_RESET: > { > int sign =3D (done_ccb->ataio.res.lba_high << 8) + > done_ccb->ataio.res.lba_mid; > CAM_DEBUG(path, CAM_DEBUG_PROBE, > ("SIGNATURE: %04x\n", sign)); > if (sign =3D=3D 0x0000 && > done_ccb->ccb_h.target_id !=3D 15) { > path->device->protocol =3D PROTO_ATA; > PROBE_SET_ACTION(softc, PROBE_IDENTIFY); > } else if (sign =3D=3D 0x9669 && > done_ccb->ccb_h.target_id =3D=3D 15) { > /* Report SIM that PM is present. */ > bzero(&cts, sizeof(cts)); > xpt_setup_ccb(&cts.ccb_h, path, CAM_PRIORITY_NONE= ); > cts.ccb_h.func_code =3D XPT_SET_TRAN_SETTINGS; > cts.type =3D CTS_TYPE_CURRENT_SETTINGS; > cts.xport_specific.sata.pm_present =3D 1; > cts.xport_specific.sata.valid =3D CTS_SATA_VALID_= PM; > xpt_action((union ccb *)&cts); > path->device->protocol =3D PROTO_SATAPM; > PROBE_SET_ACTION(softc, PROBE_PM_PID); > } else if (sign =3D=3D 0xc33c && > done_ccb->ccb_h.target_id !=3D 15) { > path->device->protocol =3D PROTO_SEMB; > PROBE_SET_ACTION(softc, PROBE_IDENTIFY_SES); > } else if (sign =3D=3D 0xeb14 && > done_ccb->ccb_h.target_id !=3D 15) { > path->device->protocol =3D PROTO_SCSI; > PROBE_SET_ACTION(softc, PROBE_IDENTIFY); > } else { > if (done_ccb->ccb_h.target_id !=3D 15) { > xpt_print(path, > "Unexpected signature 0x%04x\n", sign= ); > } > goto device_fail; > } > > what am I missing? > In the snippet above 'protocol' is set to one of PROTO_ATA, PROTO_SATAPM, PROTO_SEMB or PROTO_SCSI - none of which is PROTO_ATAPI :-) $ find sys -type f -exec grep -nH -w PROTO_ATAPI {} \; sys/cam/scsi/scsi_pass.c:355: if (cgd->protocol =3D=3D PROTO_SCSI || cgd->protocol =3D=3D PROTO_ATAPI) sys/cam/cam_ccb.h:249: PROTO_ATAPI, /* AT Attachment Packetized Interface *= / best Neel >> if (protocol !=3D PROTO_ATAPI) equates to if (1) >> if (protocol =3D=3D PROTO_ATAPI) equates to if (0) >> >> I was trying to say that any code that compares 'protocol' to >> PROTO_ATAPI probably deserves a second look (e.g., the original patch >> that triggered this panic). > > Yes, but I think you=E2=80=99re analysis was incorrect on this point :) > >>>> 2. It seems not right to break out of switch in 'probestart()' without >>>> providing a way for 'probedone()' to be called. I believe that this >>>> stops the state machine from making forward progress and results in >>>> 'xpt_config()' not completing. >>> >>> That=E2=80=99s a problem, you=E2=80=99re right. Let me rework. >>> >>>> If you need more information to debug this some more or test a proper >>>> fix then I am happy to help. >>> >>> Please try the one included here. I think it will address things. I=E2= =80=99ve tried it on one system, and am trying it on others in parallel to = sending this. >>> >> >> Yup, works fine. Thanks for the quick fix! > > Will push it in. Thanks. > > Warner