Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 2 May 2013 18:09:29 +0100
From:      "Steven Hartland" <killing@multiplay.co.uk>
To:        "Kenneth D. Merry" <ken@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, trasz@freebsd.org
Subject:   Re: svn commit: r249939 - head/sys/cam/scsi
Message-ID:  <8323FBAA5BD84C7D948801DB8E15D12D@multiplay.co.uk>
References:  <201304261617.r3QGH58Q048395@svn.freebsd.org> <20130429225641.GA1375@nargothrond.kdm.org> <F701AE7B956C498AB89649768B748D81@multiplay.co.uk> <8C38141DA56442218C71EDCBA9790CF8@multiplay.co.uk> <20130501035832.GB79864@nargothrond.kdm.org>

next in thread | previous in thread | raw e-mail | index | archive | help

----- Original Message ----- 
From: "Kenneth D. Merry" <ken@freebsd.org>

> On Tue, Apr 30, 2013 at 04:16:49 +0100, Steven Hartland wrote:
>> ----- Original Message ----- 
>> From: "Steven Hartland"
>> >>On Fri, Apr 26, 2013 at 16:17:05 +0000, Steven Hartland wrote:
>> >>>Author: smh
>> >>>Date: Fri Apr 26 16:17:04 2013
>> >>>New Revision: 249939
>> >>>URL: http://svnweb.freebsd.org/changeset/base/249939
>> >>>
>> >>>Log:
>> >>>  Added available delete methods discovery during device probe, 
>> >>>  including the
>> >>>  maximum sizes for said methods, which are used when processing 
>> >>>  BIO_DELETE
>> >>>  requests. This includes updating UNMAP support discovery to be based on
>> >>>  SBC-3 T10/1799-D Revision 31 specification.
>> >>>  Added ATA TRIM support to cam scsi devices via ATA Pass-Through(16)
>> >>>  sys/cam/scsi/scsi_da.c:
>> >>>          - Added ATA Data Set Management TRIM support via ATA 
>> >>>          Pass-Through(16)
>> >>>            as a delete_method
>> >>>
>> >>
>> >>This adds a lot of unnecessary verbosity for devices that don't support 
>> >>ATA
>> >>passthrough.  For example:
>> >>
>> >>(da7:iscsi4:0:0:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0a 00 00 02 
>> >>00 00 00 00 00 00 00 40 ec 00
>> >>(da7:iscsi4:0:0:0): CAM status: SCSI Status Error
>> >>(da7:iscsi4:0:0:0): SCSI status: Check Condition
>> >>(da7:iscsi4:0:0:0): Retrying command (per sense data)
>> >>(2:2:0:0): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0a 00 00 02 00 00 00 
>> >>00 00 00 00 40 ec 00
>> >>(2:2:0:0): Tag: 0x00f6, Type: 1
>> >>(2:2:0:0): CTL Status: SCSI Error
>> >>(2:2:0:0): SCSI Status: Check Condition
>> >>(2:2:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command 
>> >>operation code)
>> >>(2:2:0:0): Command byte 0 is invalid
>> >>
>> >>(da8:iscsi4:0:0:1): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0a 00 00 02 
>> >>00 00 00 00 00 00 00 40 ec 00
>> >>(da8:iscsi4:0:0:1): CAM status: SCSI Status Error
>> >>(da8:iscsi4:0:0:1): SCSI status: Check Condition
>> >>(da8:iscsi4:0:0:1): Retrying command (per sense data)
>> >>(da8:iscsi4:0:0:1): ATA COMMAND PASS THROUGH(16). CDB: 85 08 0a 00 00 02 
>> >>00 00 00 00 00 00 00 40 ec 00
>> >>(da8:iscsi4:0:0:1): CAM status: SCSI Status Error
>> >>(da8:iscsi4:0:0:1): SCSI status: Check Condition
>> >>(da8:iscsi4:0:0:1): Error 5, Retries exhausted
>> >>
>> >>That is with CTL and and trasz's new iSCSI initiator, but you should see 
>> >>it
>> >>with any CTL configuration.  (And probably with any controller or device
>> >>that doesn't support ATA passthrough.)
>> >>
>> >>So, please:
>> >>- Check for the presence of VPD page 0x89 before sending an ATA
>> >>  passthrough command.  The spec (sat3r03 in this case) says that
>> >>  it "shall" be implemented, so I think we can count on that.
>> >>- If the target returns an illegal request sense key, don't retry
>> >>  again.  The target will keep returning illegal request
>> >
>> >Thanks for the report Ken I'll check this on a card I know doesn't support
>> >pass-through.
>> 
>> I checked with an areca controller, which I know doesn't support pass16, and
>> all is quiet during boot / probe.
>> 
>> I can provoke output at the app layer with camcontrol identify da0, which is
>> to be expected, but nothing in /var/log/messages.
>> 
>> daerror handler in scsi_da.c definitely has SF_QUIET_IR so Illegal Request
>> should never be output.
> 
> You're correct, it should not print out any messages.
> 
>> I can't provoke this using standard CAM devices even when they don't support
>> ATA Pass-Through (16) so I think this is actually an issue with CTL:
>> 1. Being verbose when it shouldn't (ctl_process_done - line 12571?)
> 
> That part at least is by design -- it prints out rate-limited messages so
> you know when the initiator is sending commands CTL doesn't support.  It
> comes in handy for figuring out initiator problems sometimes.
> 
>> 2. Retrying Illegal Request commands when it shouldn't (I didn't experince
>> this on r250032)
> 
> CTL doesn't do any retries, it leaves that up to the initiator.
> 
> I finally tracked down the problem.  The issue is that the iSCSI initiator
> wasn't setting the CAM_AUTOSNS_VALID flag in the CCB status field.  As a
> result, only the CAM status and SCSI status fields were being printed.
> SF_QUIET_IR
> 
>> The attached patch implements a check for ATA Information VPD before
>> using ATA Pass-Through (16). It's had limited testing but I have
>> confirmed it eliminates the use of pass16 under CTL and still enables
>> ATA TRIM under on mpt for SATA disks, so looks promising.
>> 
>> If you could test and let me know if it works for you Ken that would
>> be appreciated.
> 
> Yes, it does work and eliminates the error message, thanks!  I think it is
> also a better way to go, just in case we run into a target that has an
> issue with the ATA passthrough CDB.

A slightly modified version of this is now in the current r250181
http://svnweb.freebsd.org/changeset/base/250181

>> The test case I used for CTL was:-
>> kldload ctl
>> ctladm create -b ramdisk -s 10485760
>> ctladm port -o on
>> 
>> As a side note while test "kldload ctl" caused kernel panic the once.
> 
> What was the panic message?

Unread portion of the kernel message buffer:
ctl: CAM Target Layer loaded

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff8029c860
stack pointer           = 0x28:0xffffff823b8dc840
frame pointer           = 0x28:0xffffff823b8dc910
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 4 (xpt_thrd)

> This doesn't look like a CTL-specific problem (not on the surface), but
> rather a more general CAM problem:
> 
>> #8  0xffffffff8073e1f2 in calltrap () at 
>> /usr/home/smh/freebsd/base/head/sys/amd64/amd64/exception.S:228
>> #9  0xffffffff8029c860 in cam_periph_alloc (periph_ctor=0xffffffff802af410 
>> <proberegister>, periph_oninvalidate=0xfffffe0019cfa200, periph_dtor=<value 
>> optimized out>, periph_start=0xfffffe0015980a90, name=<value optimized 
>> out>, type=2159638184,
>>    path=0xfffffe0015ad79a0, ac_callback=<value optimized out>, code=<value 
>>    optimized out>) at /usr/home/smh/freebsd/base/head/sys/cam/cam_periph.c:227
>> #10 0xffffffff802aed6b in scsi_scan_lun (request_ccb=<value optimized out>) 
>> at /usr/home/smh/freebsd/base/head/sys/cam/scsi/scsi_xpt.c:2266
>> #11 0xffffffff802b2859 in scsi_scan_bus (periph=0x0, 
>> request_ccb=0xfffffe00234df000) at 
>> /usr/home/smh/freebsd/base/head/sys/cam/scsi/scsi_xpt.c:1969
>> #12 0xffffffff802a66c5 in xpt_scanner_thread (dummy=<value optimized out>) 
>> at /usr/home/smh/freebsd/base/head/sys/cam/cam_xpt.c:2411
>> #13 0xffffffff8055cde4 in fork_exit (callout=0xffffffff802a6650 
>> <xpt_scanner_thread>, arg=0x0, frame=0xffffff823b8dcc00) at 
>> /usr/home/smh/freebsd/base/head/sys/kern/kern_fork.c:991
>> #14 0xffffffff8073e72e in fork_trampoline () at 
>> /usr/home/smh/freebsd/base/head/sys/amd64/amd64/exception.S:602
>> #15 0x0000000000000000 in ?? ()
> 
> If this is on a stock FreeBSD/head, it looks like it failed on this line in
> cam_periph_alloc():

Apart from those patches yes stock FreeBSD/head.

>        cur_periph = TAILQ_FIRST(&(*p_drv)->units);
> 
> Perhaps there's an issue with the ctl peripheral driver getting added to
> the list of peripheral drivers (via periphdriver_register()).  The topology
> lock isn't held in periphdriver_register(), so it's possible that the
> thread was in between the peripheral driver list traversal and the line
> above when the peripheral was added.
> 
> It's more likely that the scan request was generated as a result of the CTL
> ports going online.  And that means that the peripheral driver registration
> should have long since been done.
> 
> If you can reproduce the problem, you can probably figure out where the
> issue is.

I've not been able to reproduce I'm afraid, seems you cant kldunload ctl
so needs a reboot between loads, tried 3 times but nothing.

I did however notice the following after turning ports off, which may
indicated a reference leak:-

May  2 17:06:25 head kernel: (da2:ctl2cam0:0:1:0): lost device - 0 outstanding, 2 refs
May  2 17:06:25 head kernel: (pass6:ctl2cam0:0:1:0): lost device
May  2 17:06:25 head kernel: (pass6:ctl2cam0:0:1:0): removing device entry
May  2 17:06:25 head kernel: (da2:ctl2cam0:0:1:0): removing device entry

    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. 

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?8323FBAA5BD84C7D948801DB8E15D12D>