Date:      Wed, 20 Feb 2013 17:43:52 +0200
From:      Alexander Motin <mav@FreeBSD.org>
To:        Joel Dahl <joel@freebsd.org>
Cc:        freebsd-current@freebsd.org, Hans Petter Selasky <hselasky@c2i.net>
Subject:   Re: HEAD memsticks broken? [USB/CAM Problems?]
Message-ID:  <5124EF38.7080302@FreeBSD.org>
In-Reply-To: <20130216100719.GB47553@jd.benders.se>
References:  <20130209073241.GN21730@jd.benders.se> <20130209230939.GQ21730@jd.benders.se> <20130211222105.GC838@jd.benders.se> <201302120851.18810.hselasky@c2i.net> <20130214193707.GD84888@jd.benders.se> <20130216100719.GB47553@jd.benders.se>

On 16.02.2013 12:07, Joel Dahl wrote:
> On 14-02-2013 20:37, Joel Dahl wrote:
>> On 12-02-2013  8:51, Hans Petter Selasky wrote:
>>> On Monday 11 February 2013 23:21:05 Joel Dahl wrote:
>>>> On 10-02-2013  0:09, Joel Dahl wrote:
>>>>> On 09-02-2013 20:28, Alexander Motin wrote:
>>>>>> How long ago that HEAD was built? Could you get full dmesg? I don't
>>>>>> think that "PREVENT ALLOW MEDIUM REMOVAL" should cause device drop. "No
>>>>>> sense data present" also doesn't look right.
>>>>>
>>>>> As I mentioned earlier, I've tried several HEAD snapshots.
>>>>
>>>> Just a quick update on this: I've built quite a few releases now and
>>>> managed to track down the problem to somewhere between r235789 and
>>>> r237855. It'll probably take me another day or two before I know which
>>>> commit actually broke it.
>>>
>>> Hi,
>>>
>>> I don't see any relevant USB+UMASS patches for your issue in this interval, 
>>> but many patches in the SCSI/CAM area.
>>
>> I finally found it. A r237477 memstick boots fine. A r237478 memstick does not.
>>
>> 237478 is the following commit by mav@:
>>
>> ------------------------------------------------------------------------
>> r237478 | mav | 2012-06-23 14:32:53 +0200 (Sat, 23 Jun 2012) | 3 lines
>>
>> Add scsi_extract_sense_ccb() -- wrapper around scsi_extract_sense_len().
>> It allows to remove number of duplicate checks from several places.
>>
>> ------------------------------------------------------------------------
> 
> So, mav@ haven't replied yet so I did some more investigation. I collected
> all the USB sticks I had in the office (5 in total, all Kingston but different
> size and models) and tried a memstick installation with each stick. Turns out
> r237478 only breaks memstick installation in combination with certain USB
> sticks:
> 
> # Works:
> 
> da0: <Kingston DataTraveler 2.0 1.00> Removable Direct Access SCSI-2 device
> da0: 40.000MB/s transfers
> da0: 7664MB (15695872 512 byte sectors: 255H 63S/T 977C)
> 
> da0: <Kingston DataTraveler 2.0 PMAP> Removable Direct Access SCSI-0 device
> da0: 40.000MB/s transfers
> da0: 1906MB (3903488 512 byte sectors: 255H 63S/T 242C)
> 
> # Does not work:
> 
> da0: <Kingston DataTraveler G3 1.00> Removable Direct Access SCSI-2 device
> da0: 40.000MB/s transfers
> da0: 15295MB (31324160 512 byte sectors: 255H 63S/T 1949C)
> 
> da0: <Kingston DataTraveler G3 1.00> Removable Direct Access SCSI-0 device
> da0: 40.000MB/s transfers
> da0: 3690MB (7557704 512 byte sectors: 255H 63S/T 470C)
> 
> da0: <Kingston DataTraveler G3 1.00> Removable Direct Access SCSI-2 device
> da0: 40.000MB/s transfers
> da0: 1905MB (3903264 512 byte sectors: 255H 63S/T 242C)
> 
> It seems that only USB sticks labeled as "Kingston DataTraveler G3"
> are affected by r237478 (in my limited testing, at least). This particular
> model is what you get if you buy the cheapest Kingston model on the market
> right now.

I've reviewed that change once more and I see no flaws in it. My only
guess is that it alters something innocent or unrelated in the request
order, which confuses the flash firmware and makes it get stuck and
return errors without SCSI sense information. In the log provided I see
that when the device is first detected, it reports its size normally.
But later, possibly after some command (SYNCHRONIZE CACHE? PREVENT ALLOW
MEDIUM REMOVAL?), it starts to misbehave. A wrong answer to another READ
CAPACITY request then causes the "got CAM status 0xXX" message and the
subsequent device loss.

Unfortunately I can't reproduce the problem. All the USB sticks I have
work fine with a HEAD system. If I could reproduce it, I would log all
commands sent to the stick to find the one after which the problem
begins. Commands can be logged at the CAM layer by running
`camcontrol debug -IPpc all` before plugging the stick in and
`camcontrol debug off` afterwards (you may want to do this in
single-user mode or without syslog running, to avoid logging activity on
other CAM disks), or probably somehow at the umass layer, or with
usbdump at the raw USB layer (in the last case some more knowledge will
be needed to interpret the result).
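
The CAM-layer logging sequence could be wrapped roughly as follows. This
is only a sketch: the `run` and `trace_cam` helpers and the `RUN`
variable are hypothetical names introduced here, and by default the
script only prints the commands instead of executing them. It assumes a
kernel with CAM debugging available and a root shell; the trace itself
should then appear in the kernel message buffer.

```shell
#!/bin/sh
# Sketch: bracket a USB stick attach with CAM debug tracing.
# RUN, run() and trace_cam() are illustrative names, not part of camcontrol.
# RUN=0 (default) only prints the commands; RUN=1 executes them (needs root).
RUN="${RUN:-0}"

run() {
    # Show the command, and execute it only when RUN=1.
    echo "+ $*"
    if [ "$RUN" = "1" ]; then
        "$@"
    fi
}

trace_cam() {
    # Enable info, periph, probe and CDB tracing for all devices.
    run camcontrol debug -IPpc all
    if [ "$RUN" = "1" ]; then
        printf 'Plug the stick in, wait for the failure, then press Enter: '
        read -r _
    fi
    # Switch tracing off again once the failure has been captured.
    run camcontrol debug off
}

trace_cam
```

Run with `RUN=1 sh script.sh` to actually toggle tracing; afterwards the
last commands logged before the first error are the ones worth a closer
look.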

-- 
Alexander Motin


