From owner-freebsd-usb@FreeBSD.ORG  Mon Jan 24 11:27:31 2011
Return-Path: <owner-freebsd-usb@FreeBSD.ORG>
Delivered-To: freebsd-usb@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B206E10656A4;
	Mon, 24 Jan 2011 11:27:31 +0000 (UTC)
	(envelope-from hselasky@c2i.net)
Received: from swip.net (mailfe04.c2i.net [212.247.154.98])
	by mx1.freebsd.org (Postfix) with ESMTP id 0BCC68FC22;
	Mon, 24 Jan 2011 11:27:30 +0000 (UTC)
Received: from [188.126.198.129] (account mc467741@c2i.net HELO
	laptop002.hselasky.homeunix.org)
	by mailfe04.swip.net (CommuniGate Pro SMTP 5.2.19)
	with ESMTPA id 77615632; Mon, 24 Jan 2011 12:27:29 +0100
From: Hans Petter Selasky <hselasky@c2i.net>
To: CDP <dr.clau@gmail.com>
Date: Mon, 24 Jan 2011 12:27:36 +0100
User-Agent: KMail/1.13.5 (FreeBSD/8.2-PRERELEASE; KDE/4.4.5; amd64; ; )
References: <4D3CAE4E.2040407@gmail.com> <201101241034.07591.hselasky@c2i.net>
	<4D3D5DBF.3080600@gmail.com>
In-Reply-To: <4D3D5DBF.3080600@gmail.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
Message-Id: <201101241227.36923.hselasky@c2i.net>
Cc: mav@freebsd.org, freebsd-usb@freebsd.org
Subject: Re: System lockups caused by USB external HDD
X-BeenThere: freebsd-usb@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD support for USB <freebsd-usb.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-usb>,
	<mailto:freebsd-usb-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-usb>
List-Post: <mailto:freebsd-usb@freebsd.org>
List-Help: <mailto:freebsd-usb-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-usb>,
	<mailto:freebsd-usb-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 24 Jan 2011 11:27:31 -0000

On Monday 24 January 2011 12:08:47 CDP wrote:
> On 01/24/11 11:34, Hans Petter Selasky wrote:
> > On Monday 24 January 2011 10:00:53 CDP wrote:
> >> On 01/24/11 01:56, Daniel O'Connor wrote:
> >>> On 24/01/2011, at 9:10, CDP wrote:
> >>>> g_vfs_done():da0s2[WRITE(offset=xxxxxxxxxxxx, length=16384)]error = 5
> >>>> [several more lines similar to the above]
> >>>> panic: softdep_move_dependencies: need merge code
> >>>> cpuid = 0
> >>>> KDB: stack backtrace:
> >>>> #0 0x... at kdb_backtrace+0x5e
> >>>> #1 0x... at panic+0x182
> >>> 
> >>> It looks like the disk is dying, or the FS is corrupt (the former might
> >>> cause the later).
> >>> 
> >>> Can you run smartctl on the disk? Unfortunately a lot of enclosures
> >>> reject SMART commands so you might not be able to :(
> >> 
> >> I have attached the output of smartctl -d sat -a /dev/da0. I didn't yet
> >> run a SMART long test for the simple reason that the disk is going into
> >> sleep mode and interrupts it. Haven't bothered to keep it alive for a
> >> long test but I might just do that.
> >> 
> >> Although, I doubt it's a disk failure, since I do backups on it without
> >> problems by using FreeBSD 7.3, on the same space where FreeBSD 8.x
> >> fails. And I am talking about over 150GB of data in one run, while
> >> 8.2-RC2 crashes after 5-10GB. I have experienced disk failure in the
> >> past, on SATA, and a few read/write errors never caused a system lockup.
> >> 
> >> My feeling is that enough traffic on USB causes the problem, and that
> >> this problem is only present in the new USB stack.
> >> Unfortunately downgrading to 7.x is not an option because there are
> >> things that won't work on this notebook.
> > 
> > If you run a simple test like this:
> > 
> > dd if=/dev/da0 of=/dev/null bs=65536
> > dd if=/dev/da0 of=/dev/null bs=16384
> > 
> > Do you then see any errors?
> > 
> > Do you have a spare USB memory stick which you could run similar write
> > tests on?
> 
> Both reads fail with I/O error, while writes to an unused partition seem
> to be fine (I interrupted the writes after a while):
> 
> % dd if=/dev/da0 of=/dev/null bs=65536
> dd: /dev/da0: Input/output error
> 191732+0 records in
> 191732+0 records out
> 12565348352 bytes transferred in 429.999272 secs (29221790 bytes/sec)
> 
> % dd if=/dev/da0 of=/dev/null bs=16384
> dd: /dev/da0: Input/output error
> 126427+0 records in
> 126427+0 records out
> 2071379968 bytes transferred in 169.431766 secs (12225452 bytes/sec)
> 
> # dd if=/dev/random of=/dev/da0s3 bs=65536
> ^C329378+0 records in
> 329377+0 records out
> 21586051072 bytes transferred in 1003.020293 secs (21521051 bytes/sec)
> 
> # dd if=/dev/random of=/dev/da0s3 bs=16384
> ^C679571+0 records in
> 679571+0 records out
> 11134091264 bytes transferred in 690.135793 secs (16133189 bytes/sec)
> 
> This is what I get in /var/log/messages when the I/O error occurs:
> (da0:umass-sim0:0:0:0): AutoSense failed
> 
> However, I experience no lockup. Maybe this situation is not handled
> correctly at another level ?

I haven't looked much into the CAM or GEOM code, so I won't say too much
about that, but I don't believe USB/umass is to blame. What you could do is
add a conditional error printout to "umass_t_bbb_status_callback()" in
/sys/dev/usb/storage/umass.c that fires when the error happens. If the error
is not a USB transport error, then we are most likely seeing a SCSI issue in
the layers above umass. If you have access to a USB analyser, use that
instead. There is now also the option to trace USB from the kernel itself,
but that feature is still in early development.
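
To make the idea concrete, here is a rough userland sketch of the decision
such a printout would encode. It is not the code in umass.c: the enum
xfer_error values and the function names classify_status() and
log_if_error() are made up for illustration and only stand in for the
kernel's real error type. The three status values are the bCSWStatus codes
from the USB Mass Storage Bulk-Only Transport specification. Treating a
cancelled transfer as "not a fault" is my assumption.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's USB transfer error code. */
enum xfer_error {
	XFER_OK = 0,
	XFER_CANCELLED,
	XFER_STALLED,
	XFER_TIMEOUT,
	XFER_IOERROR
};

/* bCSWStatus values from the Bulk-Only Transport specification. */
enum {
	CSW_PASSED = 0x00,
	CSW_FAILED = 0x01,
	CSW_PHASE_ERROR = 0x02
};

/* Which layer the failure most likely belongs to (illustrative only). */
enum blame {
	BLAME_NONE,
	BLAME_USB_TRANSPORT,
	BLAME_SCSI_LAYER,
	BLAME_PROTOCOL
};

static enum blame
classify_status(enum xfer_error err, uint8_t csw_status)
{
	/* Assumption: a cancelled transfer is normal teardown, not a fault. */
	if (err == XFER_CANCELLED)
		return (BLAME_NONE);
	/* Any other transfer error means the status phase itself failed. */
	if (err != XFER_OK)
		return (BLAME_USB_TRANSPORT);

	/* The transfer worked, so look at what the device reported. */
	switch (csw_status) {
	case CSW_PASSED:
		return (BLAME_NONE);
	case CSW_FAILED:
		return (BLAME_SCSI_LAYER);
	default:
		return (BLAME_PROTOCOL);
	}
}

static const char *
blame_str(enum blame b)
{
	switch (b) {
	case BLAME_USB_TRANSPORT:
		return ("USB transport error");
	case BLAME_SCSI_LAYER:
		return ("command failed, look at SCSI/CAM");
	case BLAME_PROTOCOL:
		return ("phase error or invalid status");
	default:
		return ("no error");
	}
}

/* Conditional printout: stays silent on success, returns 1 if it logged. */
static int
log_if_error(FILE *out, enum xfer_error err, uint8_t csw_status)
{
	enum blame b = classify_status(err, csw_status);

	if (b == BLAME_NONE)
		return (0);
	fprintf(out, "umass status: xfer_error=%d csw_status=0x%02x: %s\n",
	    (int)err, (unsigned)csw_status, blame_str(b));
	return (1);
}
```

In the real driver the same test would sit in the status callback and use
the kernel's own error value and the status field of the received CSW. The
point is only that a line which prints on failure and says which of the two
cases was hit should be enough to tell a transport problem from a command
failure reported by the device.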

--HPS