Date: Fri, 7 Nov 2008 14:01:02 -0800
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: Kevin Oberman <oberman@es.net>
Cc: freebsd-stable@freebsd.org
Subject: Re: Problem with USB drive errors in recent 7-Stable
Message-ID: <20081107220102.GA14260@icarus.home.lan>
In-Reply-To: <20081107212148.1A47245010@ptavv.es.net>
References: <20081107212148.1A47245010@ptavv.es.net>

On Fri, Nov 07, 2008 at 01:21:48PM -0800, Kevin Oberman wrote:
> I recently started getting errors on a fairly new USB connected SATA
> drive. Aside from the errors, the system was locking up as any process
> attempting to access the drive would lock up in disk uninterruptible
> wait ("D" in ps). I could not shut down the system and had to power it
> off. (It's a laptop.) After a reboot, I tried to fsck it and that locked
> up, too. I was able to recover by telling fsck to not fix the truncated
> inode and fix everything else. Then I ran fsck again and it was
> successful in fixing the inode. This happened several times.
>
> I then bought a new drive and got the identical behavior! It was not the
> drive. I rolled my kernel back to 9/13/08 and tried again. This time it just
> worked! No errors or lock up.
>
> I suspect that there are two issues. One results in the lock-up when the
> disk had errors and the other caused the purported disk errors. The
> latter has been introduced since 9/13/08. The kernel that produced the
> errors was from 10/21. I also ran a kernel from 10/8 which did not cause
> me problems, but I'm not sure that I used the USB drive with this
> kernel.
>
> I'll be building a 10/8 kernel later, after I have backed up some data
> from a failing drive (PATA, not USB, and SMART confirms that this
> disk is sick). I will try to track down exactly which change triggered
> this ugly behavior, but that will take a number of kernel builds, so it
> will take a while.
>
> Has anyone else seen this? Any ideas on what changes might be the most
> likely cause? Could be USB, CAM, or something else, I guess.

Funny you should post this today -- I just spent the past few days
dealing with this problem, specifically the kernel being "stuck" when
writing to a umass/da device (in my case, USB flash drives).

When I say "stuck", I mean the kernel was still responsive: Ctrl-T would
report the status of processes (the states shown were all different), but
the processes had essentially hung. Ctrl-Alt-Esc on the console
dropped me to a db> prompt, so it's not as if the machine had
frozen/locked up; it was as if some part of the storage subsystem
was spinning in a loop. IP traffic still worked as well, but of course
anything that accessed disks would hang. Rebooting the box via
Ctrl-Alt-Del wouldn't work, because it would get stuck waiting for a
bunch of PIDs to end.

I switched the box to CURRENT (for a lot of reasons), and one of those
was to try out the new USB4BSD (called "USB2" -- not to be confused with
the USB2.0 protocol) stack. That simply induced a random kernel panic.
However, HPS is fairly certain he found the issue, and it's with
bus_dma(9) interaction. Here's the thread:

http://lists.freebsd.org/pipermail/freebsd-current/2008-November/thread.html#235
http://lists.freebsd.org/pipermail/freebsd-current/2008-November/000220.html

I have not yet tried his patches (I just woke up), but I will in a short
while. So far I have a lot more faith in USB4BSD than I do in the old
stack, simply because there's active work going on in it.

(It's ironic that I encountered this issue while working on a document
describing how to put FreeBSD i386, amd64, and MS-DOS on a USB flash
drive, so one could install FreeBSD from it, or boot MS-DOS for BIOS
upgrades.)

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |