Date: Wed, 31 Aug 2005 19:22:31 +0000
From: Ben Kaduk <minimarmot@gmail.com>
To: Scott Long <scottl@samsco.org>
Cc: freebsd-current@freebsd.org, Kyle Brooks <captinsmock@columbus.rr.com>
Subject: Re: panic after removing usb flash drive
Message-ID: <47d0403c05083112223e874ea8@mail.gmail.com>
In-Reply-To: <4315CEEC.80100@samsco.org>
References: <1125452228.740.3.camel@arbitor.homelinux.com> <47d0403c05083020044f6ac0be@mail.gmail.com> <4315CEEC.80100@samsco.org>

On 8/31/05, Scott Long <scottl@samsco.org> wrote:
>
> Ben Kaduk wrote:
> > On 8/31/05, Kyle Brooks <captinsmock@columbus.rr.com> wrote:
> >
> >>umass0: LEXAR MEDIA JUMPDRIVE2, rev 2.00/1.25, addr 2
> >>umass0: at uhub4 port 6 (addr 2) disconnected
> >>panic: vm_fault: fault on nofault entry, addr: deadc000
> >>
> >>kernel:
> >>
> >>FreeBSD 7.0-CURRENT #2: Mon Aug 29 00:39:21 UTC 2005
> >>
> >>problem:
> >>
> >>kernel panics when usb flash drive is removed
> >>
> >>backtrace:
> >>
> >>#0 doadump () at pcpu.h:165
> >>#1 0xc068610e in boot (howto=260)
> >>at /usr/src/sys/kern/kern_shutdown.c:397
> >>#2 0xc0685b92 in panic (
> >>fmt=0xc090e46c "vm_fault: fault on nofault entry, addr: %lx")
> >>at /usr/src/sys/kern/kern_shutdown.c:553
> >>#3 0xc0812de1 in vm_fault (map=0xc1060000, vaddr=3735928832,
> >>fault_type=2 '\002', fault_flags=0)
> >>at /usr/src/sys/vm/vm_fault.c:884
> >>#4 0xc0888807 in trap_pfault (frame=0xe6a06bf0, usermode=0,
> >>eva=3735929110)
> >>at /usr/src/sys/i386/i386/trap.c:741
> >>#5 0xc0888d04 in trap (frame=
> >>{tf_fs = 8, tf_es = -1063649240, tf_ds = 40, tf_edi = -993875968,
> >>tf_esi = -1014223872, tf_ebp = -425694000, tf_isp = -425694180, tf_ebx =
> >>-1063640044, tf_edx = -993875900, tf_ecx = 0, tf_eax = -559038242,
> >>tf_trapno = 12, tf_err = 2, tf_eip = -1069194040, tf_cs = 32, tf_eflags
> >>= 66050, tf_esp = -1063640032, tf_ss = 0})
> >>at /usr/src/sys/i386/i386/trap.c:442
> >>#6 0xc08745ba in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> >>#7 0x00000008 in ?? ()
> >>#8 0xc09a0028 in atdma_acpi_driver_mod ()
> >>#9 0x00000028 in ?? ()
> >>#10 0xc4c2a800 in ?? ()
> >>#11 0xc38c2c00 in ?? ()
> >>#12 0xe6a06cd0 in ?? ()
> >>#13 0xe6a06c1c in ?? ()
> >>---Type <return> to continue, or q <return> to quit---
> >>#14 0xc09a2414 in xsoftc ()
> >>#15 0xc4c2a844 in ?? ()
> >>#16 0x00000000 in ?? ()
> >>#17 0xdeadc0de in ?? ()
> >>#18 0x0000000c in ?? ()
> >>#19 0x00000002 in ?? ()
> >>#20 0xc04564c8 in camisr (V_queue=0xc09a2414)
> >>at /usr/src/sys/cam/cam_xpt.c:7066
> >>#21 0xc066f84e in ithread_loop (arg=0xc356fa80)
> >>at /usr/src/sys/kern/kern_intr.c:545
> >>#22 0xc066e808 in fork_exit (callout=0xc066f665 <ithread_loop>, arg=0x0,
> >>frame=0x0) at /usr/src/sys/kern/kern_fork.c:789
> >>#23 0xc087461c in fork_trampoline ()
> >>at /usr/src/sys/i386/i386/exception.s:208
> >>
> >
> > This is the expected behaviour
>
> Panics are not acceptable or expected behaviour in any situation, btw.
>
> > if you didn't unmount the filesystem on the
> > thumbdrive before removing it. There was some discussion on this a while ago
> > (but I don't seem to be able to find the exact posts), but the general idea
> > is that the kernel has no idea in what state the actual physical medium
> > (disc) is/was in after being pulled, and may have some stale buffers holding
> > data that got written to disk. It doesn't know what to do with this data, or
> > how to treat requests to that device, so it panics.
> >
>
> I probably missed the earlier discussion that you are referring to, but
> what you are saying here actually isn't true. There are a number of
> problems:

Sorry to be giving out bad information -- I really should have found the old
discussions I remember before posting.

> 1) When the thumbdrive gets pulled, the umass driver gets told to
> detach. It tries to detach itself from CAM, but things don't get torn
> down correctly because there is an open reference to the target in CAM
> (because there is a mounted filesystem on the device). umass trundles
> along anyways and goes away, leaving lots of dangling pointers in CAM
> that blow up on the next attempted I/O access.
>
> Part of the problem here is that the umass driver is architected wrong.
> It creates a SIM, bus, and target instance for every umass device that
> gets inserted.
> When the device gets pulled, it tries to tear down
> each of those instances all at once. CAM simply wasn't designed for
> this. It was designed for the SIMs and buses to be long-lived objects
> where only the targets (and luns) come and go. Making umass fit this
> model would involve turning it into two logical drivers. One would be
> a SIM that would attach to the root hub instance of each USB controller
> and would treat the USB bus as a CAM bus. The other would be a target
> driver that gets created and destroyed on a per-device basis as those
> devices come and go. When a umass device gets plugged in, the USB
> framework would tell the appropriate SIM to create a target instance.
> When the device gets pulled, the framework would tell the SIM to detach
> and destroy the target. No dangling pointers would be left behind by
> the SIM going away. I have some prototype work in progress on this.
>
> 2) Some filesystems, UFS in particular, assume that an I/O will never
> fail. Instead of checking the error status of the buf on completion,
> they just continue on and assume that everything is fine. If the
> VM is trying to page in a vnode, for example, it'll think that
> the operation succeeded, and then really bad things will happen. I'm
> not sure if the same problem exists in MSDOSFS because I don't have
> any DOS filesystems except on USB, and the problem with umass stands
> in the way of further testing. In lieu of fixing umass, I might have to
> create a synthetic md device to hold a msdos filesystem so that I can
> test how it behaves.
>
> 3) It's unknown if the VM system knows how to rationally deal with
> failed I/O or how to propagate that kind of failure to the rest of the
> kernel and/or applications. What happens if you mmap a file, and then
> the device holding the file goes away? How do you let the application
> know that its mmap is now invalid? Send it a Sig11, maybe? How should
> the vnode pager deal with failure? There are lots of interesting
> problems here.
>
> In any case, the panic posted in the grandparent message implicates CAM
> and umass, which is what I would expect. There may be more layers of
> problems underneath it.
>
> Scott
>

Thanks for the in-depth explanation. I will search the archives tonight to
find the old discussion and see where I was misreading things.

Ben Kaduk
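
On the question in point 3 of how an application could learn that its mmap is no longer valid: POSIX already defines one closely related case. When a mapped file is truncated, touching a page that now lies past end-of-file is specified to raise SIGBUS rather than SIGSEGV (signal 11). The following is only a minimal userland sketch of that case, not of device removal and not of anything in the FreeBSD VM code. Truncation stands in for "the backing store went away", and the helper name signal_after_backing_vanishes is made up for illustration.

```python
import mmap
import signal
import subprocess
import sys
import tempfile
import textwrap

# Child program (run in a separate interpreter so the parent survives):
# map one page of a file, shrink the file to zero bytes, then touch the
# mapping. The touched page no longer has any backing store.
CHILD = textwrap.dedent("""
    import mmap, os, sys
    fd = os.open(sys.argv[1], os.O_RDWR)
    m = mmap.mmap(fd, mmap.PAGESIZE)   # shared read/write mapping
    os.ftruncate(fd, 0)                # stand-in for the backing store vanishing
    m[0]                               # access a page past end-of-file
    print("survived")                  # not expected to be reached
""")


def signal_after_backing_vanishes():
    """Run the child; return the signal number that killed it, or 0 if none."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"\0" * mmap.PAGESIZE)
        f.flush()
        proc = subprocess.run(
            [sys.executable, "-c", CHILD, f.name],
            capture_output=True,
        )
    # subprocess reports death-by-signal as a negative return code.
    return -proc.returncode if proc.returncode < 0 else 0


if __name__ == "__main__":
    signo = signal_after_backing_vanishes()
    print(signal.Signals(signo).name if signo else "no signal")
```

The access runs in a child process because the fault cannot be handled safely from a Python-level signal handler; the parent only inspects how the child died. Whether a vnode pager I/O failure after device removal should be reported the same way is exactly the open question raised above.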