Date: Tue, 22 Apr 2008 09:49:22 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Dominic Fandrey <kamikaze@bsdforen.de>
Cc: freebsd-bugs@freebsd.org, gavin@freebsd.org
Subject: Re: kern/122961: write operation on msdosfs file system causes panic
Message-ID: <20080422084732.H63563@delplex.bde.org>
In-Reply-To: <480CC6F4.1000200@bsdforen.de>
References: <200804211445.m3LEjNh6018941@freefall.freebsd.org> <480CC6F4.1000200@bsdforen.de>

On Mon, 21 Apr 2008, Dominic Fandrey wrote:

> gavin@FreeBSD.org wrote:
>> To submitter: are you able to connect the USB stick to a machine
>> running Windows and run chkdsk, to confirm that the filesystem
>> is not invalid?  (Although we should ideally be resilient to
>> corrupt filesystems, if it still panics after a chkdsk then it's
>> a more serious problem...)
>>
>
> I have already checked the stick under windows. Chkdsk did not find any
> problems, but the panic still occurs.
>
> The problem started after I updated RELENG_7 on my machine this weekend. The
> previous RELENG_7 build was ~2 months old.

This seems to be a bug in usb (umass) or the particular usb drive.
msdosfs now uses the drive's advertised max i/o size (mp->mnt_iosize_max)
to implement vfs clustering, but mnt_iosize_max seems to be broken for
some drives.  This is only a theory because bug reporters never respond
to requests for more info.

Note that there are lots of bugs in the initialization of
mp->mnt_iosize_max.  It is always MAXPHYS (128K), but few drives support
this.  GEOM bogusly splits up large i/o's into units that the drive
claims to support (d_maxsize).  d_maxsize is bogusly initialized to the
fixed value of DFLTPHYS (64K) in many drivers including da.  Bad things
then happen if a scsi drive doesn't actually support d_maxsize = 64K.

To check that this is the bug, mount msdosfs with -o noclusterr,noclusterw
under RELENG_7 or later (the bug also affects RELENG_6, but these mount
options are broken in RELENG_6).  Then write and read some files, using
write() and not mmap().  (Use dd, or cp a file larger than 8M.  cp always
uses mmap() for files smaller than 8M (a good pessimization if the file
is not in the buffer cache), and the nocluster* mount options don't
affect mmap() for any file system (another bug), and there is no option
to prevent cp using mmap().)  Then remount without nocluster* and repeat.
The bug should only affect the repeat.
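
The check described above could be scripted roughly as follows. This is only a sketch, not something from the original message: the helper name rw_check, the scratch file names, and the device /dev/da0s1 in the commented usage are all assumptions for illustration. It writes a 16 MiB file with dd (plain write() calls, above the 8M threshold at which cp stops using mmap()) and reads it back through dd, so that the read side also avoids mmap() even where cmp would map regular files. Note that a read-back right after the write may be served from the buffer cache; unmounting and remounting between the two steps would exercise the device reads more reliably.

```shell
#!/bin/sh
# Sketch only: rw_check and its file names are hypothetical helpers.

# rw_check DIR: write a 16 MiB file into DIR with dd, read it back with
# dd, and compare against the source data.  Prints "OK: DIR" or
# "MISMATCH: DIR" and returns 0, 1, or 2 (setup or i/o failure).
rw_check() {
    dir=$1
    src=$(mktemp /tmp/rwcheck.XXXXXX) || return 2
    dst="$dir/rwcheck.bin"

    # 256 blocks of 64 KiB = 16 MiB of random data as the reference copy.
    dd if=/dev/urandom of="$src" bs=65536 count=256 2>/dev/null ||
        { rm -f "$src"; return 2; }

    # Write to the file system under test using write(), not mmap().
    dd if="$src" of="$dst" bs=65536 2>/dev/null ||
        { rm -f "$src" "$dst"; return 2; }

    # Read back through a pipe so the target file is read with read().
    if dd if="$dst" bs=65536 2>/dev/null | cmp -s - "$src"; then
        echo "OK: $dir"; rc=0
    else
        echo "MISMATCH: $dir"; rc=1
    fi
    rm -f "$src" "$dst"
    return $rc
}

# Intended use, as root (device name and mount point are assumed):
#   mount -t msdosfs -o noclusterr,noclusterw /dev/da0s1 /mnt
#   rw_check /mnt
#   umount /mnt
#   mount -t msdosfs /dev/da0s1 /mnt
#   rw_check /mnt
```

If the theory holds, the first run should succeed and only the second run (with clustering enabled) should misbehave. Since the reported failure is a panic rather than silent corruption, the script's own output matters less than which of the two runs the machine survives.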

> # mount
> /dev/ufs/2root on / (ufs, local)
> devfs on /dev (devfs, local)
> /dev/ufs/2tmp on /tmp (ufs, local, soft-updates)
> /dev/ufs/2usr on /usr (ufs, NFS exported, local, soft-updates)
> /dev/ufs/2var on /var (ufs, local, soft-updates)
> pid874@mobileKamikaze:/var/run/automounter.amd.mnt on
> /var/run/automounter.amd.mnt (nfs)
> /dev/msdosfs/APRIL RYAN on
> /var/run/automounter.mnt/msdosfs/bb8a40b99a061c33a35f4e7275d1842a (msdosfs,
> local, noatime, noexec)

The labels obfuscate the device type for all mountpoints very well.

Your backtrace showed a panic in mmap().  mmap() actually uses the support
for vfs clustering (VOP_BMAP()), not vfs clustering itself, to determine
the size of the largest contiguous i/o that is possible.  It's possible
that the bug only affects mmap(), but I doubt it.

Bruce