Date: Mon, 17 Nov 2003 17:11:40 -0500 (EST)
From: Robert Watson
To: Eric Anholt
cc: Greg Lehey
cc: freebsd-current@FreeBSD.ORG
cc: chancedj@yahoo.com
In-Reply-To: <1042837289.668.10.camel@leguin>
Subject: Re: vinum error: statfs related?

On Fri, 17 Jan 2003, Eric Anholt wrote:

> I'm getting the same (no drives/subdisks/plexes/volumes found) trying to
> upgrade from a Nov 11 kernel/userland to Nov 16th kernel.  I tried
> seeing if using a Nov 16th vinum binary would load them, but after doing
> a stop/start, the system paniced, and it seems my swap is too small to
> dump on.  Kernel was built using configure MYKERNEL; cd
> ../compile/MYKERNEL; make depend all install instead of buildkernel.  DDB
> enabled but no invariants/witness, not sure what else from my config
> might be applicable.

I'm able to trigger this warning simply by starting and stopping Vinum
without a Vinum configuration:

ttyp0:

  crash2# vinum start
  ** no drives found: No such file or directory
  crash2# vinum stop
  vinum unloaded

console:

  vinum: loaded
  vinum: no drives found
  vinum: exiting with malloc table inconsistency at 0xc2053c00 from vinumio.c:755
  vinum: unloaded

I attempted to experiment some with Vinum today.  After fixing a bug in
the vinum user tool to stop trying to create device nodes and directories
in devfs, it seemed to come up OK (fix committed).  I documented the bug
that vinum won't work with storage devices with sector sizes other than
DEV_BSIZE (512) in the vinum.8 man page, since I don't have time to fix it
today.  I created a malloc md-backed vinum array with seeming ease, but
was unable to newfs the result:

ttyp0:

  crash2# mdconfig -a -t malloc -s 1m
  md0
  crash2# mdconfig -a -t malloc -s 1m
  md1
  crash2# mdconfig -a -t malloc -s 1m
  md2
  crash2# vinum
  vinum -> concat /dev/md0 /dev/md1 /dev/md2
  vinum -> quit
  crash2# newfs /dev/vinum/vinum0
  /dev/vinum/vinum0: 2.6MB (5348 sectors) block size 16384, fragment size 2048
          using 4 cylinder groups of 0.66MB, 42 blks, 128 inodes.
  super-block backups (for fsck -b #) at:
   160, 1504, 2848, 4192
  cg 0: bad magic number

console:

  vinum: loaded
  vinum: drive vinumdrive0 is up
  vinum: drive vinumdrive1 is up
  vinum: drive vinumdrive2 is up
  vinum: vinum0.p0.s0 is up
  vinum: vinum0.p0.s1 is up
  vinum: vinum0.p0.s2 is up
  vinum: vinum0.p0 is up
  vinum: vinum0 is up

So clearly UFS is unhappy with something about the array.
I tried reading/writing stuff to/from the array with pretty mixed results:

ttyp0:

  crash2# diskinfo /dev/vinum/vinum0
  /dev/vinum/vinum0       512     2738688 5349
  crash2# dd if=/dev/random of=/data.file bs=512 count=5349
  5349+0 records in
  5349+0 records out
  2738688 bytes transferred in 2.520634 secs (1086508 bytes/sec)
  crash2# dd if=/data.file of=/dev/vinum/vinum0 bs=512 count=5349
  5349+0 records in
  5349+0 records out
  2738688 bytes transferred in 2.464483 secs (1111263 bytes/sec)
  crash2# dd if=/dev/vinum/vinum0 of=/data.file2 bs=512 count=5349
  5349+0 records in
  5349+0 records out
  2738688 bytes transferred in 2.467386 secs (1109955 bytes/sec)
  crash2# ls -l /data.f*
  -rw-r--r--  1 root  wheel  2738688 Nov 17 17:02 /data.file
  -rw-r--r--  1 root  wheel  2738688 Nov 17 17:03 /data.file2
  crash2# md5 /data.file*
  MD5 (/data.file) = ce76d17b337f70c1d4d53b48cf08f906
  MD5 (/data.file2) = b1d08e0fe52ecff364a894edf43caef2

The reason for the somewhat long copy times is that / for this box is out
of NFS.  To be sure, I ran this a second time:

  MD5 (/data.file.3) = d0c9d71cfacedc70358be028f0c346dd
  MD5 (/data.file.4) = 0ea319da8e68550c2ebf91e6b1618976

In both runs the data read back from the array does not match what was
written, so it sounds like there's a serious problem with Vinum right now.
I took a look through the vinum data structures, and I couldn't see any
obvious problems that could have stemmed from the statfs() change:
specifically, I didn't see any data structures that would have changed
size as a result of the change.  So I'm guessing it was some other
similarly timed change, but I'm not sure what.

It's interesting to observe that I didn't get the malloc failure when I
unloaded Vinum after the above tests: it appears to occur as a result of a
configuration difficulty (such as a failure to find one), and so may
actually be a red herring for the underlying problem.  Or at least, an
independent bug/feature.

I'm heading home for the day; when I get there, I'll try changing around
the testing procedure to attempt to identify what exactly is getting
corrupted in my dd tests.
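
One way to narrow down the dd results is to map each differing byte in the
written and read-back files to its 512-byte sector.  This is only a rough
sketch, not anything from the vinum tree: differing_sectors is a
hypothetical helper name, and it assumes nothing beyond the standard cmp,
awk and uniq utilities.

```shell
#!/bin/sh
# Hypothetical helper (not part of vinum): print the 0-based 512-byte
# sector numbers at which two files differ, one per line.
differing_sectors() {
    # cmp -l prints one line per differing byte: a 1-based byte offset
    # followed by the two byte values.  Convert the offset to a sector
    # number; offsets arrive in ascending order, so uniq collapses the
    # repeats within a sector.
    cmp -l "$1" "$2" 2>/dev/null |
        awk '{ print int(($1 - 1) / 512) }' |
        uniq
}
```

Running differing_sectors /data.file /data.file2 would then show whether
the mismatches are confined to one range of sectors or spread across the
whole volume.  If the three subdisks are the same size (an assumption; the
volume reports 5349 sectors in total), each would cover 1783 sectors, which
gives a rough way to tell which subdisk a bad sector falls in.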

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories