From: Ivan Voras
To: freebsd-geom@freebsd.org
Date: Mon, 13 Aug 2007 03:06:16 +0200
Subject: Gvirstor "newfs" problem - help needed

Hi,

I have a problem with gvirstor and it looks like I need help (or at least fresh ideas) with it. About a month ago Pawel found that he couldn't newfs huge gvirstor devices - newfs fails when rereading the superblock it wrote a moment earlier (the signature it reads back apparently doesn't match). I confirmed this and found that it also happens on not-so-large gvirstor devices. One specific case is a device of 1100000 MB with 2048 KB chunks.

This is what's happening:

- The following is the dump of IO operations from newfs. It first writes some data, probably including the superblock:

Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Mapped W(1153467153920, 512) to (550015,2096640,512)

(Note: the above log entry says someone is writing 512 bytes at virtual offset 1153467153920, and that this is mapped to virtual chunk #550015, in-chunk offset 2096640.)

Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Allocated chunk 1 on da0 for virstor/bla
Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Mapped R(8192, 8192) to (0,8192,8192)
Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Mapped W(65536, 8192) to (0,65536,8192)
Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Mapped W(81920, 65536) to (0,81920,65536)
Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Mapped W(192774144, 65536) to (91,1933312,65536)

The important operation is the second-to-last one: a write of 65536 bytes at offset 81920.

- Following this, there's a long list of 65536-byte writes of UFS metadata, scattered across the device at increasing offsets. They are not important here.
This IO looks like:

Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Allocated chunk 3 on da0 for virstor/bla
Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Mapped W(385466368, 65536) to (183,1687552,65536)
Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Allocated chunk 4 on da0 for virstor/bla
Jul 18 05:12:01 gv7 kernel: GEOM_VIRSTOR[10]: Mapped W(578158592, 65536) to (275,1441792,65536)

- Finally, newfs reads some of the metadata it has written at the start:

Jul 18 05:12:45 gv7 kernel: GEOM_VIRSTOR[10]: Mapped R(98304, 16384) to (0,98304,16384)

THIS request returns something newfs doesn't like. The data it reads lies within the 65536-byte range that newfs wrote at offset 81920.

BUT, I cannot reproduce this with my test programs. I made a test program specifically to s(t)imulate this exact behaviour. It writes the block as written by newfs and then reads subblocks from it, eventually creating the same situation:

Jul 18 05:14:57 gv7 kernel: GEOM_VIRSTOR[10]: Mapped W(81920, 65536) to (0,81920,65536)
...
Jul 18 05:14:57 gv7 kernel: GEOM_VIRSTOR[10]: Mapped R(98304, 16384) to (0,98304,16384)

The data written is generated by /dev/random and compared byte for byte with the data read back. There are no errors. Between the write and read operations, the test program writes to random high locations on the drive, allocating new chunks like newfs does.

I've also made several more-or-less generic tests that exercise "edge" cases such as IO on chunk borders, random IO, etc., and all of them pass.

The development tree is at http://ivoras.sharanet.org/stuff/gvirstor-devel.tgz . This tree works only on 7-CURRENT, and has a lot of debugging messages turned on. To run it, run "make", "make so" and "make tests", load the resulting .ko, and make a symlink of the .so file into /lib/geom.
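
For anyone checking the log lines quoted above by hand: the (chunk, in-chunk offset) pairs are consistent with plain integer division of the virtual offset by the chunk size. The following is only a sketch of that arithmetic, not gvirstor code. It assumes that "2048 KB" means 2097152 bytes, and the names map_offset and within_one_chunk are made up for illustration.

```python
# Assumption: a 2048 KB chunk is 2048 * 1024 = 2097152 bytes.
CHUNK_SIZE = 2048 * 1024


def map_offset(virtual_offset, chunk_size=CHUNK_SIZE):
    """Split a virtual byte offset into (chunk index, offset inside the chunk)."""
    return divmod(virtual_offset, chunk_size)


def within_one_chunk(virtual_offset, length, chunk_size=CHUNK_SIZE):
    """True if the request [offset, offset + length) does not cross a chunk border."""
    first_chunk, _ = map_offset(virtual_offset, chunk_size)
    last_chunk, _ = map_offset(virtual_offset + length - 1, chunk_size)
    return first_chunk == last_chunk


# (virtual offset, length) -> (chunk, in-chunk offset), copied from the log lines.
LOGGED = [
    ((1153467153920, 512), (550015, 2096640)),
    ((8192, 8192), (0, 8192)),
    ((65536, 8192), (0, 65536)),
    ((81920, 65536), (0, 81920)),
    ((192774144, 65536), (91, 1933312)),
    ((385466368, 65536), (183, 1687552)),
    ((578158592, 65536), (275, 1441792)),
    ((98304, 16384), (0, 98304)),
]


def mismatches(entries=LOGGED):
    """Return the logged entries whose mapping differs from plain division."""
    return [(req, seen) for req, seen in entries if map_offset(req[0]) != seen]


if __name__ == "__main__":
    print("mismatching log entries:", mismatches())
    # The failing read and the earlier write both stay inside chunk 0.
    print("W(81920, 65536) in one chunk:", within_one_chunk(81920, 65536))
    print("R(98304, 16384) in one chunk:", within_one_chunk(98304, 16384))
```

By this arithmetic, both the write at 81920 and the later read at 98304 sit entirely inside chunk 0, and the read range (98304 to 114688) is fully covered by the write range (81920 to 147456).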

To provoke the bug, create a virstor device like this:

./gvirstor label -s 1100000 -m 2048 bla /dev/da0

then simply run "newfs" on it:

newfs /dev/virstor/bla

To make this work, you'll need lots of space on the physical device (da0 in the above example), something like 50 GB. After newfs finishes writing cgs all over the device, it will read the first superblock and fail to verify the signature. I've also tried this with ext2's mkfs and fsck, and the same symptom happens, so I don't think there's a bug in newfs:

cg 0: bad magic number

There are now 5 small test programs in the above source tgz, but all of the tests pass without problems. I'm out of ideas currently, and I'm looking for ideas and advice on how to proceed (I'd like to provoke this bug in a small, manageable test case).
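
As a starting point for a small test case, the write-then-read-back pattern described in this message could be sketched roughly as below. This is not one of the test programs from the tgz. It is an illustrative sketch that assumes os.urandom in place of /dev/random and sector-aligned 65536-byte filler writes, and the function name and parameters are hypothetical. By default it is meant to run against an ordinary file; pointing it at /dev/virstor/bla is left to whoever tries it.

```python
import os
import random
import tempfile

WRITE_OFF, WRITE_LEN = 81920, 65536   # W(81920, 65536) from the log
READ_OFF, READ_LEN = 98304, 16384     # R(98304, 16384) from the log
SECTOR = 512                          # assumption: keep filler writes sector-aligned


def write_then_read_back(path, device_size, extra_writes=8, seed=0):
    """Write a random block, scatter filler writes, re-read a sub-block.

    Returns True if the sub-block read back equals the bytes written.
    """
    rng = random.Random(seed)
    payload = os.urandom(WRITE_LEN)
    fd = os.open(path, os.O_RDWR)
    try:
        os.pwrite(fd, payload, WRITE_OFF)
        # Filler writes at random offsets above the payload, so that new
        # chunks would get allocated between the write and the read.
        lo = (WRITE_OFF + WRITE_LEN) // SECTOR
        hi = (device_size - WRITE_LEN) // SECTOR
        for _ in range(extra_writes):
            os.pwrite(fd, os.urandom(WRITE_LEN), rng.randrange(lo, hi) * SECTOR)
        got = os.pread(fd, READ_LEN, READ_OFF)
    finally:
        os.close(fd)
    start = READ_OFF - WRITE_OFF
    return got == payload[start:start + READ_LEN]


if __name__ == "__main__":
    size = 64 * 1024 * 1024
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.truncate(size)            # sparse stand-in for the real device
        path = tmp.name
    try:
        print("read-back matches:", write_then_read_back(path, size))
    finally:
        os.unlink(path)
```

A variant that replays the exact offsets and sizes from a captured newfs log, rather than random filler writes, might come closer to the failing sequence.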