From owner-freebsd-fs@freebsd.org Wed Jul 5 09:13:55 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6A33ADA6005 for ; Wed, 5 Jul 2017 09:13:55 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from smtp.digiware.nl (smtp.digiware.nl [176.74.240.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0F28578B08 for ; Wed, 5 Jul 2017 09:13:54 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from router.digiware.nl (localhost.digiware.nl [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 7F9EE44567; Wed, 5 Jul 2017 11:13:51 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.com Received: from smtp.digiware.nl ([127.0.0.1]) by router.digiware.nl (router.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id lOM7WKFzloqu; Wed, 5 Jul 2017 11:13:50 +0200 (CEST) Received: from [192.168.10.67] (opteron [192.168.10.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id 0799744566; Wed, 5 Jul 2017 11:13:50 +0200 (CEST) Subject: Re: newfs returns cg 0: bad magic number To: Bruce Evans , Konstantin Belousov Cc: FreeBSD Filesystems References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> From: Willem Jan Withagen Message-ID: <42b0b86f-5c6b-a0f0-dea4-29f5118c0070@digiware.nl> Date: Wed, 5 Jul 2017 11:13:48 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170705154533.M1171@besplex.bde.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jul 2017 09:13:55 -0000 On 5-7-2017 08:55, Bruce Evans wrote: > On Wed, 5 Jul 2017, Konstantin Belousov wrote: > >> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote: >>> Hi, >>> >>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. >>> It looks like I can: >>> run dd on it >>> gpart the disk >>> create a zpool on it >>> >>> But when I try to create a UFS file system on it, newfs complains >>> straight from the bat. >>> >>> # sudo newfs -E /dev/ggate0p1 >>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment >>> size 4096 >>> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. >>> Erasing sectors [128...2093055] >>> super-block backups (for fsck_ffs -b #) at: >>> 192, 523520, 1046848, 1570176 >>> cg 0: bad magic number >>> >>> Googling returns that this is on and off a problem with new devices, but >>> there is no generic suggestion on how to debug this.... >>> >>> Any/all suggestions are welcome, >> Typically this error means that the drive returns wrong data, not the >> bytes that were written to it and expected to be read. > > This might be for writing to a nonexistent sector. Checking for write > errors was broken by libufs, so some write errors are only sometimes > detected as a side effect of reading back garbage. I have not tested yet, but this sounds like it could be on the track, since just dd-ing a file with random data to the image, and then running (shorted): # write test + dd 'if=/dev/urandom' 'of=/tmp/tmp.LIM3lwne/data' 'bs=1M' 'count=64' + dd 'if=/tmp/tmp.LIM3lwne/data' 'of=/dev/ggate3' 'bs=1M' [ "`dd if=${DATA} bs=1M | md5`" = "`rbd -p ${POOL} --no-progress export + [ d19e721c75d36bf2787c273768e3b5a2 '=' d19e721c75d36bf2787c273768e3b5a2 ] So unless resulats are all in a cache between userspace and the rbd-ggate, it seems that the data is correctly hitting the platers. I'll go and test if your fix is also my fix. Thanx, --WjW > I use the following quick fix (the patch also fixes some style bugs). > > X Index: mkfs.c > X =================================================================== > X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v > X retrieving revision 1.85 > X diff -u -1 -r1.85 mkfs.c > X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85 > X +++ mkfs.c 7 Apr 2005 23:51:56 -0000 > X @@ -437,16 +441,19 @@ > X if (!Nflag && Oflag != 1) { > X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X if (i == -1) > X - err(1, "can't read old UFS1 superblock: %s", disk.d_error); > X - > X + err(1, "can't read old UFS1 superblock: %s", > X + disk.d_error); > X if (fsdummy.fs_magic == FS_UFS1_MAGIC) { > X fsdummy.fs_magic = 0; > X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X for (i = 0; i < fsdummy.fs_ncg; i++) > X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X - chdummy, SBLOCKSIZE); > X + bwrite(&disk, > X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X + chdummy, SBLOCKSIZE); > X } > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X if (Eflag == 1) { > X @@ -518,4 +525,4 @@ > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) > > libufs broke the error handling for the most important writes -- to > the superblock. Error handling is still done almost correctly in > wtfs(), and most writes are still done using wtfs() which is now > just a wrapper which adds error handling to libufs's bwrite(3), but > writes to superblock are (were) now done internally by libufs's > sbwrite(3) which (like most of libufs) is too hard to use. > > Note that -current needs a slightly different fix. Part of libufs > being too hard to use is that it is a library so it can't just exit > for errors. It returns errors in the string disk.d_error and the > fix uses that for newfs, unlike for most other calls to sbwrite(3). > However, newfs no longer uses sbwrite(3). It uses a wrapper > do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set > d_error, so it is incompatible with sbwrite(3). > > This is an example that libufs is even harder to use than might first > appear. The version with the do_sbwrite() wrapper fixes a previous > version which replaced bwrite(3) instead of wrapping it. bwrite() > in the application conflicted with bwrite(3) in libufs, since libufs > is not designed to have its internals replaced by inconsistent parts > like that. Apparently, a special case is only needed for superblock > writes, and do_sbwrite() does that, and since libufs doesn't call any > sbwrite() function internally there is no need to replace sbwrite(3); > sbwrite(3) is just useless for its main application. All that the > bwrite(3) and sbwrite(3) library functions do is handle the block > size implicitly in a way that makes them harder to use than just > multiplying by the block size like wtfs() used to do and do_sbwrite() > now does.