From owner-freebsd-fs@freebsd.org Thu Jul 6 22:12:58 2017 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 97BDBD93518 for ; Thu, 6 Jul 2017 22:12:58 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from smtp.digiware.nl (smtp.digiware.nl [IPv6:2001:4cb8:90:ffff::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 46AAE81C9B for ; Thu, 6 Jul 2017 22:12:57 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from router.digiware.nl (localhost.digiware.nl [127.0.0.1]) by smtp.digiware.nl (Postfix) with ESMTP id 91A3344CAE; Fri, 7 Jul 2017 00:12:53 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.com Received: from smtp.digiware.nl ([127.0.0.1]) by router.digiware.nl (router.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kcZzME5zwK8F; Fri, 7 Jul 2017 00:12:52 +0200 (CEST) Received: from [192.168.10.67] (opteron [192.168.10.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.digiware.nl (Postfix) with ESMTPSA id 2F3BF44CAC; Fri, 7 Jul 2017 00:12:52 +0200 (CEST) Subject: Re: newfs returns cg 0: bad magic number To: Bruce Evans , Konstantin Belousov Cc: FreeBSD Filesystems References: <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org> From: Willem Jan Withagen Message-ID: <9fe3ec97-60ea-e9e6-fb65-9b163d64ac45@digiware.nl> Date: Fri, 7 Jul 2017 00:12:49 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 MIME-Version: 1.0 In-Reply-To: <20170705154533.M1171@besplex.bde.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Jul 2017 22:12:58 -0000 On 5-7-2017 08:55, Bruce Evans wrote: > On Wed, 5 Jul 2017, Konstantin Belousov wrote: > >> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote: >>> Hi, >>> >>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0. >>> It looks like I can: >>> run dd on it >>> gpart the disk >>> create a zpool on it >>> >>> But when I try to create a UFS file system on it, newfs complains >>> straight from the bat. >>> >>> # sudo newfs -E /dev/ggate0p1 >>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment >>> size 4096 >>> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes. >>> Erasing sectors [128...2093055] >>> super-block backups (for fsck_ffs -b #) at: >>> 192, 523520, 1046848, 1570176 >>> cg 0: bad magic number >>> >>> Googling returns that this is on and off a problem with new devices, but >>> there is no generic suggestion on how to debug this.... >>> >>> Any/all suggestions are welcome, >> Typically this error means that the drive returns wrong data, not the >> bytes that were written to it and expected to be read. > > This might be for writing to a nonexistent sector. Checking for write > errors was broken by libufs, so some write errors are only sometimes > detected as a side effect of reading back garbage. > > I use the following quick fix (the patch also fixes some style bugs). > > X Index: mkfs.c > X =================================================================== > X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v > X retrieving revision 1.85 > X diff -u -1 -r1.85 mkfs.c > X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85 > X +++ mkfs.c 7 Apr 2005 23:51:56 -0000 > X @@ -437,16 +441,19 @@ > X if (!Nflag && Oflag != 1) { > X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X if (i == -1) > X - err(1, "can't read old UFS1 superblock: %s", disk.d_error); > X - > X + err(1, "can't read old UFS1 superblock: %s", > X + disk.d_error); > X if (fsdummy.fs_magic == FS_UFS1_MAGIC) { > X fsdummy.fs_magic = 0; > X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > SBLOCKSIZE); > X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, > X + SBLOCKSIZE); > X for (i = 0; i < fsdummy.fs_ncg; i++) > X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X - chdummy, SBLOCKSIZE); > X + bwrite(&disk, > X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)), > X + chdummy, SBLOCKSIZE); > X } > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X if (Eflag == 1) { > X @@ -518,4 +525,4 @@ > X } > X - if (!Nflag) > X - sbwrite(&disk, 0); > X + if (!Nflag && sbwrite(&disk, 0) != 0) > X + err(1, "sbwrite: %s", disk.d_error); > X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize) > > libufs broke the error handling for the most important writes -- to > the superblock. Error handling is still done almost correctly in > wtfs(), and most writes are still done using wtfs() which is now > just a wrapper which adds error handling to libufs's bwrite(3), but > writes to superblock are (were) now done internally by libufs's > sbwrite(3) which (like most of libufs) is too hard to use. > > Note that -current needs a slightly different fix. Part of libufs > being too hard to use is that it is a library so it can't just exit > for errors. It returns errors in the string disk.d_error and the > fix uses that for newfs, unlike for most other calls to sbwrite(3). > However, newfs no longer uses sbwrite(3). It uses a wrapper > do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set > d_error, so it is incompatible with sbwrite(3). > > This is an example that libufs is even harder to use than might first > appear. The version with the do_sbwrite() wrapper fixes a previous > version which replaced bwrite(3) instead of wrapping it. bwrite() > in the application conflicted with bwrite(3) in libufs, since libufs > is not designed to have its internals replaced by inconsistent parts > like that. Apparently, a special case is only needed for superblock > writes, and do_sbwrite() does that, and since libufs doesn't call any > sbwrite() function internally there is no need to replace sbwrite(3); > sbwrite(3) is just useless for its main application. All that the > bwrite(3) and sbwrite(3) library functions do is handle the block > size implicitly in a way that makes them harder to use than just > multiplying by the block size like wtfs() used to do and do_sbwrite() > now does. This is where the trouble originates: /usr/srcs/11/src/lib/libufs/sblock.c:148 /* * Write superblock summary information. */ blks = howmany(fs->fs_cssize, fs->fs_fsize); space = (uint8_t *)disk->d_sbcsum; for (i = 0; i < blks; i += fs->fs_frag) { But: (gdb) p disk->d_sbcsum $19 = (struct csum *) 0x0 and this pointer is later on used to write: for (i = 0; i < blks; i += fs->fs_frag) { size = fs->fs_bsize; if (i + fs->fs_frag > blks) size = (blks - i) * fs->fs_fsize; if (bwrite(disk, fsbtodb(fs, fs->fs_csaddr + i), space, size) == -1) { ERROR(disk, "Failed to write sb summary information"); return (-1); } space += size; } But the bwrite returns error because the called pwrite() tries to write 4096 bytes from a null pointer. And that it does not like. Now the question is: why isn't d_sbcsum not filled out? Note that the disk is filled with random data. I've been looking for quite some time, but I just don't get it. Where should the superblock come from if a whole disk is being used? (so there no MBR or gpart written. Dangerously dedicated) --WjW