From owner-freebsd-fs@freebsd.org  Thu Jul  6 22:12:58 2017
Subject: Re: newfs returns cg 0: bad magic number
To: Bruce Evans <brde@optusnet.com.au>,
 Konstantin Belousov <kostikbel@gmail.com>
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>
References: <c98c813c-c393-9ba5-5c70-b9575fe59553@digiware.nl>
 <20170705051458.GU1935@kib.kiev.ua> <20170705154533.M1171@besplex.bde.org>
From: Willem Jan Withagen <wjw@digiware.nl>
Message-ID: <9fe3ec97-60ea-e9e6-fb65-9b163d64ac45@digiware.nl>
Date: Fri, 7 Jul 2017 00:12:49 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.2.1
MIME-Version: 1.0
In-Reply-To: <20170705154533.M1171@besplex.bde.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit

On 5-7-2017 08:55, Bruce Evans wrote:
> On Wed, 5 Jul 2017, Konstantin Belousov wrote:
> 
>> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote:
>>> Hi,
>>>
>>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0.
>>> It looks like I can:
>>>     run dd on it
>>>     gpart the disk
>>>     create a zpool on it
>>>
>>> But when I try to create a UFS file system on it, newfs complains
>>> straight from the bat.
>>>
>>> # sudo newfs -E /dev/ggate0p1
>>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment size 4096
>>>         using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes.
>>> Erasing sectors [128...2093055]
>>> super-block backups (for fsck_ffs -b #) at:
>>>  192, 523520, 1046848, 1570176
>>> cg 0: bad magic number
>>>
>>> Googling returns that this is on and off a problem with new devices, but
>>> there is no generic suggestion on how to debug this....
>>>
>>> Any/all suggestions are welcome,
>> Typically this error means that the drive returns wrong data, not the
>> bytes that were written to it and expected to be read.
> 
> This might be for writing to a nonexistent sector.  Checking for write
> errors was broken by libufs, so some write errors are only sometimes
> detected as a side effect of reading back garbage.
> 
> I use the following quick fix (the patch also fixes some style bugs).
> 
> X Index: mkfs.c
> X ===================================================================
> X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v
> X retrieving revision 1.85
> X diff -u -1 -r1.85 mkfs.c
> X --- mkfs.c    9 Apr 2004 19:58:33 -0000    1.85
> X +++ mkfs.c    7 Apr 2005 23:51:56 -0000
> X @@ -437,16 +441,19 @@
> X      if (!Nflag && Oflag != 1) {
> X -        i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, SBLOCKSIZE);
> X +        i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy,
> X +            SBLOCKSIZE);
> X          if (i == -1)
> X -            err(1, "can't read old UFS1 superblock: %s", disk.d_error);
> X -
> X +            err(1, "can't read old UFS1 superblock: %s",
> X +                disk.d_error);
> X          if (fsdummy.fs_magic == FS_UFS1_MAGIC) {
> X              fsdummy.fs_magic = 0;
> X -            bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, SBLOCKSIZE);
> X +            bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy,
> X +                SBLOCKSIZE);
> X              for (i = 0; i < fsdummy.fs_ncg; i++)
> X -                bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)),
> X -                        chdummy, SBLOCKSIZE);
> X +                bwrite(&disk,
> X +                    fsbtodb(&fsdummy, cgsblock(&fsdummy, i)),
> X +                    chdummy, SBLOCKSIZE);
> X          }
> X      }
> X -    if (!Nflag)
> X -        sbwrite(&disk, 0);
> X +    if (!Nflag && sbwrite(&disk, 0) != 0)
> X +        err(1, "sbwrite: %s", disk.d_error);
> X      if (Eflag == 1) {
> X @@ -518,4 +525,4 @@
> X      }
> X -    if (!Nflag)
> X -        sbwrite(&disk, 0);
> X +    if (!Nflag && sbwrite(&disk, 0) != 0)
> X +        err(1, "sbwrite: %s", disk.d_error);
> X      for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize)
> 
> libufs broke the error handling for the most important writes -- to
> the superblock.  Error handling is still done almost correctly in
> wtfs(), and most writes are still done using wtfs() which is now
> just a wrapper which adds error handling to libufs's bwrite(3), but
> writes to superblock are (were) now done internally by libufs's
> sbwrite(3) which (like most of libufs) is too hard to use.
> 
> Note that -current needs a slightly different fix.  Part of libufs
> being too hard to use is that it is a library so it can't just exit
> for errors.  It returns errors in the string disk.d_error and the
> fix uses that for newfs, unlike for most other calls to sbwrite(3).
> However, newfs no longer uses sbwrite(3).  It uses a wrapper
> do_sbwrite() which reduces to pwrite(2).  The wrapper doesn't set
> d_error, so it is incompatible with sbwrite(3).
> 
> This is an example that libufs is even harder to use than might first
> appear.  The version with the do_sbwrite() wrapper fixes a previous
> version which replaced bwrite(3) instead of wrapping it.  bwrite()
> in the application conflicted with bwrite(3) in libufs, since libufs
> is not designed to have its internals replaced by inconsistent parts
> like that.  Apparently, a special case is only needed for superblock
> writes, and do_sbwrite() does that, and since libufs doesn't call any
> sbwrite() function internally there is no need to replace sbwrite(3);
> sbwrite(3) is just useless for its main application.  All that the
> bwrite(3) and sbwrite(3) library functions do is handle the block
> size implicitly in a way that makes them harder to use than just
> multiplying by the block size like wtfs() used to do and do_sbwrite()
> now does.
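
To make the "just multiply by the block size" point concrete, here is a minimal
sketch of such a wrapper. It is not the actual wtfs() or do_sbwrite() code from
newfs; the name write_sectors() and its parameters are hypothetical. It only
shows the pattern: compute the byte offset explicitly, call pwrite(2), and
report both errors and short writes to the caller.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from newfs or libufs): write `size` bytes from
 * `buf` starting at sector number `bno` on the open descriptor `fd`.
 * Returns 0 on success, or an errno value on failure.
 */
static int
write_sectors(int fd, int64_t bno, size_t sectorsize, const void *buf,
    size_t size)
{
	ssize_t n;

	/* The byte offset is simply sector number times sector size. */
	n = pwrite(fd, buf, size, (off_t)bno * (off_t)sectorsize);
	if (n == -1)
		return (errno);		/* pwrite(2) failed outright */
	if ((size_t)n != size)
		return (EIO);		/* treat a short write as an I/O error */
	return (0);
}
```

A caller would check the return value after every write, for example
`if ((error = write_sectors(fd, bno, 512, buf, len)) != 0) errc(1, error, "write");`
so that a failed write is reported at once rather than discovered later by
reading back garbage.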

This is where the trouble originates:
/usr/srcs/11/src/lib/libufs/sblock.c:148
        /*
         * Write superblock summary information.
         */
        blks = howmany(fs->fs_cssize, fs->fs_fsize);
        space = (uint8_t *)disk->d_sbcsum;
        for (i = 0; i < blks; i += fs->fs_frag) {

But:

(gdb) p disk->d_sbcsum
$19 = (struct csum *) 0x0

and this pointer is later on used to write:
    for (i = 0; i < blks; i += fs->fs_frag) {
        size = fs->fs_bsize;
        if (i + fs->fs_frag > blks)
            size = (blks - i) * fs->fs_fsize;
        if (bwrite(disk, fsbtodb(fs, fs->fs_csaddr + i), space, size)
            == -1) {
            ERROR(disk, "Failed to write sb summary information");
            return (-1);
        }
        space += size;
    }
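
To see which write sizes this loop produces, here is a small standalone sketch
of the same arithmetic. The struct fs_geom type and the function
summary_write_sizes() are made up for illustration; only the field names
mirror struct fs. The value fs_cssize = 4096 used in the note below is my
assumption for this file system (4 cylinder groups of summary data rounded up
to one 4096-byte fragment), not something read from the debugger.

```c
#include <stddef.h>
#include <stdint.h>

/* Same rounding-up division as the howmany() macro. */
#define HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

/* Hypothetical stand-in for the few struct fs fields the loop reads. */
struct fs_geom {
	int32_t fs_cssize;	/* size of the summary area, in bytes */
	int32_t fs_bsize;	/* block size, in bytes */
	int32_t fs_fsize;	/* fragment size, in bytes */
	int32_t fs_frag;	/* fragments per block */
};

/*
 * Record the size of each write the summary loop would issue.
 * Stores at most `max` sizes in `sizes` and returns how many were stored.
 */
static size_t
summary_write_sizes(const struct fs_geom *fs, int32_t *sizes, size_t max)
{
	int32_t blks = HOWMANY(fs->fs_cssize, fs->fs_fsize);
	size_t n = 0;

	for (int32_t i = 0; i < blks && n < max; i += fs->fs_frag) {
		int32_t size = fs->fs_bsize;

		/* The last chunk may cover less than a full block. */
		if (i + fs->fs_frag > blks)
			size = (blks - i) * fs->fs_fsize;
		sizes[n++] = size;
	}
	return (n);
}
```

With fs_cssize = 4096, fs_bsize = 32768, fs_fsize = 4096 and fs_frag = 8,
blks is 1 and the loop issues a single write of 4096 bytes.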

But bwrite() returns an error because the underlying pwrite() is asked to
write 4096 bytes from a NULL pointer, and that fails.
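
The failure can be reproduced outside libufs. The sketch below is a
hypothetical test helper, nothing from the tree: it writes to a scratch file
from a caller-supplied buffer and returns the errno that pwrite(2) reports.
The file name template and the function name pwrite_errno() are my own
choices. As far as I know, FreeBSD and Linux both report EFAULT for a NULL
source buffer.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes from `buf` at offset 0 of a
 * scratch file.  Returns 0 on success, the errno value set by pwrite(2)
 * on failure, or -1 if the scratch file cannot be created.
 */
static int
pwrite_errno(const void *buf, size_t len)
{
	char path[] = "/tmp/pwrite_null.XXXXXX";
	int fd, result = 0;

	fd = mkstemp(path);
	if (fd == -1)
		return (-1);
	unlink(path);		/* the file goes away when fd is closed */

	/* Save errno before close(2) gets a chance to change it. */
	if (pwrite(fd, buf, len, 0) == -1)
		result = errno;
	close(fd);
	return (result);
}
```

Calling pwrite_errno(NULL, 4096) mimics the write that sbwrite() attempts when
d_sbcsum is NULL; calling it with a real 4096-byte buffer should return 0.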

Now the question is: why isn't d_sbcsum filled in?
Note that the disk is filled with random data.

I've been looking for quite some time, but I just don't get it.
Where should the superblock come from if a whole disk is being used?
(So there is no MBR or gpart table written: dangerously dedicated.)

--WjW