From owner-svn-src-all@FreeBSD.ORG Sun Feb 13 12:31:01 2011 Return-Path: Delivered-To: svn-src-all@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 975D5106566B; Sun, 13 Feb 2011 12:31:01 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au [211.29.132.190]) by mx1.freebsd.org (Postfix) with ESMTP id F30DC8FC1B; Sun, 13 Feb 2011 12:31:00 +0000 (UTC) Received: from c122-107-114-89.carlnfd1.nsw.optusnet.com.au (c122-107-114-89.carlnfd1.nsw.optusnet.com.au [122.107.114.89]) by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p1DCUu7F015281 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 13 Feb 2011 23:30:57 +1100 Date: Sun, 13 Feb 2011 23:30:56 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Konstantin Belousov In-Reply-To: <201102121317.p1CDHE9v002742@svn.freebsd.org> Message-ID: <20110213215436.L1544@besplex.bde.org> References: <201102121317.p1CDHE9v002742@svn.freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org Subject: Re: svn commit: r218604 - head/sbin/fsck_ffs X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 13 Feb 2011 12:31:01 -0000 On Sat, 12 Feb 2011, Konstantin Belousov wrote: > Log: > In checker, read journal by sectors. > > Due to UFS insistence to pretend that device sector size is 512 bytes, Do you mean that ffs_fsck does this? UFS doesn't exist, and ffs in the kernel didn't use device sectors at all until recently (it used DEV_BSIZE units, but those are not sectors but are the units for bread() and friends). newfs uses a hack to do this. When handling kernel sizes in units of DEV_BSIZE (not a sector size!), ffs_fsck should use DEV_BSIZE (or maybe just fsbtodb()) instead of `dev_bsize' or `sectorsize' and not pretend that DEV_BSIZE (or the corresponding superblock value for the conversion macro) is the sector size. > sector size is obtained from ioctl(DIOCGSECTORSIZE) for real devices, > and from the label otherwise. The file images without label have to > be made with 512 sector size. > Modified: head/sbin/fsck_ffs/fsck.h > ============================================================================== > --- head/sbin/fsck_ffs/fsck.h Sat Feb 12 13:12:45 2011 (r218603) > +++ head/sbin/fsck_ffs/fsck.h Sat Feb 12 13:17:14 2011 (r218604) > @@ -268,6 +268,7 @@ char snapname[BUFSIZ]; /* when doing sna > char *cdevname; /* name of device being checked */ > long dev_bsize; /* computed value of DEV_BSIZE */ > long secsize; /* actual disk sector size */ > +long real_dev_bsize; If `secsize' is the actual disk sector size, then we don't need another variable giving the real sector size. The new variable has no comment. What is the difference between a sector size and a dev_bsize? There is enough confusion between DEV_BSIZE and sector sizes already. DEV_BSIZE has come to means nothing to do with sectors and little to do with devices or block sizes. It is just a minimal i/o sizes for use in old interfaces (ones that didn't want to use more than 32 bits for disk addresses, but now use 64 bits anyway, so that they can address 72 bits after multiplication by DEV_BSIZE). These variables are named and documented slightly better in mkfs: % extern int sectorsize; /* bytes/sector */ % extern int realsectorsize; /* bytes/sector in hardware*/ At lease the fake sector size variable doesn't claim to be actual, and the real sector size variable doesn't have extra underscores and claims to be a sector size variable. > Modified: head/sbin/fsck_ffs/setup.c > ============================================================================== > --- head/sbin/fsck_ffs/setup.c Sat Feb 12 13:12:45 2011 (r218603) > +++ head/sbin/fsck_ffs/setup.c Sat Feb 12 13:17:14 2011 (r218604) > @@ -446,7 +446,7 @@ sblock_init(void) > if (sblk.b_un.b_buf == NULL || asblk.b_un.b_buf == NULL) > errx(EEXIT, "cannot allocate space for superblock"); > if ((lp = getdisklabel(NULL, fsreadfd))) > - dev_bsize = secsize = lp->d_secsize; > + real_dev_bsize = dev_bsize = secsize = lp->d_secsize; > else > dev_bsize = secsize = DEV_BSIZE; > } Both the variables are real enough here, provided getdisklabel() works. Otherwise, you are in trouble and should fail instead of defaulting the size unless the size is set by another means. newfs seems to fail, and I'm sure newfs_msdos does fail if the sector size is unknown. > Modified: head/sbin/fsck_ffs/suj.c > ============================================================================== > --- head/sbin/fsck_ffs/suj.c Sat Feb 12 13:12:45 2011 (r218603) > +++ head/sbin/fsck_ffs/suj.c Sat Feb 12 13:17:14 2011 (r218604) > @@ -28,6 +28,7 @@ > __FBSDID("$FreeBSD$"); > > #include > +#include > #include > #include > #include > @@ -201,6 +202,11 @@ opendisk(const char *devnam) > disk->d_error); > } > fs = &disk->d_fs; > + if (real_dev_bsize == 0 && ioctl(disk->d_fd, DIOCGSECTORSIZE, > + &real_dev_bsize) == -1) > + real_dev_bsize = secsize; > + if (debug) > + printf("dev_bsize %ld\n", real_dev_bsize); > } > > /* If this gives a different "real_dev_bsize", and this is different from the sector size in the label or defaulted-to, then the latter is presumably wrong, and we're left with a `secsize' that is not the "actual sector size". It is a bug to replace the sector size in the label with the one obtained here. The former should be correct, and it is the only one under user control. It might need to be changed if one given by the ioctl is wrong or just if it is fake. (Labels only have 32-bit disk addresses, so fake sector sizes are needed to address more than 41 bits if the physical sector size has is 512.) newfs doesn't have the precedence that I want: - sectorsize set on the command line (-S option) has precedence. [Bug; -S 0 is needed to cancel any previous setting of sectorsize on the command line by -S or maybe by -T, but it is treated as an error.] fsck_ffs unfortunately doesn't seem to have any -S or -T option. These used to be unneeded since sector size weren't used, and labels worked and contain enough info to locate the superblock where there is more info. - then try setting sectorsize using the ioctl if sectorsize is not already set. [Bug: no error checking for this ioctl. If it fails, then it may clobber sectorsize, which breaks the next step.] - then set sectorsize from the label if there is a label and sectorsize is not already set - fail if sectorsize is not set Then the hacks involving realsectorsize: - set it to sectorsize - if sectorsize != DEV_BSIZE, change it to DEV_BSIZE and fix up related quantities (I hope the change to the partition table is never written). Now `sectorsize' is no longer "actual", though it was actual before the hacks. We continue with the actual sector size recorded in realsectorsize. > @@ -2262,7 +2268,7 @@ suj_build(void) > rec = (union jrec *)seg->ss_blk; > for (i = 0; i < seg->ss_rec.jsr_cnt; off += JREC_SIZE, rec++) { > /* skip the segrec. */ > - if ((off % DEV_BSIZE) == 0) > + if ((off % real_dev_bsize) == 0) > continue; > switch (rec->rec_jrefrec.jr_op) { > case JOP_ADDREF: > ... This is all sort of backwards. DEV_BSIZE is the real DEV_BSIZE. Old code that uses units of DEV_BSIZE spells this as dev_bsize although it is constant. New code that uses units of sectors spelled this as DEV_BSIZE although the correct size is variable; now the new code spelled this as real_dev_bsize, which is variable but unrelated to dev_bsize = DEV_BSIZE :-). I think using fragments instead of sectors would work better. For a start, you can get the fragment size from the superblock and not have to worry about labels of ioctls. There is a problem locating the superblock. The kernel and various utilities only try reading the superblock with size SBSIZE (8K). This fails if the sector size is not a divisor of 8K. They also only try reading at various offsets which might not be a multiple of the sector size, but which are larger than 8K so they are more likely to be a multiple. I fixed this problem in fsck_msdosfs where it is larger (fsck_msdosfs assumes that the sector size is a divisor of 512, but sector sizes like 2K and 4K are now common relative to sizes of 16K). Hmm, the initialization of dev_bsize is even more obfuscated than I remembered: % /* % * Compute block size that the file system is based on, % * according to fsbtodb, and adjust superblock block number % * so we can tell if this is an alternate later. % */ % super *= dev_bsize; We may have needed a fake sector size to read the superblock. I forget if we got a non-fake one from the label or ioctl before here. We should have just tried all reasonable power of 2 sizes between 1 and 1M (I use 64K down to DOSBOOTBLOCKSIZE = 512 in fsck_msdosfs). % dev_bsize = sblock.fs_fsize / fsbtodb(&sblock, 1); This is an obfuscated way of spelling DEV_BSIZE. Although fsbtodb() converts to blocks with fixed size DEV_BSIZE, it needs a variable shift count to handle the variable fragment size. This shift count is fs_fsbtodb, which was initialized in newfs to ilog2(fs_fsize / sectorsize). But sectorsize was fake there (always DEV_BSIZE). Suppose for example than fs_fsize is the default (2K). Then fs_fsbtodb = ilog2(4) = 2, so fsbtodb(&sbblock, 1) is 4 and dev_bsize = 2K / 4 = DEV_BSIZE again. It is always DEV_BSIZE again, since the fake sectorsize in newfs is always DEV_BSIZE. The fakery is extended to fsck_ffs by the above assignment. The above assignment is necessary if the kernel is changed to not fake things so much, or to fake them differently (newfs can't can change it's i/o methods but can't change the generated fs_fsbtodb etc. unless the kernel changes. All it could do much better is to make the conversions between different block sizes (real and virtual either way) more explicit). However, things are already confusing enough with a fixed DEV_BSIZE. Bruce