Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 16 Dec 2018 04:17:10 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Edward Tomasz Napierala <trasz@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r342115 - head/sbin/fsck_ffs
Message-ID:  <20181216033655.G5660@besplex.bde.org>
In-Reply-To: <201812151136.wBFBaKV3058236@repo.freebsd.org>
References:  <201812151136.wBFBaKV3058236@repo.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 15 Dec 2018, Edward Tomasz Napierala wrote:

> Log:
>  Make fsck(8) use pread(2). This cuts the number of syscalls by half.
>
>  Reviewed by:	kib, mckusick

This cuts the error handling for reads by a bit more than half.

This also makes fsck use pwrite().  Cutting the error handling is almost a
feature for writes.

> Modified: head/sbin/fsck_ffs/fsutil.c
> ==============================================================================
> --- head/sbin/fsck_ffs/fsutil.c	Sat Dec 15 11:32:11 2018	(r342114)
> +++ head/sbin/fsck_ffs/fsutil.c	Sat Dec 15 11:36:20 2018	(r342115)
> @@ -576,9 +576,7 @@ blread(int fd, char *buf, ufs2_daddr_t blk, long size)
> 		slowio_start();
> 	totalreads++;
> 	diskreads++;
> -	if (lseek(fd, offset, 0) < 0)
> -		rwerror("SEEK BLK", blk);
> -	else if (read(fd, buf, (int)size) == size) {
> +	if (pread(fd, buf, (int)size, offset) == size) {
> 		if (bkgrdflag)
> 			slowio_end();
> 		return (0);

Just after here, the comment before the error handling says that the error
handling is simpler here because it is just for true read/write errors.

> @@ -595,14 +593,11 @@ blread(int fd, char *buf, ufs2_daddr_t blk, long size)
> 	} else
> 		rwerror("READ BLK", blk);
>
> -	if (lseek(fd, offset, 0) < 0)
> -		rwerror("SEEK BLK", blk);

This used to exit in many cases.  The seek error is unrecoverable.  (Usually
the previous lseek() fails before this is reached.)

> 	errs = 0;
> 	memset(buf, 0, (size_t)size);
> 	printf("THE FOLLOWING DISK SECTORS COULD NOT BE READ:");
> 	for (cp = buf, i = 0; i < size; i += secsize, cp += secsize) {
> -		if (read(fd, cp, (int)secsize) != secsize) {
> -			(void)lseek(fd, offset + i + secsize, 0);

This used to handle lseek() errors sloppily.

> +		if (pread(fd, cp, (int)secsize, offset + i) != secsize) {
> 			if (secsize != dev_bsize && dev_bsize != 1)
> 				printf(" %jd (%jd),",
> 				    (intmax_t)(blk * dev_bsize + i) / secsize,

This handles pread() sloppily.  It checks for errors, but proceeds to spew
errors for the whole block, just like the sloppy lseek() error handling
except now that was protected by 2 instances of non-sloppy lseek() error
handling for the start of the block.

> @@ -629,22 +624,16 @@ blwrite(int fd, char *buf, ufs2_daddr_t blk, ssize_t s
> 		return;
> 	offset = blk;
> 	offset *= dev_bsize;
> -	if (lseek(fd, offset, 0) < 0)
> -		rwerror("SEEK BLK", blk);
> -	else if (write(fd, buf, size) == size) {
> +	if (pwrite(fd, buf, size, offset) == size) {
> 		fsmodified = 1;
> 		return;
> 	}
> 	resolved = 0;
> 	rwerror("WRITE BLK", blk);
> -	if (lseek(fd, offset, 0) < 0)
> -		rwerror("SEEK BLK", blk);
> 	printf("THE FOLLOWING SECTORS COULD NOT BE WRITTEN:");
> 	for (cp = buf, i = 0; i < size; i += dev_bsize, cp += dev_bsize)
> -		if (write(fd, cp, dev_bsize) != dev_bsize) {
> -			(void)lseek(fd, offset + i + dev_bsize, 0);
> +		if (pwrite(fd, cp, dev_bsize, offset + i) != dev_bsize)
> 			printf(" %jd,", (intmax_t)blk + i / dev_bsize);
> -		}
> 	printf("\n");
> 	return;
> }

Not so similarly.  Everything is the same except now it is a feature to
not bail out on the first sign of a problem.  Only the error spew is bad
here.

Of course, lseek errors "can't happen".

It is interesting that writes for the error case are done with size
dev_bsize, while reads for the error case are done with size secsize.
This is very confusing.  Both sizes are initially initialized to
DEV_BSIZE, which cannot work on disks with large sectors, then
bootstrapped confusingly differently by probing for metadata.  There
is also real_dev_bsize, which suj.c sets to secsize.  The printf()s
for both claim to access SECTORs.

Writes for to attempt to recover from errors by retrying (only once
here) should use the smallest size that the hardware supports.  This
is the sector size unless sectors are virtual.  Hopefully dev_bsize ==
secsize for all writes, since it is too dangerous to write before you
know the sector size.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20181216033655.G5660>