Date:      Tue, 31 May 2016 13:03:01 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
Cc:        freebsd-bugs@freebsd.org
Subject:   Re: [Bug 189513] bsdlabel fails
Message-ID:  <20160531122254.Q1052@besplex.bde.org>
In-Reply-To: <bug-189513-8-8apsMYGYA3@https.bugs.freebsd.org/bugzilla/>
References:  <bug-189513-8@https.bugs.freebsd.org/bugzilla/> <bug-189513-8-8apsMYGYA3@https.bugs.freebsd.org/bugzilla/>

On Mon, 30 May 2016 a bug hiding system wrote:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=189513
>
> software@bertram-scharpf.de changed:
>
>           What    |Removed                     |Added
> ----------------------------------------------------------------------------
>         Resolution|---                         |FIXED

I see no fix here.

>             Status|In Progress                 |Closed
>
> --- Comment #1 from software@bertram-scharpf.de ---
> bsdlabel is deprecated. Use gpart now.

No thanks.  Even bsdlabel is partly broken for me.  I use my version
of FreeBSD with disklabel to do critical labeling.

However, I debugged a problem in bsdlabel that looks like the same one.
Here is my mail to mav@ on 3 May 2015 about that (with typos unfixed
and X quoting botched by mail programs):

Y I recently started using ahci, and a couple of days later got these alarming
Y errors (and long timeouts with the disk LED on) from the current version of
Y bsdlabel:
Y 
Y X ahcich0: Timeout on slot 4 port 0
Y X ahcich0: is 00000000 cs 007ffff0 ss 007ffff0 rs 007ffff0 tfd 40 serr 00000000 cmd 0000c317
Y X (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 10 40 46 00 40 00 00 00 00 00 00
Y X (ada0:ahcich0:0:0:0): CAM status: Command timeout
Y X (ada0:ahcich0:0:0:0): Retrying command
Y X ahcich0: Timeout on slot 25 port 0
Y X ahcich0: is 00000000 cs fe00001f ss fe00001f rs fe00001f tfd 40 serr 00000000 cmd 0000d817
Y X (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 04 20 f6 00 40 00 00 00 00 00 00
Y X (ada0:ahcich0:0:0:0): CAM status: Command timeout
Y X (ada0:ahcich0:0:0:0): Retrying command
Y 
Y The data is read correctly despite this (typically the read() that causes
Y this returns success and the next disk i/o has the error).
Y 
Y I feared it was a fairly new disk going bad, but no, old versions of
Y bsdlabel and disklabel don't have this bug (they have others) and
Y changing the alignment seems to fix it in tests.
Y 
Y Nearly-minimal test program:
Y 
Y X #include <fcntl.h>
Y X #include <stdio.h>
Y X #include <stdlib.h>
Y X #include <unistd.h>
Y X
Y X #define	OFF	0x1
Y X #define	SIZE	0x2000
Y X
Y X char buf[0xfff + SIZE] __aligned(4096);
Y X
Y X int
Y X main(int argc, char **argv)
Y X {
Y X 	int fd, n, off;
Y X
Y X 	/* off is the misalignment of the read buffer within its page. */
Y X 	off = (argc == 1) ? OFF : atoi(argv[1]) & 0xfff;
Y X 	fd = open("/dev/ada0", O_RDONLY);
Y X 	printf("open %d\n", fd);
Y X 	n = read(fd, buf + off, SIZE);
Y X 	printf("read %d\n", n);
Y X 	close(fd);
Y X 	sync();
Y X }
Y 
Y All even alignments tested worked, but most odd alignments tested had
Y the error.
Y 
Y bsdlabel has the following:
Y 
Y X static u_char   bootarea[BBSIZE];
Y 
Y gcc aligns large char arrays, but clang apparently doesn't, and the
Y alignment of this is odd in all i386 binaries that I looked at
Y (debugging bsdlabel didn't accidentally fix the alignment, but the same
Y i/o's in the test program didn't show the problem until I broke the
Y alignment there).
Y 
Y System info in attached dmesg.  It is a z97 system with not much i/o.

mav@ understood the problem, but didn't seem to fix it.  However, it seems
to be fixed now.
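
For a tool that has to run on a system without a driver fix, the alignment
tests above suggest one userland workaround: never hand the device an
unaligned buffer.  The following is only a minimal sketch of that idea, not
the change that went into FreeBSD.  The names IO_ALIGN, is_aligned() and
read_aligned() are hypothetical, and the 4096-byte alignment is an assumption
borrowed from the __aligned(4096) in the test program, not a value taken from
the ahci driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define	IO_ALIGN	4096	/* assumed alignment, not a driver constant */

/* Return nonzero if p is aligned to align, which must be a power of 2. */
static int
is_aligned(const void *p, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (((uintptr_t)p & (align - 1)) == 0);
}

/*
 * Hypothetical helper: like read(), but if dst is not IO_ALIGN-aligned,
 * read into an aligned bounce buffer and copy the result to dst.
 */
static ssize_t
read_aligned(int fd, void *dst, size_t len)
{
	void *bounce;
	ssize_t n;

	if (is_aligned(dst, IO_ALIGN))
		return (read(fd, dst, len));
	if (posix_memalign(&bounce, IO_ALIGN, len) != 0)
		return (-1);
	n = read(fd, bounce, len);
	if (n > 0)
		memcpy(dst, bounce, (size_t)n);
	free(bounce);
	return (n);
}
```

In the test program this would replace read(fd, buf + off, SIZE) with
read_aligned(fd, buf + off, SIZE).  The simpler alternative for a static
buffer such as bootarea[] is to declare it with an explicit alignment so
that the compiler's default does not matter.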

Try turning off ahci in the BIOS for a quick fix.

I know of 2 other non-old bugs in bsdlabel:
- some versions can't handle aliases like ad4 for ada0.  This is fixed in the
   current version.
- the current version can't write labels.  It says to use gpart, and I
   say "no thanks".  Even the geom debugflags hack doesn't fix this.

I know of 4 large old bugs in bsdlabel:
- support for i/o without ioctls (-r) is broken (-r has no effect)
- error handling when an ioctl fails is bad (usually it is to exit).  This,
   combined with the previous bug and similar bugs in fdisk, makes FreeBSD
   disk utilities less portable to FreeBSD systems than non-FreeBSD disk
   utilities are, since the others tend to have an option or fallback to
   access the disk directly and enough parameters to replace the ioctls.
- bsdlabel -[A]e erases metadata when not started with -A
- wildcard handling for numbers is broken in too many cases.  This is
   inherited from disklabel.  I only fixed a couple of critical cases in
   my version of disklabel.
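
The fallback behaviour described for the other utilities can be sketched
roughly as follows.  This is an illustration of the pattern only, not code
from bsdlabel, fdisk or any other tool.  struct disk_params, probe_fn,
probe_unsupported() and get_disk_params() are all hypothetical names, and a
real utility would need more parameters than the two shown.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical parameter set; a real tool needs more than this. */
struct disk_params {
	unsigned int	sector_size;	/* bytes per sector */
	off_t		media_size;	/* bytes on the device */
};

/* A probe wraps whatever ioctls the platform offers; 0 means success. */
typedef int (*probe_fn)(int fd, struct disk_params *out);

/* Stand-in for a platform where the ioctls are missing or fail. */
static int
probe_unsupported(int fd, struct disk_params *out)
{
	(void)fd;
	(void)out;
	return (-1);
}

/*
 * Fill *out from the probe if it works, else from user-supplied values
 * (for example, parsed from command-line options).  Return -1 only when
 * neither source is usable, instead of exiting on the first failed ioctl.
 */
static int
get_disk_params(int fd, probe_fn probe, const struct disk_params *user,
    struct disk_params *out)
{
	if (probe != NULL && probe(fd, out) == 0)
		return (0);
	if (user != NULL && user->sector_size != 0 && user->media_size > 0) {
		*out = *user;
		return (0);
	}
	return (-1);
}
```

With parameters obtained this way, the utility can go on to read() and
write() the device directly, which is what an option like -r is supposed
to allow.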

Bruce


