Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 28 Jan 2011 11:35:55 -0800
From:      Matthew Fleming <mdf356@gmail.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Divide-by-zero in loader
Message-ID:  <AANLkTi=OsECq3XDP3sn_egNAQLTHxedqW0q2iyd0OuSk@mail.gmail.com>
In-Reply-To: <201101281423.02701.jhb@freebsd.org>
References:  <AANLkTikxWagpLk-qFiEBLsx_S4vXkzCU57kht%2BF%2BcaC-@mail.gmail.com> <201101281400.01840.jhb@freebsd.org> <AANLkTinHdZV3vYn25bHq7z_zzhKQM2z-Y8%2BsE%2BEz-fCP@mail.gmail.com> <201101281423.02701.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Jan 28, 2011 at 11:23 AM, John Baldwin <jhb@freebsd.org> wrote:
> On Friday, January 28, 2011 2:14:45 pm Matthew Fleming wrote:
>> On Fri, Jan 28, 2011 at 11:00 AM, John Baldwin <jhb@freebsd.org> wrote:
>> > On Friday, January 28, 2011 12:41:08 pm Matthew Fleming wrote:
>> >> I spent a few days chasing down a bug and I'm wondering if a loader
>> >> change would be appropriate.
>> >>
>> >> So we have these new front-panel LCDs, and like everything these days
>> >> it's a SoC. =A0Normally it presents to FreeBSD as a USB communication=
s
>> >> device (ucom), but when the SoC is sitting in its own boot loader, it
>> >> presents as storage (umass). =A0If the box is rebooted in this state,
>> >> the reboot gets into /boot/loader and then reboots itself. =A0(It too=
k a
>> >> few days just to figure out I was getting into /boot/loader, since th=
e
>> >> only prompt I could definitively stop at was boot2).
>> >>
>> >> Anyways, I eventually debugged it to the device somehow presenting
>> >> itself to /boot/loader with a geometry of 1024/256/0, and since od_se=
c
>> >> is 0 that causes a divide-by-zero error in bd_io() while the loader i=
s
>> >> trying to figure out if this is GPT or MBR formatted. =A0We're still
>> >> trying to figure out why the loader sees this incorrect geometry.
>> >>
>> >> But meanwhile, this patch fixes the issue, and I wonder if it would b=
e
>> >> a useful safety-belt for other devices where an incorrect geometry ca=
n
>> >> be seen?
>> >
>> > That's probably fine. =A0A sector count of zero is invalid for CHS. =
=A0However,
>> > probably we should not even be using C/H/S at all if the device claims=
 to
>> > support EDD. =A0We already use raw LBAs if it supports EDD, and we sho=
uld
>> > probably just ignore C/H/S altogether if it supports EDD.
>>
>> This is all almost entirely outside my knowledge, but at the moment
>> bd_eddprobe() requres a geometry of 1023/255/63 before it attempts to
>> check if EDD can be used. =A0Is that check incorrect?
>
> Well, it is very conservative in that it only uses EDD if it thinks it ca=
n't
> use C/H/S. =A0It would be interesting to see if simply checking for a sec=
tor
> count of 0 there would avoid the divide-by-zero and let your device "work=
".

I'll need a few more pointers to try this.

As structured today, bd_io is what does the divide-by-zero, and it
always looks at od_sec to determine the maximum transfer size.  It's
only later in bd_io that it checks the EDD mode to decide how to read.

	/*
	 * Play it safe and don't cross track boundaries.
	 * (XXX this is probably unnecessary)
	 */
	sec =3D dblk % od->od_sec;	/* offset into track */
	x =3D min(od->od_sec - sec, resid);
	if (maxfer > 0)
	    x =3D min(x, maxfer);		/* fit bounce buffer */

However, I suppose this may be safe-ish to do as just:

	x =3D resid;
	if (maxfer > 0 && maxfer < resid)
	    x =3D maxfer;		/* fit bounce buffer */

In which case the only uses of od_sec is now in bd_chs_io, and some printf'=
s.

Though note that currently also the bd_eddprobe is only done in the
bd_open_mbr() step which is only if the GPT reads showed it isn't a
GPT device.  I suppose that the EDD probe can be part of bd_getgeom,
though.

I'll try this and see what happens on my device.

Thanks,
matthew

> However, it might actually be useful to always use EDD if possible, esp.
> EDD3 since that lets you not use bounce buffers down in 1MB.
>
>> In my specific case I know there's no bootable stuff on this disk; the
>> earlier layers bypassed it correctly without a problem.
>>
>> Thanks,
>> matthew
>>
>
> --
> John Baldwin
>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?AANLkTi=OsECq3XDP3sn_egNAQLTHxedqW0q2iyd0OuSk>