From owner-freebsd-current@FreeBSD.ORG Tue Feb 23 22:44:35 2010
From: John Baldwin
To: Brandon Gooch
Cc: freebsd-current@freebsd.org
Date: Tue, 23 Feb 2010 17:40:46 -0500
Subject: Re: ZFS boot problems with memory > 1MB
Message-Id: <201002231740.46478.jhb@freebsd.org>
In-Reply-To: <179b97fb1002231404x1b5fce88v1d76450cc70473a1@mail.gmail.com>
References: <747dc8f31002220835g481b0baeqb1d6df32a79b7da2@mail.gmail.com> <201002231603.36500.jhb@freebsd.org> <179b97fb1002231404x1b5fce88v1d76450cc70473a1@mail.gmail.com>
List-Id: Discussions about the use of FreeBSD-current

On Tuesday 23 February 2010 5:04:03 pm Brandon Gooch wrote:
> On Tue, Feb 23, 2010 at 3:03 PM, John Baldwin wrote:
> > On Tuesday 23 February 2010 3:36:19 pm Brandon Gooch wrote:
> >> On Tue, Feb 23, 2010 at 1:01 PM, John Baldwin wrote:
> >> > On Tuesday 23 February 2010 12:36:31 pm Brandon Gooch wrote:
> >> >> On Tue, Feb 23, 2010 at 10:24 AM, John Baldwin wrote:
> >> >> > On Tuesday 23 February 2010 10:28:49 am Brandon Gooch wrote:
> >> >> >> On Tue, Feb 23, 2010 at 7:29 AM, Andriy Gapon wrote:
> >> >> >> > on 23/02/2010 13:18 Renato Botelho said the following:
> >> >> >> >> On Mon, Feb 22, 2010 at 7:35 PM, Chris Hedley wrote:
> >> >> >> > [snip]
> >> >> >> >>> Do you have USB legacy support enabled in your BIOS? I'm not sure if
> >> >> >> >>> there's an option for the loader to use USB devices natively, but the BIOS's
> >> >> >> >>> legacy option where it provides AT/PS2 emulation is probably the easiest way
> >> >> >> >>> to get the keyboard working.
> >> >> >> >>
> >> >> >> >> Yes, I do, but it seems to be a regression on FreeBSD itself, I had this problem
> >> >> >> >> in the past and I checked the same things i need to check in the past again and
> >> >> >> >> everything is fine.
> >> >> >> >
> >> >> >> > A more precise way to state that would be "a regression in FreeBSD boot/loader".
> >> >> >> > I think that you are referring to the issue that was fixed by r189017.
> >> >> >> > It might be worthwhile investigating what was done in that revision and what
> >> >> >> > happened in sys/boot code since then.
> >> >> >> >
> >> >> >> > One possibility is that your BIOS uses memory above 1MB for USB emulation, but
> >> >> >> > doesn't mark that memory as used in system memory map. In that case that memory
> >> >> >> > could be overwritten by the loader.
> >> >> >> > If that's true then the blame is on the BIOS.
> >> >> >> > Alternatively, our code might be parsing the system memory map incorrectly.
> >> >> >> > But I am just making wild guesses here.
> >> >> >> >
> >> >> >>
> >> >> >> I don't know if it is at all related, but this commit has caused
> >> >> >> problems for me booting at least one of my machines:
> >> >> >>
> >> >> >> http://svn.freebsd.org/viewvc/base/head/sys/boot/i386/zfsboot/zfsboot.c?r1=199714&r2=200309
> >> >> >>
> >> >> >> Commit message:
> >> >> >>
> >> >> >> Revision 200309
> >> >> >> Modified Wed Dec 9 20:36:56 2009 UTC (2 months, 2 weeks ago) by jhb
> >> >> >> File length: 24893 byte(s)
> >> >> >> Diff to previous 199714
> >> >> >> - Port bios_getmem() from libi386 to {gpt,}zfsboot() and use it to
> >> >> >>   safely allocate a heap region above 1MB. This enables {gpt,}zfsboot()
> >> >> >>   to allocate much larger buffers than before.
> >> >> >> - Use a larger buffer (1MB instead of 128K) for temporary ZFS buffers. This
> >> >> >>   allows more reliable reading of compressed files in a raidz/raidz2 pool.
> >> >> >>
> >> >> >> Submitted by: Matt Reimer mattjreimer of gmail
> >> >> >> MFC after: 1 week
> >> >> >
> >> >> > Starting a new thread, which problems are you seeing with this change? ZFS is
> >> >> > a good bit more memory hungry than UFS, so it really needs to use high memory
> >> >> > for its heap. Also, I wonder if you still have problems if you use the older
> >> >> > zfsboot with the newer zfsloader? Finally, you need to use disklabel -B or
> >> >> > some such to update the zfsboot bits for this change to take effect.
> >> >> >
> >> >> > --
> >> >> > John Baldwin
> >> >> >
> >> >>
> >> >> I filed a PR so it wouldn't fall through the cracks:
> >> >>
> >> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=144234
> >> >>
> >> >> I guess I tried a combination of various revisions of bootstrap code
> >> >> and loaders when I first encountered the issue. It was when I wrote a
> >> >> recent gptzfsboot to the geom that I saw the symptoms:
> >> >>
> >> >> error 1 lba 48
> >> >> error 1 lba 1
> >> >> No ZFS pools located, can't boot
> >> >>
> >> >> I just wound up using sys/boot/i386/zfsboot/zfsboot.c revision 199714
> >> >> to build a working gptzfsboot on another system and wrote that to the
> >> >> disk to get the machine operational.
> >> >
> >> > Try this:
> >> >
> >> > Index: zfsboot.c
> >> > ===================================================================
> >> > --- zfsboot.c (revision 204207)
> >> > +++ zfsboot.c (working copy)
> >> > @@ -467,6 +467,7 @@
> >> >  static inline void
> >> >  putc(int c)
> >> >  {
> >> > +    v86.ctl = 0;
> >> >      v86.addr = 0x10;
> >> >      v86.eax = 0xe00 | (c & 0xff);
> >> >      v86.ebx = 0x7;
> >> > @@ -617,6 +618,8 @@
> >> >      off_t off;
> >> >      struct dsk *dsk;
> >> >
> >> > +    dmadat = (void *)(roundup2(__base + (int32_t)&_end, 0x10000) - __base);
> >> > +
> >> >      bios_getmem();
> >> >
> >> >      if (high_heap_size > 0) {
> >> > @@ -627,9 +630,6 @@
> >> >          heap_end = (char *) PTOV(bios_basemem);
> >> >      }
> >> >
> >> > -    dmadat = (void *)(roundup2(__base + (int32_t)&_end, 0x10000) - __base);
> >> > -    v86.ctl = V86_FLAGS;
> >> > -
> >> >      dsk = malloc(sizeof(struct dsk));
> >> >      dsk->drive = *(uint8_t *)PTOV(ARGS);
> >> >      dsk->type = dsk->drive & DRV_HARD ? TYPE_AD : TYPE_FD;
> >> > @@ -1157,6 +1157,7 @@
> >> >       * when no such key is pressed in reality. As far as I can tell,
> >> >       * this only happens shortly after a reboot.
> >> >       */
> >> > +    v86.ctl = V86_FLAGS;
> >> >      v86.addr = 0x16;
> >> >      v86.eax = fn << 8;
> >> >      v86int();
> >> >
> >> > --
> >> > John Baldwin
> >> >
> >>
> >> It still breaks:
> >>
> >> error 1 lba 48
> >> error 1 lba 1
> >> No ZFS pools located, can't boot
> >
> > Ok. Can you add a printf to zfsboot.c to print out dsk->start in the case
> > that you get an error? error 1 means that the BIOS thinks it got a bad
> > parameter, presumably in the disk packet. If you wanted to be ambitious, just
> > print out all of the fields in the packet when it fails.
> >
> > --
> > John Baldwin
> >
>
> Adding printf statements to drvread():
>
> printf("dsk->xxx: %u\n", dsk->xxx):
>
> Output:
>
> error 1 lba 48
> dsk->drive: 0
> dsk->type: 0
> dsk->unit: 0
> dsk->slice: 0
> dsk->part: 0
> dsk->init: 0
> dsk->start: 978673664

This value looks a bit high. Do you have a partition that starts at an offset
of about 466GB into the disk?

> error 1 lba 1
> dsk->drive: 0
> dsk->type: 0
> dsk->unit: 0
> dsk->slice: 0
> dsk->part: 0
> dsk->init: 0
> dsk->start: 0
> No ZFS pools located, can't boot

Sorry, I meant members of the 'packet' variable, though dsk->start is useful
to have as well.

-- 
John Baldwin