Date:      Wed, 5 Dec 2018 01:48:00 +0200
From:      Toomas Soome <tsoome@me.com>
To:        Mark Martinec <Mark.Martinec+freebsd@ijs.si>
Cc:        freebsd-current <freebsd-current@freebsd.org>, freebsd-stable@freebsd.org, Ian Lepore <ian@freebsd.org>
Subject:   Re: Boot loader stuck after first stage upgrading 11.2 to 12.0-RC2
Message-ID:  <EC8DD049-8BBE-4E96-A68B-A2846CED00BA@me.com>
In-Reply-To: <1543954753.1860.243.camel@freebsd.org>
References:  <22f5b92a09ea4d62ac3feb74457067f7@ijs.si> <5EEBAFC0-4FA3-4219-A918-7376F4223656@me.com> <f2737ffb236d39761767aa10a603c084@ijs.si> <0F5FCC70-EADB-4F9E-A391-F1A73BE5608F@me.com> <dc762bdf408c92daae826425fdba98d9@ijs.si> <B3C7194D-93B8-406B-9E8E-BA55D49D657A@me.com> <1543954753.1860.243.camel@freebsd.org>

Yes, that must be true, but it does not hurt to check.

And of course, lsdev -v from 11.x loader would be good too.

Anyhow, I am afraid we have reached the point where more specific debug info
is needed (printed out). With no output about disks at all, it must be
related to the floppy device checks.
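
For illustration, the kind of printed debug info meant here could look like the sketch below: a trace line before and after each dv_init call in the device switch probing loop that Ian describes in the quoted message. This is not the actual loader source. struct devsw_sketch, the stub init functions, sketch_devsw and probe_devices() are hypothetical names made up for the example; only the devsw and dv_init terms come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for a device switch entry (assumption, not the real struct). */
struct devsw_sketch {
    const char *dv_name;
    int (*dv_init)(void);   /* NULL when the device needs no init */
};

/* Hypothetical stubs standing in for real probe routines. */
static int init_ok(void)   { return 0; }
static int init_fail(void) { return -1; }

static struct devsw_sketch sk_disk = { "disk", init_ok };
static struct devsw_sketch sk_none = { "none", NULL };
static struct devsw_sketch sk_pxe  = { "pxe",  init_fail };
static struct devsw_sketch sk_zfs  = { "zfs",  init_ok };

/* NULL-terminated table, walked in order like a device switch. */
static struct devsw_sketch *sketch_devsw[] = {
    &sk_disk, &sk_none, &sk_pxe, &sk_zfs, NULL
};

/*
 * March through the table and call every dv_init, printing a line before
 * and after each call. If one call never returns, the last "probing"
 * line names the device that hung. Returns the number of dv_init calls.
 */
static int probe_devices(struct devsw_sketch **table, FILE *log)
{
    int called = 0;

    assert(table != NULL && log != NULL);
    for (int i = 0; table[i] != NULL; i++) {
        if (table[i]->dv_init == NULL)
            continue;
        fprintf(log, "probing %s...\n", table[i]->dv_name);
        fflush(log);    /* make sure the line is out before a possible hang */
        int rc = table[i]->dv_init();
        fprintf(log, "%s: dv_init returned %d\n", table[i]->dv_name, rc);
        called++;
    }
    return called;
}
```

Calling probe_devices(sketch_devsw, stdout) walks the four entries, skips the one without an init routine, and reports each of the other three in turn.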

Rgds,
Toomas


> On 4 Dec 2018, at 22:19, Ian Lepore <ian@freebsd.org> wrote:
>
> On Tue, 2018-12-04 at 21:51 +0200, Toomas Soome via freebsd-stable
> wrote:
>>
>>>
>>> On 4 Dec 2018, at 19:59, Mark Martinec <Mark.Martinec+freebsd@ijs.si> wrote:
>>>
>>>>
>>>>>
>>>>> 2018-11-29 18:43, Toomas Soome wrote:
>>>>>>
>>>>>> I just did push biosdisk updates to stable/12, I wonder if
>>>>>> you could test those bits…
>>> Myself wrote:
>>>>
>>>>>
>>>>> Thank you!  I haven't tried it yet, but I wonder whether this
>>>>> fix was
>>>>> already incorporated into 12.0-RC3, which would make my rescue
>>>>> easier.
>>>>> Otherwise I can build a stable/12 on another host and
>>>>> transplant
>>>>> the problematic file(s) to the affected host - if I knew which
>>>>> files
>>>>> to copy.
>>> 2018-12-02 18:59, Toomas wrote:
>>>>
>>>> The files are /boot/loader* binaries - to be exact, check which
>>>> one is
>>>> linked to /boot/loader. I can provide binaries if needed.
>>>> [...]
>>>> rgds,
>>>> toomas
>>> I got a maintenance window today so I tried with the new loader,
>>> and it did not help.
>>>
>>> More specifically:
>>>
>>> As it comes with 12-RC2, the /boot/loader was hard linked with
>>> loader_lua.
>>> Its size is 421888 bytes. So I concentrated on this loader.
>>>
>>> I built a fresh stable/12 on another host, and copied the newly
>>> built loader_lua (425984 bytes) to the /boot directory of the
>>> affected host, deleted the file 'loader', and hard-linked
>>> loader_lua to loader.
>>>
>>> The situation has not changed: the BTX loader lists all BIOS drives
>>> C..J (disk0..disk7), then a spinner starts and gets stuck forever.
>>> It never reaches the 'BIOS 635kB/3537856kB available memory' line.
>>>
>>> While trying to restore the old /boot from 11.2, I tried booting
>>> a live image from a 12.0-RC3 memory stick - and the loader got
>>> stuck again, same as when booting from a disk.
>>>
>>> So I had to boot from an 11.2 memstick to be able to regain
>>> control.
>>>
>>>  Mark
>>>
>>>
>> ok, if you could perform 2 tests:
>>
>> 1. from loader prompt enter 0x413 0xa000 - @w . cr
>>
>> 2. on first spinner, press space and type on boot: prompt:
>> /boot/loader_4th and see if that will do better
>> thanks,
>> toomas
>>
>
> I don't think that will be an option.  If it hasn't gotten to the point
> of saying how much BIOS available memory there is, it's only halfway
> through loader main() and has hung before getting to interact().
>
> In fact, if that line hasn't printed, but some disk drives have been
> listed, it pretty much has to be hung in the "March through the device
> switch probing for things" loop. If all the disks are listed, then it
> got through that entry in the devsw, and is likely hanging in the
> dv_init calls for either the pxedisk or zfsdev devices.
>
> -- Ian
>
>>
>>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> On 29 Nov 2018, at 17:01, Mark Martinec <Mark.Martinec+freebsd@ijs.si> wrote:
>>>>>>> After successfully upgrading three hosts from 11.2-p4 to
>>>>>>> 12.0-RC2 (amd64,
>>>>>>> zfs, bios), I tried my luck with one of our production
>>>>>>> hosts, and ended up
>>>>>>> with a stuck loader after rebooting with a new kernel
>>>>>>> (after the first
>>>>>>> stage of upgrade).
>>>>>>> These were the steps, and all went smoothly and normally
>>>>>>> until a reboot:
>>>>>>> freebsd-update upgrade -r 12.0-RC2
>>>>>>> freebsd-update install
>>>>>>> shutdown -r now
>>>>>>> While booting, the 'BTX loader' comes up, lists the BIOS
>>>>>>> drives,
>>>>>>> then the spinner below the list comes up and begins
>>>>>>> turning,
>>>>>>> stuttering, and after a couple of seconds it grinds to a
>>>>>>> standstill
>>>>>>> and nothing happens afterwards.
>>>>>>> At this point the ZFS and the bootstrap loader is supposed
>>>>>>> to
>>>>>>> come up, but it doesn't.
>>>>>>> This host has two zfs pools, the system pool consists of
>>>>>>> two SSDs
>>>>>>> in a zfs mirror (also holding a freebsd-boot partition
>>>>>>> each), the
>>>>>>> other pool is a raidz2 with six JBOD disks on an LSI
>>>>>>> controller.
>>>>>>> The gptzfsboot in both freebsd-boot partitions is fresh
>>>>>>> from 11.2,
>>>>>>> both zpool versions are up-to-date with 11.2. The 'zpool
>>>>>>> status -v'
>>>>>>> is happy with both pools.
>>>>>>> After rebooting from a USB drive and reverting the /boot
>>>>>>> directory
>>>>>>> to a previous version, the machine comes up normally again
>>>>>>> with the 11.2-RELEASE-p4.
>>>>>>> I found a file init.core in the / directory, slightly
>>>>>>> predating the
>>>>>>> last reboot with a salvaged system - although it was
>>>>>>> probably not
>>>>>>> a cause of the problem, but a consequence of the rescue
>>>>>>> operation.
>>>>>>> It is unfortunate that this is a production host, so I
>>>>>>> can't play
>>>>>>> much with it. One or two more quick experiments I can
>>>>>>> probably
>>>>>>> afford, but not much more. Should I just first wait for the
>>>>>>> official 12.0 release? Should I try booting with a 12.0 on
>>>>>>> USB
>>>>>>> and try to import pools? Suggestions welcome.
>>>>>>> Now that the /boot has been manually restored to the 11.2
>>>>>>> state,
>>>>>>> A SECOND QUESTION is about freebsd-update, which still
>>>>>>> thinks we are
>>>>>>> in the middle of an upgrade procedure. Trying now to just
>>>>>>> update
>>>>>>> the 11.2-RELEASE-p4 to 11.2-RELEASE-p5, the fetch
>>>>>>> complains:
>>>>>>> # uname -a
>>>>>>> FreeBSD xxx 11.2-RELEASE-p4 FreeBSD 11.2-RELEASE-p4
>>>>>>> #
>>>>>>> # freebsd-version
>>>>>>> 11.2-RELEASE-p4
>>>>>>> #
>>>>>>> # freebsd-update fetch
>>>>>>> src component not installed, skipped
>>>>>>> You have a partially completed upgrade pending
>>>>>>> Run '/usr/sbin/freebsd-update install' first.
>>>>>>> Run '/usr/sbin/freebsd-update fetch -F' to proceed anyway.
>>>>>>> So what is the right way to get rid of all traces of the
>>>>>>> unsuccessful upgrade, and let freebsd-update believe we are
>>>>>>> cleanly
>>>>>>> at 11.2-p4 ?  Removing /var/db/freebsd-update did not help.
>>>>>>> Mark
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"




