Date: Wed, 28 Jan 2015 16:57:45 -0800
From: Mark Millard <markmi@dsl-only.net>
To: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject: Re: 10.1 powerpc64 kernel build/boot-ability oddity (PowerMac):10.1-RELEASE-p4 boots 10.1-STABLE fails to
Message-ID: <78919D58-B433-404C-ACBD-388EA66B9821@dsl-only.net>
In-Reply-To: <2B4FCA85-6874-41D8-A093-E87EC96CB5FA@dsl-only.net>
References: <2B4FCA85-6874-41D8-A093-E87EC96CB5FA@dsl-only.net>

I was aware of the issue from the page Nathan referenced, but my context is backwards from the expected issue and from Nathan's wording (below):

A) When I do *not* stop it to switch kernels at the loader prompt is when 10.1-STABLE *crashes*. (This is true both with loader.conf having kernel= set to pick out /boot/kernel10.1S/ and with a cp -ax of 10.1-STABLE to /boot/kernel/, with loader.conf defaulted or explicit about /boot/kernel/.)

B) When I *do* stop it and explicitly switch from 10.1-RELENG to 10.1-STABLE at the loader prompt is when it *works* fine for 10.1-STABLE.

So far I've tried all my usual permutations of make.conf and src.conf settings and the behavior is unchanged across the various builds.

I've tried building -r275566, -r276979, and -r277483 of 10.1-STABLE and they all get the same result in my tests.

An interesting point is that across all those 10.1-STABLE builds the following two lines are always the same for the failure (no variation in the address in SRR0 or in SRR1's value):

%SRR0: 00000000.01c277fc
%SRR1: 10000000.00003030

It normally says "Invalid memory address" but occasionally says "Decrementer exception".

I have yet to find a way to build 10.1-STABLE that works for direct booting, but I've had no problems with any 10.1-RELENG (or 10.1-RELEASE) based builds that I've tried.

I'll slowly keep looking into it. (Generally, other things are limiting me to synchronizing world and kernel once in a while for FreeBSD. My time is still mostly going elsewhere.)

> Nathan wrote:
>
> This is a bug in loader, unfortunately. Due to the way that it interacts
> with Open Firmware's memory management, it is not in general possible to
> change kernels at the loader prompt.
> Depending on memory layout,
> sometimes it will work (as you noticed) and sometimes it will enter an
> inconsistent state, usually crashing very early (as you also noticed).
> This is the one "known issue" mentioned on the PowerPC port website at
> http://www.freebsd.org/platforms/ppc.html.
> -Nathan

===
Mark Millard
markmi@dsl-only.net

On 2015-Jan-26, at 03:25 AM, Mark Millard <markmi@dsl-only.net> wrote:

I discovered that I have a 10.1 powerpc64 kernel build/boot-ability oddity (PowerMac). First some context:

The builds are/were done on a PowerMac G5 quad-core.

$ ls -Fpald /boot/kernel*
drwxr-xr-x  2 root  wheel  26624 Jan 19 22:26 /boot/kernel/
drwxr-xr-x  2 root  wheel  26624 Jan 19 22:26 /boot/kernel.old/
drwxr-xr-x  2 root  wheel  26624 Jan 19 22:26 /boot/kernel10.1RE/
drwxr-xr-x  2 root  wheel  26624 Jan 23 23:44 /boot/kernel10.1S/
drwxr-xr-x  2 root  wheel  26624 Jan 25 19:52 /boot/kernel10.1S-alt/

$ freebsd-version -ku
10.1-RELEASE-p4
10.1-STABLE

kernel/, kernel.old/, and kernel10.1RE/ are all copies of each other currently (cp -xa ...). It/they are my build of a variant of 10.1-RELEASE-p4. The other two are builds of variants of 10.1-STABLE kernels (r276979 and r277483 variants).

In this configuration I can boot kernel just fine. I can also stop in Openfirmware and type any of...

boot kernel.old
boot kernel10.1RE
boot kernel10.1S
boot kernel10.1S-alt

and the boot works fine and "uname -a" then agrees with whichever one that I picked. For example boot kernel10.1S-alt results in:

$ uname -a
FreeBSD FBSDG5M1 10.1-STABLE FreeBSD 10.1-STABLE #8 r277483M: Sun Jan 25 19:51:41 PST 2015 root@FBSDG5M1:/usr/obj/usr/src/sys/GENERIC64vtsc powerpc

But if I do either of the following and then try to "shutdown -r now" afterwards I end up with a decrementer error (sometimes) or addressing error (the rest of the time, which is most of the time). This is while openfirmware is still displaying things before I can stop it by typing.
I end up with the options "mac-boot" and "shut-down".

cp -ax /boot/kernel10.1S/ /boot/kernel/
cp -ax /boot/kernel10.1S-alt/ /boot/kernel/

Power-off/power-on gets the same kinds of failures that "shutdown -r now" gets.

(Note: I will focus on kernel10.1S-alt since my source tree has been updated after I built kernel10.1S so it no longer fully matches.)

Booting from a USB stick instead of the SSD (cmd-option-OF, boot ud:2,\ppc\bootinfo.txt) and picking shell, doing an appropriate mount, and then one of

cp -ax /boot/kernel.old/ /boot/kernel/
cp -ax /boot/kernel10.1RE/ /boot/kernel/

and then umount and "shutdown -r now" reboots fine and things are back to normal for future booting.

It seems that 10.1-RELEASE-p4 establishes context for 10.1-STABLE that 10.1-STABLE does not correctly establish for itself -- at least in my builds. But I've no clue what the issue is yet.

Context notes:

I have multiple source trees (with 10.1-STABLE in /usr/src and the other elsewhere). I use "make -j 8 kernel KERNCONF=GENERIC64vtsc INSTKERNNAME=...". (The later svnlite status "?" lines for any extra files are not shown.)

$ svnlite info ~markmi/src_10_1_releng
Path: /home/markmi/src_10_1_releng
Working Copy Root Path: /home/markmi/src_10_1_releng
URL: https://svn0.us-west.freebsd.org/base/releng/10.1
Relative URL: ^/releng/10.1
Repository Root: https://svn0.us-west.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 277195
Node Kind: directory
Schedule: normal
Last Changed Author: delphij
Last Changed Rev: 277195
Last Changed Date: 2015-01-14 13:27:46 -0800 (Wed, 14 Jan 2015)

$ svnlite info /usr/src
Path: /usr/src
Working Copy Root Path: /usr/src
URL: https://svn0.us-west.freebsd.org/base/stable/10
Relative URL: ^/stable/10
Repository Root: https://svn0.us-west.freebsd.org/base
Repository UUID: ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
Revision: 277483
Node Kind: directory
Schedule: normal
Last Changed Author: smh
Last Changed Rev: 277483
Last Changed Date: 2015-01-21 01:45:48 -0800 (Wed, 21 Jan 2015)

$ svnlite status ~markmi/src_10_1_releng
M       /home/markmi/src_10_1_releng/sys/ddb/db_main.c
M       /home/markmi/src_10_1_releng/sys/ddb/db_script.c
M       /home/markmi/src_10_1_releng/sys/powerpc/ofw/ofw_machdep.c
M       /home/markmi/src_10_1_releng/sys/powerpc/ofw/ofwcall64.S
M       /home/markmi/src_10_1_releng/sys/powerpc/powermac/powermac_thermal.c

$ svnlite status /usr/src
M       /usr/src/sys/ddb/db_main.c
M       /usr/src/sys/ddb/db_script.c
M       /usr/src/sys/powerpc/ofw/ofw_machdep.c
M       /usr/src/sys/powerpc/ofw/ofwcall64.S
M       /usr/src/sys/powerpc/powermac/powermac_thermal.c

All of the above except powermac_thermal.c are tied to my trying to produce evidence for later intermittent PowerMac G5 boot issues than what I'm reporting here. I will not get into the details for why but I've set up to use a Justin Hibbits patch for powermac_thermal.c, not that I need it for the PowerMac that I'm using for this note. (I move the same SSD around between machines.)

I used svnlite diff for each of the above to produce .diff files.
Diffing the .diffs and then the original files is shown below (no differences other than the revision numbers in the .diff headers).

$ diff src10.1-RELENG.diff src10.1-STABLE.diff
3c3
< --- sys/ddb/db_main.c (revision 277195)
---
> --- sys/ddb/db_main.c (revision 277483)
27c27
< --- sys/ddb/db_script.c (revision 277195)
---
> --- sys/ddb/db_script.c (revision 277483)
57c57
< --- sys/powerpc/ofw/ofw_machdep.c (revision 277195)
---
> --- sys/powerpc/ofw/ofw_machdep.c (revision 277483)
73c73
< --- sys/powerpc/ofw/ofwcall64.S (revision 277195)
---
> --- sys/powerpc/ofw/ofwcall64.S (revision 277483)
401c401
< --- sys/powerpc/powermac/powermac_thermal.c (revision 277195)
---
> --- sys/powerpc/powermac/powermac_thermal.c (revision 277483)

$ diff ~markmi/src_10_1_releng/sys/ddb/db_main.c /usr/src/sys/ddb/db_main.c
$ diff ~markmi/src_10_1_releng/sys/ddb/db_script.c /usr/src/sys/ddb/db_script.c
$ diff ~markmi/src_10_1_releng/sys/powerpc/ofw/ofw_machdep.c /usr/src/sys/powerpc/ofw/ofw_machdep.c
$ diff ~markmi/src_10_1_releng/sys/powerpc/ofw/ofwcall64.S /usr/src/sys/powerpc/ofw/ofwcall64.S
$ diff ~markmi/src_10_1_releng/sys/powerpc/powermac/powermac_thermal.c /usr/src/sys/powerpc/powermac/powermac_thermal.c

The same variant of GENERIC64 is used for both the source trees: I call it GENERIC64vtsc:

$ more sys/powerpc/conf/GENERIC64vtsc
include         GENERIC64
ident           GENERIC64vtsc

nooptions       PS3                     #Sony Playstation 3               HACK!!! to allow sc

options         DDB                     # HACK!!! to dump early crash info (but 11.0-CURRENT already has it)
options         GDB                     # HACK!!! ...
#options        KTR
#options        KTR_MASK=KTR_TRAP
#options        KTR_CPUMASK=0xF
#options        KTR_VERBOSE

# HACK!!! to allow sc for 2560x1440 display on Radeon X1950 that vt historically mishandled during booting
device          sc
#device         kbdmux                  # HACK: already listed by vt
options         SC_OFWFB                # OFW frame buffer
options         SC_DFLT_FONT            # compile font in
makeoptions     SC_DFLT_FONT=cp437

# Disable extra checking typically used for FreeBSD 11.0-CURRENT:
nooptions       DEADLKRES               #Enable the deadlock resolver
nooptions       INVARIANTS              #Enable calls of extra sanity checking
nooptions       INVARIANT_SUPPORT       #Extra sanity checks of internal structures, required by INVARIANTS
nooptions       WITNESS                 #Enable checks to detect deadlocks and cycles
nooptions       WITNESS_SKIPSPIN        #Don't run witness on spinlocks for speed
nooptions       MALLOC_DEBUG_MAXZONES   # Separate malloc(9) zones

(I'm not referring in this Email to the context that I sometimes use the file content for 11.0-CURRENT. That would be another thing to test but I have not tried to have my 11.0-CURRENT variant as /boot/kernel/ so far. But "boot kernel11C" does work when /boot/kernel/ is based on 10.1-RELEASE-p4.)

$ more /etc/make.conf
WRKDIRPREFIX=/usr/obj/portswork
WITH_DEBUG=
#MALLOC_PRODUCTION=

$ more /etc/src.conf
#WITH_DEBUG_FILES=
#WITHOUT_CLANG=

But ~markmi/src_10_1_releng was built longer ago and had the #'s removed in /etc/src.conf and no MALLOC_PRODUCTION= line in /etc/make.conf at all.

(I'll note that I use WITHOUT_CLANG when I use WITH_DEBUG_FILES because clang fails to fully build otherwise.)

$ more /boot/loader.conf
verbose_loading="YES"
kern.vty=vt

===
Mark Millard
markmi at dsl-only.net
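
Editorial sketch, not part of the original message: the diff of the two .diff files above differs only in the "(revision N)" tags on the header lines. A small helper could strip those tags before comparing, so that matching local modifications produce no output at all. The function names strip_revisions and same_local_changes are hypothetical, and the sketch assumes the usual svn unified-diff header layout ("--- path", whitespace, then the parenthesized tag) and paths without spaces.

```shell
#!/bin/sh
# Hypothetical helpers (assumed names, not from the original message).

# Print a diff file with the trailing "(revision N)" or "(working copy)"
# tag removed from its "---" / "+++" header lines; other lines pass through.
strip_revisions() {
    sed -E 's/^((---|\+\+\+) [^[:space:]]+)[[:space:]]+\((revision [0-9]+|working copy)\)$/\1/' "$1"
}

# Return 0 when two diff files describe the same local changes once the
# revision tags are ignored, 1 when they differ, 2 if temp files fail.
same_local_changes() {
    a=$(mktemp) && b=$(mktemp) || return 2
    strip_revisions "$1" > "$a"
    strip_revisions "$2" > "$b"
    cmp -s "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}
```

With the file names used above, this might be run as: same_local_changes src10.1-RELENG.diff src10.1-STABLE.diff && echo "local modifications match"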