Date: Tue, 21 Feb 2017 00:35:37 +0100
From: Mateusz Guzik
To: Mark Millard
Cc: Justin Hibbits, mjg@freebsd.org, FreeBSD Current, svn-src-head@freebsd.org, FreeBSD PowerPC ML
Subject: Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]
Message-ID: <20170220233537.GB26759@dft-labs.eu>
References: <2FD12B8F-2255-470A-98D4-2DCE9C7495F5@dsl-only.net> <20170220191044.GA8526@dft-labs.eu> <83428304-87BE-413C-BAB9-8FF218E7661C@dsl-only.net>

On Mon, Feb 20, 2017 at 03:10:44PM -0800, Mark Millard wrote:
> On 2017-Feb-20, at 2:58 PM, Mark Millard wrote:
>
> > On 2017-Feb-20, at 11:10 AM, Mateusz Guzik wrote:
> >
> >> On Sat, Feb 18, 2017 at 04:18:05AM -0800, Mark Millard wrote:
> >>> [Note: I experiment with clang based powerpc64 builds,
> >>> reporting problems that I find. Justin is familiar
> >>> with this, as is Nathan.]
> >>>
> >>> I tried to update the PowerMac G5 (a so-called "Quad Core")
> >>> that I have access to from head -r312761 to -r313864 and
> >>> ended up with random panics and hang ups in fairly short
> >>> order after booting.
> >>>
> >>> Some approximate bisecting for the kernel lead to:
> >>> (sometimes getting part way into a buildkernel attempt
> >>> for a different version before a failure happens)
> >>>
> >>> -r313266: works (just before use of atomic_fcmpset)
> >>> vs.
> >>> -r313271: fails (last of the "use atomic_fcmpset" check-ins)
> >>>
> >>> (I did not try -r313268 through -r313270 as the use was
> >>> gradually added.)
> >>>
> >>> So I'm currently running a -r313864 world with a -r313266
> >>> kernel.
> >>>
> >>> No kernel that I tried that was from before -r313266 had the
> >>> problems.
> >>>
> >>> Any kernel that I tried that was from after -r313271 had the
> >>> problems.
> >>>
> >>> Of course I did not try them all in other direction. :)
> >>>
> >>
> >> I found that spin mutexes were not properly handling this, fixed in
> >> r313996.
> >>
> >> Locally I added a if (cpu_tick() % 2) return (0); snipped to amd64
> >> fcmpset to simulate failures. Everything works, while it would easily
> >> fail without the patch.
> >>
> >> That said, I hope this concludes the 'missing check for not-reread value
> >> of failed fcmpset' saga.
> >>
> >> --
> >> Mateusz Guzik
> >
> > I tried to update from -r313864 to -r313999 in my amd64 context
> > (a VirtualBox machine under macOS) but it now crashes late in
> > the boot sequence (after it processes a dump if I make one but
> > before I can log in).
> >
> > This update was via my usual explicit svnlite update; buildworld
> > buildkernel; etc. production style build of world and kernel,
> > including use of MALLOC_PRODUCTION.
> >
> > The window shows:
> >
> > _vm_map_lock+0xf
> > vm_map_wire+0x32
> > rtROMemObjNativeLockInMap+0x8c
> > rtROMemObjNativeLockUser+0x51
> > RTR0MemObjLockUserTag+0x231
> > vbglR0HGCMInternalPreprocessCall+0x65d
> > vbglR0HGCMInternalCall+0x17c
> > vgdrvIoCtl_HGCMCall+0x43f
> > VGDrvCommonIoCtl+0x261
> > vgdrvFreeBSDIOCtl+0x2cd
> > devfs_ioctl+0xae
> > VOP_IOCTL_APV+0x88
> > vn_ioctl+0x161
> > devfs_ioctl_f+0x1f
> > kern_ioctl+0x280
> > sys_ioctl+0x13f
> > amd64_syscall+0x397
> > Xfast_syscall+0xfb
>
> More detail from booting with the -r313864 kernel.old
> and using kgdb on what the dump produced:
>
> # kgdb kernel.debug /var/crash/vmcore.
> /var/crash/vmcore.0 /var/crash/vmcore.last
> # kgdb kernel.debug /var/crash/vmcore.0
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
> <118>Starting vboxservice.
> <118>VBoxService 5.1.14 r112924 (verbosity: 0) freebsd.amd64 (Jan 20 2017 18:37:45) release log
> <118>00:00:00.000120 main Log opened 2017-02-20T22:38:46.348080000Z
> <118>00:00:00.000162 main OS Product: FreeBSD
> <118>00:00:00.000171 main OS Release: 12.0-CURRENT
> <118>00:00:00.000180 main OS Version: FreeBSD 12.0-CURRENT r313999M
> <118>00:00:00.000192 main Executable: /usr/local/sbin/VBoxService
> <118>00:00:00.000194 main Process ID: 609
> <118>00:00:00.000196 main Package type: BSD_64BITS_GENERIC (OSE)
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 02
> fault virtual address = 0xd6
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80d4ebaf
> stack pointer = 0x28:0xfffffe0122e2bef0
> frame pointer = 0x28:0xfffffe0122e2bf00
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 609 (VBoxService)
>
> #9 0xffffffff80eb6be1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
> #10 0xffffffff80d4ebaf in _vm_map_lock (map=0x1, file=0x0, line=0) at /usr/src/sys/vm/vm_map.c:501

The function is:

void
_vm_map_lock(vm_map_t map, const char *file, int line)
{

        if (map->system_map)
                mtx_lock_flags_(&map->system_mtx, 0, file, line);
        else
                sx_xlock_(&map->lock, file, line);
        map->timestamp++;
}

system_map is at offset 0xd5, so a faulting address of 0xd6 with a map
address of 1 suggests the backtrace is correct.

This in turn suggests the bug is unrelated to my changes, and there is a
chance there is no bug in the first place.

Please make sure that the virtualbox module is recompiled against the
proper source tree.

If the problem persists, please bisect. The range is not big. Offhand I
don't see what could cause the failure in question (and chances are there
is no bug if the KBI changed and the module was not recompiled).
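
The address arithmetic can be checked with a small C sketch. Everything in it is an assumption made for illustration except the two numbers quoted in this message (offset 0xd5 for system_map, map=0x1 in frame #10): struct fake_vm_map and system_map_fault_addr() are hypothetical names, and the padding merely stands in for whatever really precedes system_map in struct vm_map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * HYPOTHETICAL stand-in for struct vm_map. Only the offset of system_map
 * (0xd5) comes from the message; the padding is invented to reproduce it.
 */
struct fake_vm_map {
	unsigned char pad[0xd5];	/* placeholder for the preceding members */
	unsigned char system_map;	/* first member read by _vm_map_lock() */
};

/* Fail at compile time if the stand-in layout does not match the message. */
static_assert(offsetof(struct fake_vm_map, system_map) == 0xd5,
    "system_map is expected at offset 0xd5");

/* Address that a read of map->system_map touches for a given map pointer. */
static uintptr_t
system_map_fault_addr(uintptr_t map)
{

	return (map + offsetof(struct fake_vm_map, system_map));
}
```

Calling system_map_fault_addr(1) yields 0xd6, the fault virtual address in the trap message, which is why a bogus map pointer of 1 is consistent with the reported fault.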
> #11 0xffffffff80d51ea2 in vm_map_wire (map=, start=4534272, end=4538368, flags=1) at /usr/src/sys/vm/vm_map.c:2534
> #12 0xffffffff8265291c in rtR0MemObjNativeLockInMap () from /boot/modules/vboxguest.ko
> #13 0xffffffff82652881 in rtR0MemObjNativeLockUser () from /boot/modules/vboxguest.ko
> #14 0xffffffff8264ec01 in RTR0MemObjLockUserTag () from /boot/modules/vboxguest.ko
> #15 0xffffffff82624afd in vbglR0HGCMInternalPreprocessCall () from /boot/modules/vboxguest.ko
> #16 0xffffffff8262411a in VbglR0HGCMInternalCall () from /boot/modules/vboxguest.ko
> #17 0xffffffff8261ec4f in vgdrvIoCtl_HGCMCall () from /boot/modules/vboxguest.ko
> #18 0xffffffff8261d221 in VGDrvCommonIoCtl () from /boot/modules/vboxguest.ko
>

-- 
Mateusz Guzik