From: Mark Millard
Date: Sun, 18 Feb 2018 11:48:05 -0800
Subject: Re: amd64 head -r329465 (non-debug build, but with symbols): "panic: spin lock held too long" during make check-old, reported during a sys_vfork
Cc: FreeBSD Current, FreeBSD Hackers
To: Mateusz Guzik
Message-Id: <7914AB08-AC79-4D83-BBD7-CE8B78070624@yahoo.com>

On 2018-Feb-18, at 10:08 AM, Mateusz Guzik wrote:

> Can you please bisect this? There is another report stating that r329418 works fine.

I saw that Trond indicated an intent to test -r329418 but I've not
seen any reports about -r329418 or how much activity was used to
make any judgment about its status. But I can assume -r329418 is
good if you want.

Bisecting is likely going to be problematical for self-updates:
builds and installs and such can crash, making the installs risky.
I do not have an alternate builder for amd64 set up.

Even without that, it is not clear how many hours of build-related
activity it takes to have a high probability that the problem is gone.
(I've seen widely variable amounts of activity between failures in
-r329465 .) It is obvious to try an earlier version after failure
but not obvious when to try a later version.

My FreeBSD time is also rather limited (compared to historically over
the last few years), so the activity could be spread over parts of
various weekends, depending on how it goes.

>> On Sun, Feb 18, 2018 at 6:35 PM, Mark Millard wrote:
>>
>> On 2018-Feb-17, at 6:10 PM, Mark Millard wrote:
>>
>> > [Some more information added, from /usr/libexec/kgdb use.]
>> >
>> > On 2018-Feb-17, at 5:39 PM, Mark Millard wrote:
>> >
>> >> This is for FreeBSD running under Hyper-V on a Windows 10 Pro machine.
>> >> The FreeBSD "disk" bindings are to SSDs, not the insides of NTFS files.
>> >> 29 logical processors assigned to FreeBSD (on a 32-thread Ryzen
>> >> Threadripper 1950X). No other Hyper-V use.
>>
>> Trond's report seems to be for a "4 core" Intel i7 context (as seen
>> by FreeBSD in virtual box). So Ryzen seems to be non-essential for
>> reproduction.
>>
>> Both of our reports are from some form of using FreeBSD in a virtual
>> machine (Hyper-V and VirtualBox). I do not know if that is a required
>> type of context or not.
>>
>> >> This happened during:
>> >>
>> >> # ~/sys_build_scripts.amd64-host/make_powerpc64vtsc_nodebug_clang_altbinutils-amd64-host.sh check-old DESTDIR=/usr/obj/DESTDIRs/clang-powerpc64-installworld_altbinutils
>> >> Script started, output file is /root/sys_typescripts/typescript_make_powerpc64vtsc_nodebug_clang_altbinutils-amd64-host-2018-02-17:15:56:20
>> >> >>> Checking for old files
>> >>
>>
>> I got another example but during a buildworld:
>>
>> >>> Deleting stale files in build tree...
>> cd /usr/src; MACHINE_ARCH=powerpc64 MACHINE=powerpc CPUTYPE= BUILD_TOOLS_META=.NOMETA CC="cc -target powerpc64-unknown-freebsd12.0 --sysroot=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -B/usr/local/powerpc64-unknown-freebsd12.0/bin/" CXX="c++ -target powerpc64-unknown-freebsd12.0 --sysroot=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -B/usr/local/powerpc64-unknown-freebsd12.0/bin/" CPP="cpp -target powerpc64-unknown-freebsd12.0 --sysroot=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -B/usr/local/powerpc64-unknown-freebsd12.0/bin/" AS="/usr/local/powerpc64-unknown-freebsd12.0/bin/as" AR="/usr/local/powerpc64-unknown-freebsd12.0/bin/ar" LD="/usr/local/powerpc64-unknown-freebsd12.0/bin/ld" LLVM_LINK="" NM=/usr/local/powerpc64-unknown-freebsd12.0/bin/nm OBJCOPY="/usr/local/powerpc64-unknown-freebsd12.0/bin/objcopy" RANLIB=/usr/local/powerpc64-unknown-freebsd12.0/bin/ranlib STRINGS=/usr/local/bin/powerpc64-unknown-freebsd12.0-strings SIZE="/usr/local/powerpc64-unknown-freebsd12.0/bin/size" INSTALL="sh /usr/src/tools/install.sh" PATH=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/legacy/usr/sbin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/legacy/usr/bin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/legacy/bin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/usr/sbin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin SYSROOT=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp make -f Makefile.inc1 BWPHASE=worldtmp DESTDIR=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -DBATCH_DELETE_OLD_FILES delete-old delete-old-libs >/dev/null
>>
>> load: 0.68 cmd: make 62180 [select] 25.15r 0.00u 0.00s 0% 1468k
>> make: Working in: /usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64
>> packet_write_wait: Connection to 192.168.1.165 port 22: Broken pipe
>>
>>
>> (I noticed the long pause and got the ^T in before the panic.)
>>
>> Yet again it is xargs related fork activity that gets the problem (from core.txt.1 ):
>>
>> 561 Thread 100836 (PID=69982: xargs) fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:840
>> . . .
>> * 559 Thread 100811 (PID=62304: xargs) doadump (textdump=-2122191464) at pcpu.h:230
>>
>> spin lock 0xffffffff81b3cf00 (sched lock 24) held by 0xfffff806aa6d5000 (tid 100836) too long
>> panic: spin lock held too long
>> cpuid = 24
>> time = 1518974055
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00f11304d0
>> vpanic() at vpanic+0x18d/frame 0xfffffe00f1130530
>> panic() at panic+0x43/frame 0xfffffe00f1130590
>> _mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x71/frame 0xfffffe00f11305a0
>> thread_lock_flags_() at thread_lock_flags_+0xdb/frame 0xfffffe00f1130610
>> statclock_cnt() at statclock_cnt+0xdc/frame 0xfffffe00f1130650
>> handleevents() at handleevents+0x113/frame 0xfffffe00f11306a0
>> timercb() at timercb+0xa9/frame 0xfffffe00f11306f0
>> lapic_handle_timer() at lapic_handle_timer+0xa7/frame 0xfffffe00f1130730
>> timerint_u() at timerint_u+0x96/frame 0xfffffe00f1130810
>> thread_lock_flags_() at thread_lock_flags_+0xc1/frame 0xfffffe00f1130880
>> fork1() at fork1+0x1b9f/frame 0xfffffe00f1130930
>> sys_vfork() at sys_vfork+0x4c/frame 0xfffffe00f1130980
>> amd64_syscall() at amd64_syscall+0xa48/frame 0xfffffe00f1130ab0
>> fast_syscall_common() at fast_syscall_common+0x101/frame 0x7fffffffc5a0
>>
>

===
Mark Millard
marklmi at yahoo.com
( markmi at dsl-only.net is going away in 2018-Feb, late)
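
[Illustrative note on the "how many hours before trying a later version"
question raised above. This is only a rough sketch under assumptions that
the thread does not establish: failures are modeled as a Poisson process
with a fixed mean time between failures, which the "widely variable"
observations may well contradict, and the 4-hour figure below is a
made-up placeholder, not a measurement. The function names are
hypothetical. The revision range uses -r329418 (assumed good) and
-r329465 (known bad), and the step count is an upper bound because not
every revision number in that range need touch the relevant code.]

```python
import math


def hours_without_failure(mtbf_hours, alpha):
    """Hours of failure-free load after which a still-broken revision
    would have panicked with probability >= 1 - alpha.

    Assumes exponentially distributed time to failure with the given
    mean (an assumption, not something measured in this thread):
    P(no failure in t hours | bad) = exp(-t / mtbf) <= alpha.
    """
    if mtbf_hours <= 0 or not 0 < alpha < 1:
        raise ValueError("need mtbf_hours > 0 and 0 < alpha < 1")
    return mtbf_hours * math.log(1.0 / alpha)


def bisect_steps(good_rev, bad_rev):
    """Worst-case number of revisions to test when bisecting between a
    known-good and a known-bad revision number."""
    candidates = bad_rev - good_rev  # first bad is one of good+1 .. bad
    if candidates < 1:
        raise ValueError("bad_rev must be later than good_rev")
    return math.ceil(math.log2(candidates))


if __name__ == "__main__":
    mtbf = 4.0  # hypothetical mean hours of build load between panics
    per_step = hours_without_failure(mtbf, alpha=0.05)
    steps = bisect_steps(329418, 329465)
    print(f"run each candidate about {per_step:.1f} h before calling it good")
    print(f"at most {steps} candidates, about {steps * per_step:.0f} h in total")
```

[A shorter mean time between failures or a looser confidence level
shrinks the per-step run time proportionally. A "good" verdict from such
a run is still only probabilistic, so a failure on a revision previously
judged good means backing up rather than continuing forward.]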