Date: Sun, 18 Feb 2018 11:48:05 -0800
From: Mark Millard <marklmi26-fbsd@yahoo.com>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: FreeBSD Current <freebsd-current@freebsd.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: amd64 head -r329465 (non-debug build, but with symbols): "panic: spin lock held too long" during make check-old, reported during a sys_vfork
Message-ID: <7914AB08-AC79-4D83-BBD7-CE8B78070624@yahoo.com>
In-Reply-To: <CAGudoHH5Yz6312QSADiOVx9kd17=WcatEB3qyNjTa5qh_hXASg@mail.gmail.com>
References: <DA76F62D-3373-47CA-AD95-DE9BA580772B@yahoo.com> <6907E068-C80A-44B8-A8AD-3EF27D52D127@yahoo.com> <20832C61-AA5D-41A6-8BF9-90CC87D17219@yahoo.com> <CAGudoHH5Yz6312QSADiOVx9kd17=WcatEB3qyNjTa5qh_hXASg@mail.gmail.com>

On 2018-Feb-18, at 10:08 AM, Mateusz Guzik <mjguzik at gmail.com> wrote:

> Can you please bisect this? There is another report stating that
> r329418 works fine.

I saw that Trond indicated an intent to test -r329418 but I've not
seen any reports about -r329418 or how much activity was used to
make any judgment about its status. But I can assume -r329418 is
good if you want.

Bisecting is likely going to be problematical for self-updates:
builds and installs and such can crash, making the installs risky.
I do not have an alternate builder for amd64 set up.

Even without that, it is not clear how many hours of build-related
activity it takes to have a high probability that the problem is
gone. (I've seen widely variable amounts of activity between
failures in -r329465.) It is obvious to try an earlier version
after failure but not obvious when to try a later version.

My FreeBSD time is also rather limited (compared to historically
over the last few years), so the activity could be spread over
parts of various weekends, depending on how it goes.

>> On Sun, Feb 18, 2018 at 6:35 PM, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>>
>> On 2018-Feb-17, at 6:10 PM, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>>
>> > [Some more information added, from /usr/libexec/kgdb use.]
>> >
>> > On 2018-Feb-17, at 5:39 PM, Mark Millard <marklmi26-fbsd at yahoo.com> wrote:
>> >
>> >> This is for FreeBSD running under Hyper-V on a Windows 10 Pro machine.
>> >> The FreeBSD "disk" bindings are to SSDs, not the insides of NTFS files.
>> >> 29 logical processors assigned to FreeBSD (on a 32-thread Ryzen
>> >> Threadripper 1950X). No other Hyper-V use.
>>
>> Trond's report seems to be for a "4 core" Intel i7 context (as seen
>> by FreeBSD in virtual box). So Ryzen seems to be non-essential for
>> reproduction.
>>
>> Both of our reports are from some form of using FreeBSD in a virtual
>> machine (Hyper-V and VirtualBox).
>> I do not know if that is a required
>> type of context or not.
>>
>> >> This happened during:
>> >>
>> >> # ~/sys_build_scripts.amd64-host/make_powerpc64vtsc_nodebug_clang_altbinutils-amd64-host.sh check-old DESTDIR=/usr/obj/DESTDIRs/clang-powerpc64-installworld_altbinutils
>> >> Script started, output file is /root/sys_typescripts/typescript_make_powerpc64vtsc_nodebug_clang_altbinutils-amd64-host-2018-02-17:15:56:20
>> >> >>> Checking for old files
>> >>
>>
>> I got another example but during a buildworld:
>>
>> >>> Deleting stale files in build tree...
>> cd /usr/src; MACHINE_ARCH=powerpc64 MACHINE=powerpc CPUTYPE= BUILD_TOOLS_META=.NOMETA
>>   CC="cc -target powerpc64-unknown-freebsd12.0 --sysroot=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -B/usr/local/powerpc64-unknown-freebsd12.0/bin/"
>>   CXX="c++ -target powerpc64-unknown-freebsd12.0 --sysroot=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -B/usr/local/powerpc64-unknown-freebsd12.0/bin/"
>>   CPP="cpp -target powerpc64-unknown-freebsd12.0 --sysroot=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -B/usr/local/powerpc64-unknown-freebsd12.0/bin/"
>>   AS="/usr/local/powerpc64-unknown-freebsd12.0/bin/as"
>>   AR="/usr/local/powerpc64-unknown-freebsd12.0/bin/ar"
>>   LD="/usr/local/powerpc64-unknown-freebsd12.0/bin/ld"
>>   LLVM_LINK=""
>>   NM=/usr/local/powerpc64-unknown-freebsd12.0/bin/nm
>>   OBJCOPY="/usr/local/powerpc64-unknown-freebsd12.0/bin/objcopy"
>>   RANLIB=/usr/local/powerpc64-unknown-freebsd12.0/bin/ranlib
>>   STRINGS=/usr/local/bin/powerpc64-unknown-freebsd12.0-strings
>>   SIZE="/usr/local/powerpc64-unknown-freebsd12.0/bin/size"
>>   INSTALL="sh /usr/src/tools/install.sh"
>>   PATH=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/legacy/usr/sbin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/legacy/usr/bin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/legacy/bin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/usr/sbin:/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/usr/bin:/sbin:/bin:/usr/sbin:/usr/bin
>>   SYSROOT=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp
>>   make -f Makefile.inc1 BWPHASE=worldtmp DESTDIR=/usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp -DBATCH_DELETE_OLD_FILES delete-old delete-old-libs >/dev/null
>>
>> load: 0.68 cmd: make 62180 [select] 25.15r 0.00u 0.00s 0% 1468k
>> make: Working in: /usr/obj/powerpc64vtsc_clang_altbinutils/powerpc.powerpc64/usr/src/powerpc.powerpc64
>> packet_write_wait: Connection to 192.168.1.165 port 22: Broken pipe
>>
>>
>> (I noticed the long pause and got the ^T in before the panic.)
>>
>> Yet again it is xargs related fork activity that gets the problem (from core.txt.1):
>>
>> 561 Thread 100836 (PID=69982: xargs) fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:840
>> . . .
>> * 559 Thread 100811 (PID=62304: xargs) doadump (textdump=-2122191464) at pcpu.h:230
>>
>> spin lock 0xffffffff81b3cf00 (sched lock 24) held by 0xfffff806aa6d5000 (tid 100836) too long
>> panic: spin lock held too long
>> cpuid = 24
>> time = 1518974055
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00f11304d0
>> vpanic() at vpanic+0x18d/frame 0xfffffe00f1130530
>> panic() at panic+0x43/frame 0xfffffe00f1130590
>> _mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x71/frame 0xfffffe00f11305a0
>> thread_lock_flags_() at thread_lock_flags_+0xdb/frame 0xfffffe00f1130610
>> statclock_cnt() at statclock_cnt+0xdc/frame 0xfffffe00f1130650
>> handleevents() at handleevents+0x113/frame 0xfffffe00f11306a0
>> timercb() at timercb+0xa9/frame 0xfffffe00f11306f0
>> lapic_handle_timer() at lapic_handle_timer+0xa7/frame 0xfffffe00f1130730
>> timerint_u() at timerint_u+0x96/frame 0xfffffe00f1130810
>> thread_lock_flags_() at thread_lock_flags_+0xc1/frame 0xfffffe00f1130880
>> fork1() at fork1+0x1b9f/frame 0xfffffe00f1130930
>> sys_vfork() at sys_vfork+0x4c/frame 0xfffffe00f1130980
>> amd64_syscall() at amd64_syscall+0xa48/frame 0xfffffe00f1130ab0
>> fast_syscall_common() at fast_syscall_common+0x101/frame 0x7fffffffc5a0
>>
>

===
Mark Millard
marklmi at yahoo.com
( markmi at dsl-only.net is
going away in 2018-Feb, late)
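
A rough sketch of the bisection cost discussed in this message, under stated assumptions: -r329418 is taken as good (not confirmed in this thread), -r329465 is bad, and panics are modeled as arriving at a constant average rate. That exponential model is an assumption, not something the thread establishes; the observed intervals between failures were widely variable. The function names and the 4-hour mean used in the example are hypothetical placeholders, not measurements.

```python
import math


def bisect_steps(good_rev: int, bad_rev: int) -> int:
    """Worst-case number of revisions to test to find the first bad one.

    Candidates for "first bad" are good_rev+1 .. bad_rev. Repository-wide
    revision numbers may include commits that do not touch head, so this
    is an upper bound.
    """
    candidates = bad_rev - good_rev
    if candidates < 1:
        raise ValueError("bad_rev must be later than good_rev")
    # ceil(log2(candidates)) computed with integers only
    return (candidates - 1).bit_length()


def hours_for_confidence(mean_hours_between_failures: float,
                         confidence: float) -> float:
    """Failure-free test hours needed before calling a revision good.

    ASSUMPTION: failures follow an exponential distribution, so the chance
    that a bad revision survives t hours without a panic is exp(-t / mean).
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -mean_hours_between_failures * math.log(1.0 - confidence)


if __name__ == "__main__":
    steps = bisect_steps(329418, 329465)
    print("worst-case revisions to test:", steps)

    # Hypothetical mean time between panics; the thread gives no figure.
    assumed_mean_hours = 4.0
    per_step = hours_for_confidence(assumed_mean_hours, 0.95)
    print("failure-free hours per 'good' verdict:", round(per_step, 1))
```

Under this model a "bad" verdict is cheap (one panic settles it), while every "good" verdict costs roughly three times the mean interval between failures for 95% confidence, which is why deciding when to move to a later revision is the expensive part.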