Date: Sat, 31 Mar 2018 12:13:07 -0400
From: Joe Maloney <jmaloney@ixsystems.com>
To: Andrew Reilly <areilly@bigpond.net.au>
Cc: Jonathan Looney <jonlooney@gmail.com>, FreeBSD Current <current@freebsd.org>, Warner Losh <imp@bsdimp.com>, "jtl@freebsd.org" <jtl@freebsd.org>
Subject: Re: 12-Current panics on boot (didn't a week ago.)
Message-ID: <CAFvkmYPpxwomrYD_OG1-K-dcYnNLO6g-10OoYZK3jyDdzpF4Jg@mail.gmail.com>
In-Reply-To: <20180331002746.GA2466@Zen.ac-r.nu>
References: <20180324035653.GA3411@Zen.ac-r.nu> <CANCZdfozmyxC5MuNS8Tu_LD1bbAYNTnTcPe52-6sz9KPCQou_Q@mail.gmail.com> <B70A5BB4-CC2B-4503-8998-2A360D24E0CF@bigpond.net.au> <20180324232206.GA2457@Zen.ac-r.nu> <CANCZdfovA1MiWhp6ueSaWTCtKv31wHW6B5-pS2rCLmspDuHeTw@mail.gmail.com> <20180325032110.GA10881@Zen.ac-r.nu> <CADrOrmstNpFFK%2BoobR5yWELSmTz4_C_CV1JYiVRMpDwZ6cr_yw@mail.gmail.com> <20180331002746.GA2466@Zen.ac-r.nu>

The drm-next-kmod, and drm-stable-kmod modules panic for me. I will attach logs when I can.

On Friday, March 30, 2018, Andrew Reilly <areilly@bigpond.net.au> wrote:

> Hi Jonathan, all,
>
> I've just compiled and booted a kernel derived from current-GENERIC
> but with nooptions TCP_BLACKBOX, and much to my surprise it boots.
> Possible link to network-related activities is that the next line
> of boot output that was not being displayed during the crash is:
>
> [ath_hal] loaded
>
> That's vaguely network-shaped: could it be an issue?
>
> Please let me know if there's anything else that I could test or
> poke, in order to find the real culprit.
>
> My make.conf says:
>
> KERNCONF=ZEN
> WRKDIRPREFIX=/usr/obj/ports
> MALLOC_PRODUCTION=yes
>
> My /usr/src/sys/amd64/conf/ZEN says:
>
> include GENERIC
> nooptions TCP_BLACKBOX
>
> Uname -a says:
> FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r331768M: Sat
> Mar 31 10:47:52 AEDT 2018 root@Zen:/usr/obj/usr/src/amd64.amd64/sys/ZEN
> amd64
>
> Cheers,
>
> Andrew
>
>
> Here's the top part of the new dmesg.boot, FYI:
> Copyright (c) 1992-2018 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 12.0-CURRENT #0 r331768M: Sat Mar 31 10:47:52 AEDT 2018
> root@Zen:/usr/obj/usr/src/amd64.amd64/sys/ZEN amd64
> FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 6.0.0)
> WARNING: WITNESS option enabled, expect reduced performance.
> VT(vga): resolution 640x480
> CPU: AMD Ryzen 7 1700 Eight-Core Processor (2994.45-MHz K8-class CPU)
> Origin="AuthenticAMD" Id=0x800f11 Family=0x17 Model=0x1 Stepping=1
> Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
> Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
> AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
> AMD Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
> Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
> XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
> AMD Extended Feature Extensions ID EBX=0x7<CLZERO,IRPerf,XSaveErPtr>
> SVM: (disabled in BIOS) NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
> TSC: P-state invariant, performance statistics
> real memory = 34359738368 (32768 MB)
> avail memory = 33271214080 (31729 MB)
> Event timer "LAPIC" quality 600
> ACPI APIC Table: <ALASKA A M I >
> FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
> FreeBSD/SMP: 1 package(s) x 2 cache groups x 4 core(s)
> random: unblocking device.
> Firmware Warning (ACPI): Optional FADT field Pm2ControlBlock has valid Length but zero Address: 0x0000000000000000/0x1 (20180313/tbfadt-796)
> ioapic0 <Version 2.1> irqs 0-23 on motherboard
> ioapic1 <Version 2.1> irqs 24-55 on motherboard
> SMP: AP CPU #7 Launched!
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #6 Launched!
> SMP: AP CPU #5 Launched!
> SMP: AP CPU #4 Launched!
> SMP: AP CPU #1 Launched!
> Timecounter "TSC-low" frequency 1497224985 Hz quality 1000
> random: entropy device external interface
> [ath_hal] loaded
> module_register_init: MOD_LOAD (vesa, 0xffffffff8109f600, 0) error 19
> random: registering fast source Intel Secure Key RNG
> random: fast provider: "Intel Secure Key RNG"
> kbd1 at kbdmux0
> netmap: loaded module
> nexus0
> vtvga0: <VT VGA driver> on motherboard
> cryptosoft0: <software crypto> on motherboard
> aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM,SHA1,SHA256> on motherboard
> acpi0: <ALASKA A M I > on motherboard
> acpi0: Power Button (fixed)
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> cpu2: <ACPI CPU> on acpi0
> cpu3: <ACPI CPU> on acpi0
> cpu4: <ACPI CPU> on acpi0
> cpu5: <ACPI CPU> on acpi0
> cpu6: <ACPI CPU> on acpi0
> cpu7: <ACPI CPU> on acpi0
> attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Event timer "i8254" frequency 1193182 Hz quality 100
> atrtc0: <AT realtime clock> port 0x70-0x71 on acpi0
> atrtc0: registered as a time-of-day clock, resolution 1.000000s
> Event timer "RTC" frequency 32768 Hz quality 0
> hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 950
> Event timer "HPET" frequency 14318180 Hz quality 350
> Event timer "HPET1" frequency 14318180 Hz quality 350
> Event timer "HPET2" frequency 14318180 Hz quality 350
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
> acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> amdsmn0: <AMD Family 17h System Management Network> on hostb0
> amdtemp0: <AMD CPU On-Die Thermal Sensors> on hostb0
>
>
> On Sun, Mar 25, 2018 at 04:35:31AM +0000, Jonathan Looney wrote:
> > For now, you can update through r331485 and then take TCP_BLACKBOX out of
> > your kernel config file.
> > That won’t really “fix” anything, but should at
> > least get you a booting system (assuming the new code from r331347 is
> > really triggering a problem).
> >
> >
> > I’ll take another look to see if I missed something in the commit. But, at
> > the moment, I’m hard-pressed to see how r331347 would cause the problem you
> > describe.
> >
> >
> > Jonathan
> >
> > On Sat, Mar 24, 2018 at 9:17 PM Andrew Reilly <areilly@bigpond.net.au>
> > wrote:
> >
> > > OK, I've completed the search: r331346 works, r331347 panics
> > > somewhere in the initialization of random.
> > >
> > > In the 331347 change (Add the "TCP Blackbox Recorder") I can't see
> > > anything obvious to tweak, unfortunately. It's a fair chunk of new
> > > code but it's all network-stack related, and my kernel is panicking
> > > long before any network activity happens.
> > >
> > > Any suggestions?
> > >
> > > Cheers,
> > >
> > > Andrew
> > >
> > > On Sat, Mar 24, 2018 at 05:23:18PM -0600, Warner Losh wrote:
> > > > Thanks Andrew... I can't recreate this on my VM nor my real hardware.
> > > >
> > > > Warner
> > > >
> > > > On Sat, Mar 24, 2018 at 5:22 PM, Andrew Reilly <areilly@bigpond.net.au>
> > > > wrote:
> > > >
> > > > > So, r331464 crashes in the same place, on my system. r331064 still boots
> > > > > OK. I'll keep searching.
> > > > >
> > > > > One week ago there was a change to randomdev to poll for signals every so
> > > > > often, as a defence against very large reads. That wouldn't have
> > > > > introduced a race somewhere,
> > > > > or left things in an unexpected state, perhaps? That change (r331070) by
> > > > > cem@ is just a few revisions after the one that is working for me. I'll
> > > > > start looking there...
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Andrew
> > > > >
> > > > > On Sun, Mar 25, 2018 at 07:49:17AM +1100, Andrew Reilly wrote:
> > > > > > Hi Warner,
> > > > > >
> > > > > > The breakage was in 331470, and at least one version earlier, that I
> > > > > > updated past when it panicked.
> > > > > >
> > > > > > I'm guessing that kdb's inability to dump would be down to it not having
> > > > > > found any disk devices yet, right? So yes, bisecting to narrow down the
> > > > > > issue is probably the best bet. I'll try your r331464: if that works that
> > > > > > leaves only four or five revisions. Of course the breakage could be
> > > > > > hardware specific.
> > > > > >
> > > > > > Cheers,
> > > > > > --
> > > > > > Andrew
> >
> > _______________________________________________
> > freebsd-current@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>

-- 
Joe Maloney
QA Manager / iXsystems
Enterprise Storage & Servers Driven By Open Source