Date: Sat, 27 Apr 2019 17:57:23 +0200
From: Jakob Alvermark <jakob@alvermark.net>
To: Johannes Lundberg <johalun0@gmail.com>, x11@freebsd.org
Cc: Tycho Nightingale <tychon@freebsd.org>
Subject: Re: drm-current-kmod-4.16.g20190424 hangs
Message-ID: <06636dbc-abd0-bf48-798c-74e15096cde1@alvermark.net>
In-Reply-To: <63377546-e82f-3d95-0a4b-0d0d8602d9e4@gmail.com>
References: <5713985b-e97f-c7f2-2592-47a17baf8095@alvermark.net> <f8da5a4b-98c9-417f-dd8b-6bd4b4713d25@gmail.com> <88157456-fe57-997a-14f6-9d1b01c2dbc9@alvermark.net> <63377546-e82f-3d95-0a4b-0d0d8602d9e4@gmail.com>
Hi,

Tried both of them, unsuccessfully.

Jakob

On 2019-04-26 17:23, Johannes Lundberg wrote:
>
> From https://reviews.freebsd.org/D19845 Can you try the sysctls
> suggested there?
>
> In D19845#428854 <https://reviews.freebsd.org/D19845#428854>,
> @tychon <https://reviews.freebsd.org/p/tychon/> wrote:
>
> In D19845#428768 <https://reviews.freebsd.org/D19845#428768>,
> @greg_unrelenting.technology
> <https://reviews.freebsd.org/p/greg_unrelenting.technology/>
> wrote:
>
> Some more i915 GPU testing (w/o the latest update here): after
> using Firefox (opengl layers, xwayland) for some time, GPU
> resets start happening
>
> drmn0: Resetting chip for stuck wait on rcs0
> drmn0: Resetting chip for stuck wait on rcs0
> drmn0: Resetting chip for stuck wait on rcs0
> …
> DMAR0: Fault Overflow
> DMAR0: vgapci0: pci:0:2:0 sid 10 fault acc 0 adt 0x0 reason 0x5 addr 2e09000
> DMAR0: Fault Overflow
> DMAR0: vgapci0: pci:0:2:0 sid 10 fault acc 0 adt 0x0 reason 0x5 addr 2e09000
>
> and eventually the whole system freezes if I don't quit the
> compositor / switch to vt console.
>
> Looks like a symptom of non-translatable physical address. I've
> encountered drivers which need additional work outside of the
> scope of this effort. Perhaps this is the case there as I can't
> any more cases in the Linux KPI where a physical address is
> substituted for a DMA one.
> Also, I assume this is in remap mode. Does it work in identify map
> mode hw.busdma.default="bounce"? Unless there is an API which
> escaped, if it works in hw.dmar.enable="0" it's not a regression
> from before :-/
>
>
>
> On 4/26/19 8:13 AM, Jakob Alvermark wrote:
>> Sure:
>>
>> ---<<BOOT>>---
>> Copyright (c) 1992-2019 The FreeBSD Project.
>> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>> The Regents of the University of California. All rights reserved.
>> FreeBSD is a registered trademark of The FreeBSD Foundation.
>> FreeBSD 13.0-CURRENT #194 r346736M: Fri Apr 26 12:26:20 CEST 2019
>> root@flyer:/usr/obj/usr/src/amd64.amd64/sys/FLYER amd64
>> FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on
>> LLVM 8.0.0)
>> VT(efifb): resolution 1366x768
>> CPU: Intel(R) Pentium(R) CPU N3540 @ 2.16GHz (2166.72-MHz K8-class
>> CPU)
>> Origin="GenuineIntel" Id=0x30678 Family=0x6 Model=0x37 Stepping=8
>> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>>
>> Features2=0x41d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,RDRAND>
>>
>> AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
>> AMD Features2=0x101<LAHF,Prefetch>
>> Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
>> VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
>> TSC: P-state invariant, performance statistics
>> real memory = 8589934592 (8192 MB)
>> avail memory = 8120422400 (7744 MB)
>> Event timer "LAPIC" quality 600
>> ACPI APIC Table: <ACRSYS ACRPRDCT>
>> WARNING: L1 data cache covers fewer APIC IDs than a core (0 < 1)
>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>> FreeBSD/SMP: 1 package(s) x 4 core(s)
>> __stack_chk_init: WARNING: Initializing stack protection with
>> non-random cookies!
>> __stack_chk_init: WARNING: This severely limits the benefit of
>> -fstack-protector!
>> ioapic0: Changing APIC ID to 2
>> ioapic0 <Version 2.0> irqs 0-86 on motherboard
>> Launching APs: 2 3 1
>> Timecounter "TSC-low" frequency 1083359641 Hz quality 1000
>> Cuse v0.1.36 @ /dev/cuse
>> random: entropy device external interface
>> kbd1 at kbdmux0
>> module_register_init: MOD_LOAD (vesa, 0xffffffff81150570, 0) error 19
>> random: registering fast source Intel Secure Key RNG
>> random: fast provider: "Intel Secure Key RNG"
>> 000.000049 [4254] netmap_init netmap: loaded module
>> [ath_hal] loaded
>> nexus0
>> efirtc0: <EFI Realtime Clock> on motherboard
>> efirtc0: registered as a time-of-day clock, resolution 1.000000s
>> cryptosoft0: <software crypto> on motherboard
>> acpi0: <ACRSYS ACRPRDCT> on motherboard
>> acpi0: Power Button (fixed)
>> unknown: I/O range not supported
>> cpu0: <ACPI CPU> on acpi0
>> atrtc0: <AT realtime clock> port 0x70-0x77 on acpi0
>> atrtc0: Warning: Couldn't map I/O.
>> atrtc0: registered as a time-of-day clock, resolution 1.000000s
>> Event timer "RTC" frequency 32768 Hz quality 0
>> hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 8
>> on acpi0
>> Timecounter "HPET" frequency 14318180 Hz quality 950
>> Event timer "HPET" frequency 14318180 Hz quality 450
>> Event timer "HPET1" frequency 14318180 Hz quality 440
>> Event timer "HPET2" frequency 14318180 Hz quality 440
>> attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
>> Timecounter "i8254" frequency 1193182 Hz quality 0
>> Event timer "i8254" frequency 1193182 Hz quality 100
>> Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
>> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
>> acpi_ec0: <Embedded Controller: GPE 0x18> port 0x62,0x66 on acpi0
>> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
>> pcib0: Length mismatch for 3 range: 109fffff vs 10a00000
>> pci0: <ACPI PCI bus> on pcib0
>> vgapci0: <VGA-compatible display> port 0x2050-0x2057 mem
>> 0x90000000-0x903fffff,0x80000000-0x8fffffff at device 2.0 on pci0
>> vgapci0: Boot video device
>> ahci0: <AHCI SATA controller> port
>> 0x2048-0x204f,0x205c-0x205f,0x2040-0x2047,0x2058-0x205b,0x2020-0x203f
>> mem 0x9091e000-0x9091e7ff at device 19.0 on pci0
>> ahci0: AHCI v1.30 with 2 3Gbps ports, Port Multiplier not supported
>> ahcich0: <AHCI channel> at channel 0 on ahci0
>> xhci0: <Intel BayTrail USB 3.0 controller> mem 0x90900000-0x9090ffff
>> at device 20.0 on pci0
>> xhci0: 32 bytes context size, 64-bit DMA
>> xhci0: Port routing mask set to 0xffffffff
>> usbus0 on xhci0
>> usbus0: 5.0Gbps Super Speed USB v3.0
>> pci0: <encrypt/decrypt> at device 26.0 (no driver attached)
>> hdac0: <Intel BayTrail HDA Controller> mem 0x90910000-0x90913fff at
>> device 27.0 on pci0
>> pcib1: <ACPI PCI-PCI bridge> at device 28.0 on pci0
>> pcib1: [GIANT-LOCKED]
>> pcib2: <ACPI PCI-PCI bridge> at device 28.1 on pci0
>> pcib2: [GIANT-LOCKED]
>> pci1: <ACPI PCI bus> on pcib2
>> iwn0: <Intel Centrino Advanced 6235> mem 0x90600000-0x90601fff at
>> device 0.0 on pci1
>> arc4random: WARNING: initial seeding bypassed the cryptographic
>> random device because it was not yet seeded and the knob
>> 'bypass_before_seeding' was enabled.
>> pcib3: <ACPI PCI-PCI bridge> at device 28.2 on pci0
>> pcib3: [GIANT-LOCKED]
>> pci2: <ACPI PCI bus> on pcib3
>> re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port
>> 0x1000-0x10ff mem 0x90500000-0x90500fff,0x90400000-0x90403fff at
>> device 0.0 on pci2
>> re0: Using 1 MSI-X message
>> re0: ASPM disabled
>> re0: Chip rev. 0x4c000000
>> re0: MAC rev. 0x00000000
>> miibus0: <MII bus> on re0
>> rgephy0: <RTL8251/8153 1000BASE-T media interface> PHY 1 on miibus0
>> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX,
>> 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX,
>> 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master,
>> auto, auto-flow
>> re0: Using defaults for TSO: 65518/35/2048
>> re0: Ethernet address: c4:54:44:d6:95:39
>> re0: netmap queues/slots: TX 1/256, RX 1/256
>> isab0: <PCI-ISA bridge> at device 31.0 on pci0
>> isa0: <ISA bus> on isab0
>> acpi_button0: <Power Button> on acpi0
>> acpi_button1: <Sleep Button> on acpi0
>> gpio0: <Intel Baytrail GPIO Controller> iomem 0xfed0c000-0xfed0cfff
>> irq 49 on acpi0
>> gpiobus0: <GPIO bus> on gpio0
>> gpioc0: <GPIO controller> on gpio0
>> gpio1: <Intel Baytrail GPIO Controller> iomem 0xfed0d000-0xfed0dfff
>> irq 48 on acpi0
>> gpiobus1: <GPIO bus> on gpio1
>> gpioc1: <GPIO controller> on gpio1
>> gpio2: <Intel Baytrail GPIO Controller> iomem 0xfed0e000-0xfed0efff
>> irq 50 on acpi0
>> gpiobus2: <GPIO bus> on gpio2
>> gpioc2: <GPIO controller> on gpio2
>> sdhci_acpi0: <Intel Bay Trail/Braswell eMMC 4.5/4.5.1 Controller>
>> iomem 0x90a02000-0x90a02fff irq 44 on acpi0
>> mmc0: <MMC/SD bus> on sdhci_acpi0
>> sdhci_acpi1: <Intel Bay Trail/Braswell SDXC Controller> iomem
>> 0x90a00000-0x90a00fff irq 47 on acpi0
>> ig4iic_acpi0: <Designware I2C Controller> iomem 0x90a07000-0x90a07fff
>> irq 32 on acpi0
>> acpi_acad0: <AC Adapter> on acpi0
>> battery0: <ACPI Control Method Battery> on acpi0
>> acpi_lid0: <Control Method Lid Switch> on acpi0
>> acpi_tz0: <Thermal Zone> on acpi0
>> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
>> atkbd0: <AT Keyboard> irq 1 on atkbdc0
>> kbd0 at atkbd0
>> atkbd0: [GIANT-LOCKED]
>> psm0: <PS/2 Mouse> irq 12 on atkbdc0
>> psm0: [GIANT-LOCKED]
>> psm0: model Synaptics Touchpad, device ID 0
>> uart0: <16550 or compatible> at port 0x3f8 irq 4 flags 0x10 on isa0
>> coretemp0: <CPU On-Die Thermal Sensors> on cpu0
>> est0: <Enhanced SpeedStep Frequency Control> on cpu0
>> ZFS filesystem version: 5
>> ZFS storage pool version: features support (5000)
>> Timecounters tick every 1.000 msec
>> hdacc0: <Realtek ALC283 HDA CODEC> at cad 0 on hdac0
>> hdaa0: <Realtek ALC283 Audio Function Group> at nid 1 on hdacc0
>> hdaa0: Coef 0x06 val 0x2104 -> 0x2100
>> hdaa0: Coef 0x45 val 0xc429 -> 0xd429
>> hdaa0: Coef 0x1b val 0x080b -> 0x0c2b
>> hdaa0: Coef 0x32 val 0x4ea3 -> 0x4ea3
>> pcm0: <Realtek ALC283 (Analog 2.0+HP/2.0)> at nid 20,33 and 18 on hdaa0
>> hdacc1: <Intel (0x2882) HDA CODEC> at cad 2 on hdac0
>> hdaa1: <Intel (0x2882) Audio Function Group> at nid 1 on hdacc1
>> pcm1: <Intel (0x2882) (HDMI/DP 8ch)> at nid 4 on hdaa1
>> ugen0.1: <0x8086 XHCI root HUB> at usbus0
>> uhub0: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on
>> usbus0
>> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
>> ada0: <SAMSUNG MZ7PD128HCFV-000H1 DXM01H0Q> ACS-2 ATA SATA 3.x device
>> ada0: Serial Number S1MBNSAFA22012
>> ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
>> ada0: Command Queueing enabled
>> ada0: 122104MB (250069680 512 byte sectors)
>> ada0: quirks=0x3<4K,NCQ_TRIM_BROKEN>
>> mmc0: No compatible cards found on bus
>> iicbus0: <Philips I2C bus> on ig4iic_acpi0
>> iicsmb0: <SMBus over I2C bridge> on iicbus0
>> smbus0: <System Management Bus> on iicsmb0
>> Trying to mount root from zfs:flyer2/ROOT/default []...
>> Root mount waiting for: usbus0
>> uhub0: 7 ports with 7 removable, self powered
>> Root mount waiting for: usbus0
>> ugen0.2: <vendor 0x05e3 USB2.0 Hub> at usbus0
>> uhub1 on uhub0
>> uhub1: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/85.37, addr 1>
>> on usbus0
>> uhub1: 4 ports with 3 removable, self powered
>> Root mount waiting for: usbus0
>> ugen0.3: <vendor 0x8087 product 0x07da> at usbus0
>> Root mount waiting for: usbus0
>> ugen0.4: <Cisco-Linksys Compact Wireless-G USB Adapter> at usbus0
>> ugen0.5: <SunplusIT INC. HD WebCam> at usbus0
>> random: unblocking device.
>> drmn0: <drmn> on vgapci0
>> vgapci0: child drmn0 requested pci_enable_io
>> [drm] Unable to create a private tmpfs mount, hugepage support will
>> be disabled(-19).
>> Successfully added WC MTRR for [0x80000000-0x8fffffff]: 0;
>> [drm] Got stolen memory base 0x7b000000, size 0x4000000
>> [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>> [drm] Driver supports precise vblank timestamp query.
>> [drm] Connector VGA-1: get mode from tunables:
>> [drm] - kern.vt.fb.modes.VGA-1
>> [drm] - kern.vt.fb.default_mode
>> [drm] Connector DP-1: get mode from tunables:
>> [drm] - kern.vt.fb.modes.DP-1
>> [drm] - kern.vt.fb.default_mode
>> [drm] Connector HDMI-A-1: get mode from tunables:
>> [drm] - kern.vt.fb.modes.HDMI-A-1
>> [drm] - kern.vt.fb.default_mode
>> [drm] Connector eDP-1: get mode from tunables:
>> [drm] - kern.vt.fb.modes.eDP-1
>> [drm] - kern.vt.fb.default_mode
>> [drm] Initialized i915 1.6.0 20171222 for drmn0 on minor 0
>> ichwd0: <Intel Bay Trail SoC watchdog timer> on isa0
>> VT: Replacing driver "efifb" with new "fb".
>> start FB_INFO:
>> type=11 height=768 width=1366 depth=32
>> cmsize=16 size=4227072
>> pbase=0x80000000 vbase=0xfffff80080000000
>> name=drmn0 flags=0x0 stride=5504 bpp=32
>> cmap[0]=0 cmap[1]=7f0000 cmap[2]=7f00 cmap[3]=c4a000
>> end FB_INFO
>> drmn0: fb0: inteldrmfb frame buffer device
>> wlan0: Ethernet address: 80:00:0b:5a:cd:23
>> lo0: link state changed to UP
>> iwn0: iwn_read_firmware: ucode rev=0x12a80601
>> re0: link state changed to DOWN
>> wlan0: link state changed to UP
>> ubt0 on uhub1
>> ubt0: <vendor 0x8087 product 0x07da, class 224/1, rev 2.00/78.69,
>> addr 2> on usbus0
>> rum0 on uhub1
>> rum0: <Cisco-Linksys Compact Wireless-G USB Adapter, class 0/0, rev
>> 2.00/0.01, addr 3> on usbus0
>> rum0: MAC/BBP RT2573 (rev 0x2573a), RF RT2528
>> WARNING: attempt to domain_add(bluetooth) after domainfinalize()
>> WARNING: attempt to domain_add(netgraph) after domainfinalize()
>> ubt0: ubt_bulk_read_callback:979: bulk-in transfer failed:
>> USB_ERR_STALLED
>> wlan1: Ethernet address: 00:18:f8:34:d2:8d
>> wlan1: link state changed to UP
>> .
>> Security policy loaded: MAC/ntpd (mac_ntpd)
>> .
>> [drm] GPU HANG: ecode 7:0:0x86f2fffd, in Xorg [100491], reason: Hang
>> on rcs0, action: reset
>> drmn0: Resetting chip after gpu hang
>> drmn0: i915_reset_device timed out, cancelling all in-flight rendering.
>>
>> On 2019-04-26 16:48, Johannes Lundberg wrote:
>>> Hi
>>>
>>> Hmm, this is not good. The only thing I can think of is the dma changes
>>> to base linuxkpi...
>>>
>>> Can you share a dmesg output from boot to crash, or at least to after
>>> driver is loaded?
>>>
>>> Tycho, any ideas?
>>>
>>>
>>> On 4/26/19 5:00 AM, Jakob Alvermark wrote:
>>>> Hi,
>>>>
>>>>
>>>> When I upgraded -current to r346730 drm-current-kmod-4.16.g20190323
>>>> wouldn't load, "device_attach: drmn0 attach returned 19"
>>>>
>>>> So I upgraded drm-current-kmod to 4.16.g20190424.
>>>>
>>>> It loads fine, but shortly after starting Xorg the screen freezes.
>>>>
>>>> The only way out is pressing the power button, it shuts down cleanly.
>>>>
>>>> /var/log/messages shows this:
>>>>
>>>> kernel: [drm] GPU HANG: ecode 7:0:0x60ac6ee9, in Xorg [100385],
>>>> reason: Hang on rcs0, action: reset
>>>> kernel: drmn0: Resetting chip after gpu hang
>>>> syslogd: last message repeated 1 times
>>>> kernel: drmn0: i915_reset_device timed out, cancelling all in-flight
>>>> rendering.
>>>> kernel: .
>>>>
>>>> Tried once more, same thing happened:
>>>>
>>>> kernel: [drm] GPU HANG: ecode 7:0:0x86f2fffd, in Xorg [100491],
>>>> reason: Hang on rcs0, action: reset
>>>> kernel: drmn0: Resetting chip after gpu hang
>>>> syslogd: last message repeated 1 times
>>>> kernel: drmn0: i915_reset_device timed out, cancelling all in-flight
>>>> rendering.
>>>> kernel: .
>>>>
>>>> Reverting back to drm-current-kmod-4.16.g20190323 and -current to
>>>> r346593 (yay boot environments!) it is stable.
>>>>
>>>> This is on a laptop with CPU: Intel(R) Pentium(R) CPU N3540 @
>>>> 2.16GHz (2166.72-MHz K8-class CPU)
>>>> Baytrail graphics.
>>>>
>>>>
>>>> Jakob
>>>>
>>>> _______________________________________________
>>>> freebsd-x11@freebsd.org mailing list
>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-x11
>>>> To unsubscribe, send any mail to "freebsd-x11-unsubscribe@freebsd.org"