Date: Thu, 21 Apr 2022 08:03:39 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Emmanuel Vadot <manu@bidouilliste.com>
Cc: freebsd-current@freebsd.org, freebsd-x11@freebsd.org
Subject: Re: Daily black screen of death
Message-ID: <YmFyS/%2BehzOoT33C@troutmask.apl.washington.edu>
In-Reply-To: <20220421094404.11cdf22e45c8bee5fc749ba5@bidouilliste.com>
References: <Yl8AQPZOTRpkX4y2@troutmask.apl.washington.edu> <20220421094404.11cdf22e45c8bee5fc749ba5@bidouilliste.com>

On Thu, Apr 21, 2022 at 09:44:04AM +0200, Emmanuel Vadot wrote:
>
> Hello Steve,
>
> On Tue, 19 Apr 2022 11:32:32 -0700
> Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:
>
> > FYI,
> >
> > I'm experiencing an almost daily black screen of death panic.
> > Kernel, world, drm-current-kmod, and gpu-firmware-kmod were
> > all rebuilt and installed at the same time. Uname shows
> >
> > FreeBSD 14.0-CURRENT #0 main-n254360-eb9d205fa69: Tue Apr 5 13:49:47 PDT 2022
> >
> > So, April 5th sources.
> >
> > The panic results in a keyboard lock and no dump. The system
> > does not have a serial console. The only recourse is a hard reset.
> >
> > Hand transcribed from photo
> >
> > _sleep() at _sleep+0x38a/frame 0xfffffe012b7c0680
> > buf_daemon_shutdown() at buf_daemon_shutdown+0x6b/frame 0xfffffe012b7c06a0
> > kern_reboot() at kern_reboot+0x2ae/frame 0xfffffe012b7c06e0
> > vpanic() at vpanic+0x1ee/frame 0xfffffe012b7c0730
> > panic() at panic+0x43/frame 0xfffffe012b7c0790
> >
> > The above repeats hundreds of times, scrolling off the screen
> > with an ever-increasing frame pointer.
> >
> > Final message,
> >
> > mi_switch() at mi_switch+0x18e/frame 0xfffffe012b7c14b0
> > __mtx_lock_sleep() at __mtx_lock_sleep+0x173/frame 0xfffffe012b7c1510
> > __mtx_lock_flags() at __mtx_lock_flags+0xc0/frame 0xfffffe012b7c1550
> > linux_wake_up() at linux_wake_up+0x38/frame 0xfffffe012b7c15a0
> > radeon_fence_is_signaled() at radeon_fence_is_signaled+0x99/frame 0xfffffe012b7c15f0
> > dma_resv_add_shared_fence() at dma_resv_add_shared_fence+0x99/frame 0xfffffe012b7c1640
> > ttm_eu_fence_buffer_objects() at ttm_eu_fence_buffer_objects+0x79/frame 0xfffffe012b7c1680
> > radeon_cs_parser_fini() at radeon_cs_parser_fini+0x53/frame 0xfffffe012b7c16b0
> > radeon_cs_ioctl() at radeon_cs_ioctl+0x75e/frame 0xfffffe012b7c1b30
> > drm_ioctl_kernel() at drm_ioctl_kernel+0xc7/frame 0xfffffe012b7c1b80
> > drm_ioctl() at drm_ioctl+0x2c3/frame 0xfffffe012b7c1c70
> > linux_file_ioctl() at linux_file_ioctl+0x309/frame 0xfffffe012b7c1cd0
> > kern_ioctl() at kern_ioctl+0x1dc/frame 0xfffffe012b7c1d40
> > sys_ioctl() at sys_ioctl+0x121/frame 0xfffffe012b7c1e10
> > amd64_syscall() at amd64_syscall+0x108/frame 0xfffffe012b7c1f30
> > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe012b7c1f30
> > --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x36a096c34ea, rsp = 0x3fa11e623eb8, \
> >     rbp = 0x3fa11e623ee0 ---
> > panic: _sleep: curthread not running
> > cpuid = 4
> > time = 1650389478
> > KDB: stack backtrace:
> >
> > One common trigger appears to be the use of firefox-99.0,2 from
> > the ports collection.
> >
> > --
> > Steve
> >
>
> What version of drm are you using?
> Since when do you experience this?
> drm has not changed much for a long time now except adapting a few
> files for new linuxkpi additions.
>

drm-current-kmod-5.4.144.g20220223
gpu-firmware-kmod-g20210330

I upgraded a Jan 2022 kernel+world+drm+gpu 2 to 3 weeks ago. The
Jan 2022 system just worked. I've had the problem since the upgrade.
I've also rebuilt firefox, libdrm, the X-server, and X11 libraries.
I still see the panic. As the panic messages scroll off the screen,
I'm not sure the last bit above is the actual cause or simply a side
effect.

Some additional info from a dmesg after the reboot.

WARNING: / was not properly dismounted
[drm] radeon kernel modesetting enabled.
drmn0: <drmn> on vgapci0
vgapci0: child drmn0 requested pci_enable_io
vgapci0: child drmn0 requested pci_enable_io
sysctl_warn_reuse: can't re-use a leaf (hw.dri.debug)!
[drm] initializing kernel modesetting (CAICOS 0x1002:0x6779 0x1092:0x6450 0x00).
[drm ERROR :radeon_atombios_init] Unable to find PCI I/O BAR; using MMIO for ATOM IIO
ATOM BIOS: C26401
drmn0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
drmn0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
[drm] Detected VRAM RAM=1024M, BAR=256M
[drm] RAM width 64bits DDR
[TTM] Zone kernel: Available graphics memory: 8359708 KiB
[TTM] Zone dma32: Available graphics memory: 2097152 KiB
[TTM] Initializing pool allocator
[drm] radeon: 1024M of VRAM memory ready
[drm] radeon: 1024M of GTT memory ready.
[drm] Loading CAICOS Microcode
drmn0: successfully loaded firmware image 'radeon/CAICOS_pfp.bin'
drmn0: successfully loaded firmware image 'radeon/CAICOS_me.bin'
drmn0: successfully loaded firmware image 'radeon/BTC_rlc.bin'
drmn0: successfully loaded firmware image 'radeon/CAICOS_mc.bin'
drmn0: successfully loaded firmware image 'radeon/CAICOS_smc.bin'
[drm] Internal thermal controller with fan control
[drm] radeon: dpm initialized
drmn0: successfully loaded firmware image 'radeon/SUMO_uvd.bin'
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[drm] PCIE GART of 1024M enabled (table at 0x0000000000162000).
drmn0: WB enabled
drmn0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0x0xfffff8000be96c00
drmn0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0x0xfffff8000be96c0c
drmn0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0x0xfffff800c0072118
[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] Driver supports precise vblank timestamp query.
drmn0: radeon: MSI limited to 32-bit
drmn0: radeon: using MSI.
[drm] radeon: irq initialized.
[drm] ring test on 0 succeeded in 4 usecs
[drm] ring test on 3 succeeded in 6 usecs
[drm] ring test on 5 succeeded in 3 usecs
[drm] UVD initialized successfully.
[drm] ib test on ring 0 succeeded in 0 usecs
[drm] ib test on ring 3 succeeded in 0 usecs
[drm] ib test on ring 5 succeeded
[drm] Connector HDMI-A-1: get mode from tunables:
[drm] - kern.vt.fb.modes.HDMI-A-1
[drm] - kern.vt.fb.default_mode
[drm] Connector DVI-I-1: get mode from tunables:
[drm] - kern.vt.fb.modes.DVI-I-1
[drm] - kern.vt.fb.default_mode
[drm] Connector VGA-1: get mode from tunables:
[drm] - kern.vt.fb.modes.VGA-1
[drm] - kern.vt.fb.default_mode
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm] HDMI-A-1
[drm] HPD2
[drm] DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
[drm] Encoders:
[drm] DFP1: INTERNAL_UNIPHY1
[drm] Connector 1:
[drm] DVI-I-1
[drm] HPD4
[drm] DDC: 0x6450 0x6450 0x6454 0x6454 0x6458 0x6458 0x645c 0x645c
[drm] Encoders:
[drm] DFP2: INTERNAL_UNIPHY
[drm] Connector 2:
[drm] VGA-1
[drm] DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[drm] Encoders:
[drm] CRT1: INTERNAL_KLDSCP_DAC1
[drm] fb mappable at 0xC0363000
[drm] vram apper at 0xC0000000
[drm] size 8294400
[drm] fb depth is 24
[drm] pitch is 7680
WARNING: Device "fb" is Giant locked and may be deleted before FreeBSD 14.0.
VT: Replacing driver "vga" with new "fb".
taskqueue_drain with the following non-sleepable locks held:
exclusive sleep mutex vtdev (vtdev) r = 0 (0xffffffff80c8a750) locked @ /usr/src/sys/dev/vt/vt_core.c:3061
stack backtrace:
#0 0xffffffff80690415 at witness_debugger+0x65
#1 0xffffffff80691579 at witness_warn+0x3e9
#2 0xffffffff80683363 at taskqueue_drain+0x33
#3 0xffffffff818b6383 at vt_kms_postswitch+0x73
#4 0xffffffff804d5d04 at vt_fb_init+0xf4
#5 0xffffffff804dcb89 at vt_replace_backend+0x109
#6 0xffffffff804d5e13 at vt_fb_attach+0x13
#7 0xffffffff818b6e80 at linux_register_framebuffer+0x510
#8 0xffffffff818be1f9 at __drm_fb_helper_initial_config_and_unlock+0x459
#9 0xffffffff817b7f5e at radeon_fbdev_init+0xde
#10 0xffffffff817b3cb6 at radeon_modeset_init+0x8d6
#11 0xffffffff817c095a at radeon_driver_load_kms+0x16a
#12 0xffffffff8188e5f6 at drm_dev_register+0xe6
#13 0xffffffff817b7160 at radeon_pci_probe+0x230
#14 0xffffffff8083e151 at linux_pci_attach_device+0x431
#15 0xffffffff8065cb71 at device_attach+0x3c1
#16 0xffffffff8065e820 at bus_generic_driver_added+0x90
#17 0xffffffff8065a3d9 at devclass_driver_added+0x39
Sleeping on "tq_drain" with the following non-sleepable locks held:
exclusive sleep mutex vtdev (vtdev) r = 0 (0xffffffff80c8a750) locked @ /usr/src/sys/dev/vt/vt_core.c:3061
stack backtrace:
#0 0xffffffff80690415 at witness_debugger+0x65
#1 0xffffffff80691579 at witness_warn+0x3e9
#2 0xffffffff80632494 at _sleep+0x54
#3 0xffffffff8068342b at taskqueue_drain+0xfb
#4 0xffffffff818b6383 at vt_kms_postswitch+0x73
#5 0xffffffff804d5d04 at vt_fb_init+0xf4
#6 0xffffffff804dcb89 at vt_replace_backend+0x109
#7 0xffffffff804d5e13 at vt_fb_attach+0x13
#8 0xffffffff818b6e80 at linux_register_framebuffer+0x510
#9 0xffffffff818be1f9 at __drm_fb_helper_initial_config_and_unlock+0x459
#10 0xffffffff817b7f5e at radeon_fbdev_init+0xde
#11 0xffffffff817b3cb6 at radeon_modeset_init+0x8d6
#12 0xffffffff817c095a at radeon_driver_load_kms+0x16a
#13 0xffffffff8188e5f6 at drm_dev_register+0xe6
#14 0xffffffff817b7160 at radeon_pci_probe+0x230
#15 0xffffffff8083e151 at linux_pci_attach_device+0x431
#16 0xffffffff8065cb71 at device_attach+0x3c1
#17 0xffffffff8065e820 at bus_generic_driver_added+0x90
lock order reversal: (Giant after non-sleepable)
 1st 0xffffffff80c8a750 vtdev (vtdev, sleep mutex) @ /usr/src/sys/dev/vt/vt_core.c:3061
 2nd 0xffffffff80c02840 Giant (Giant, sleep mutex) @ /usr/src/sys/kern/kern_synch.c:232
lock order Giant -> vtdev established at:
#0 0xffffffff8068f733 at witness_checkorder+0x323
#1 0xffffffff80609689 at __mtx_lock_flags+0x99
#2 0xffffffff804dc186 at vt_upgrade+0x2d6
#3 0xffffffff805b98a3 at mi_startup+0x123
#4 0xffffffff802ca3b2 at btext+0x22
lock order vtdev -> Giant attempted at:
#0 0xffffffff8068ffdb at witness_checkorder+0xbcb
#1 0xffffffff80609689 at __mtx_lock_flags+0x99
#2 0xffffffff8063277d at _sleep+0x33d
#3 0xffffffff8068342b at taskqueue_drain+0xfb
#4 0xffffffff818b6383 at vt_kms_postswitch+0x73
#5 0xffffffff804d5d04 at vt_fb_init+0xf4
#6 0xffffffff804dcb89 at vt_replace_backend+0x109
#7 0xffffffff804d5e13 at vt_fb_attach+0x13
#8 0xffffffff818b6e80 at linux_register_framebuffer+0x510
#9 0xffffffff818be1f9 at __drm_fb_helper_initial_config_and_unlock+0x459
#10 0xffffffff817b7f5e at radeon_fbdev_init+0xde
#11 0xffffffff817b3cb6 at radeon_modeset_init+0x8d6
#12 0xffffffff817c095a at radeon_driver_load_kms+0x16a
#13 0xffffffff8188e5f6 at drm_dev_register+0xe6
#14 0xffffffff817b7160 at radeon_pci_probe+0x230
#15 0xffffffff8083e151 at linux_pci_attach_device+0x431
#16 0xffffffff8065cb71 at device_attach+0x3c1
#17 0xffffffff8065e820 at bus_generic_driver_added+0x90
start FB_INFO:
type=11 height=1080 width=1920 depth=32
cmsize=16 size=8294400
pbase=0xc0363000 vbase=0xfffff800c0363000
name=drmn0 flags=0x0 stride=7680 bpp=32
cmap[0]=0 cmap[1]=7f0000 cmap[2]=7f00 cmap[3]=c4a000
end FB_INFO
drmn0: fb0: radeondrmfb frame buffer device
[drm] Initialized radeon 2.50.0 20080528 for drmn0 on minor 0

--
steve