Date:      Thu, 21 Apr 2022 08:03:39 -0700
From:      Steve Kargl <sgk@troutmask.apl.washington.edu>
To:        Emmanuel Vadot <manu@bidouilliste.com>
Cc:        freebsd-current@freebsd.org, freebsd-x11@freebsd.org
Subject:   Re: Daily black screen of death
Message-ID:  <YmFyS/%2BehzOoT33C@troutmask.apl.washington.edu>
In-Reply-To: <20220421094404.11cdf22e45c8bee5fc749ba5@bidouilliste.com>
References:  <Yl8AQPZOTRpkX4y2@troutmask.apl.washington.edu> <20220421094404.11cdf22e45c8bee5fc749ba5@bidouilliste.com>

On Thu, Apr 21, 2022 at 09:44:04AM +0200, Emmanuel Vadot wrote:
> 
>  Hello Steve,
> 
> On Tue, 19 Apr 2022 11:32:32 -0700
> Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:
> 
> > FYI,
> > 
> > I'm experiencing an almost daily black screen of death panic.
> > Kernel, world, drm-current-kmod, and gpu-firmware-kmod were
> > all rebuilt and installed at the same time.  Uname shows
> > 
> > FreeBSD 14.0-CURRENT #0 main-n254360-eb9d205fa69: Tue Apr 5 13:49:47 PDT 2022
> > 
> > So, April 5th sources.
> > 
> > The panic results in a keyboard lock and no dump.  The system
> > does not have a serial console.  Only recourse is a hard reset.
> > 
> > Hand transcribed from photo
> > 
> > _sleep() at _sleep+0x38a/frame 0xfffffe012b7c0680
> > buf_daemon_shutdown() at buf_daemon_shutdown+0x6b/frame 0xfffffe012b7c06a0
> > kern_reboot() at kern_reboot+0x2ae/frame 0xfffffe012b7c06e0
> > vpanic() at vpanic+0x1ee/frame 0xfffffe012b7c0730
> > panic() at panic+0x43/frame 0xfffffe012b7c0790
> > 
> > The above repeats hundreds of times, scrolling off the screen with
> > an ever-increasing frame pointer.
> > 
> > Final message,
> > 
> > mi_switch() at mi_switch+0x18e/frame 0xfffffe012b7c14b0
> > __mtx_lock_sleep() at __mtx_lock_sleep+0x173/frame 0xfffffe012b7c1510
> > __mtx_lock_flags() at __mtx_lock_flags+0xc0/frame 0xfffffe012b7c1550
> > linux_wake_up() at linux_wake_up+0x38/frame 0xfffffe012b7c15a0
> > radeon_fence_is_signaled() at radeon_fence_is_signaled+0x99/frame 0xfffffe012b7c15f0
> > dma_resv_add_shared_fence() at dma_resv_add_shared_fence+0x99/frame 0xfffffe012b7c1640
> > ttm_eu_fence_buffer_objects() at ttm_eu_fence_buffer_objects+0x79/frame 0xfffffe012b7c1680
> > radeon_cs_parser_fini() at radeon_cs_parser_fini+0x53/frame 0xfffffe012b7c16b0
> > radeon_cs_ioctl() at radeon_cs_ioctl+0x75e/frame 0xfffffe012b7c1b30
> > drm_ioctl_kernel() at drm_ioctl_kernel+0xc7/frame 0xfffffe012b7c1b80
> > drm_ioctl() at drm_ioctl+0x2c3/frame 0xfffffe012b7c1c70
> > linux_file_ioctl() at linux_file_ioctl+0x309/frame 0xfffffe012b7c1cd0
> > kern_ioctl() at kern_ioctl+0x1dc/frame 0xfffffe012b7c1d40
> > sys_ioctl() at sys_ioctl+0x121/frame 0xfffffe012b7c1e10
> > amd64_syscall() at amd64_syscall+0x108/frame 0xfffffe012b7c1f30
> > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe012b7c1f30
> > --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x36a096c34ea, rsp = 0x3fa11e623eb8, \
> >     rbp = 0x3fa11e623ee0 ---
> > panic: _sleep: curthread not running
> > cpuid = 4
> > time = 1650389478
> > KDB: stack backtrace:
> > 
> > One common trigger appears to be the use of firefox-99.0,2 from
> > the ports collection.  
> > 
> > -- 
> > Steve
> > 
> 
>  What version of drm are you using?
>  Since when have you experienced this?
>  drm has not changed much for a long time now, except for adapting a few
> files to new linuxkpi additions.
> 

drm-current-kmod-5.4.144.g20220223
gpu-firmware-kmod-g20210330

I upgraded from a Jan 2022 kernel+world+drm+gpu 2 to 3 weeks ago.
The Jan 2022 system just worked.  I've had the problem since
the upgrade.  I've also rebuilt firefox, libdrm, the X-server,
and X11 libraries.  I still see the panic.

As the panic messages scroll off the screen, I'm not sure whether
the last bit above is the actual cause or simply a side effect.

Some additional info from a dmesg after the reboot.


WARNING: / was not properly dismounted
[drm] radeon kernel modesetting enabled.
drmn0: <drmn> on vgapci0
vgapci0: child drmn0 requested pci_enable_io
vgapci0: child drmn0 requested pci_enable_io
sysctl_warn_reuse: can't re-use a leaf (hw.dri.debug)!
[drm] initializing kernel modesetting (CAICOS 0x1002:0x6779 0x1092:0x6450 0x00).
[drm ERROR :radeon_atombios_init] Unable to find PCI I/O BAR; using MMIO for ATOM IIO
ATOM BIOS: C26401
drmn0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
drmn0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
[drm] Detected VRAM RAM=1024M, BAR=256M
[drm] RAM width 64bits DDR
[TTM] Zone  kernel: Available graphics memory: 8359708 KiB
[TTM] Zone   dma32: Available graphics memory: 2097152 KiB
[TTM] Initializing pool allocator
[drm] radeon: 1024M of VRAM memory ready
[drm] radeon: 1024M of GTT memory ready.
[drm] Loading CAICOS Microcode
drmn0: successfully loaded firmware image 'radeon/CAICOS_pfp.bin'
drmn0: successfully loaded firmware image 'radeon/CAICOS_me.bin'
drmn0: successfully loaded firmware image 'radeon/BTC_rlc.bin'
drmn0: successfully loaded firmware image 'radeon/CAICOS_mc.bin'
drmn0: successfully loaded firmware image 'radeon/CAICOS_smc.bin'
[drm] Internal thermal controller with fan control
[drm] radeon: dpm initialized
drmn0: successfully loaded firmware image 'radeon/SUMO_uvd.bin'
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[drm] PCIE GART of 1024M enabled (table at 0x0000000000162000).
drmn0: WB enabled
drmn0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0x0xfffff8000be96c00
drmn0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0x0xfffff8000be96c0c
drmn0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0x0xfffff800c0072118
[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] Driver supports precise vblank timestamp query.
drmn0: radeon: MSI limited to 32-bit
drmn0: radeon: using MSI.
[drm] radeon: irq initialized.
[drm] ring test on 0 succeeded in 4 usecs
[drm] ring test on 3 succeeded in 6 usecs
[drm] ring test on 5 succeeded in 3 usecs
[drm] UVD initialized successfully.
[drm] ib test on ring 0 succeeded in 0 usecs
[drm] ib test on ring 3 succeeded in 0 usecs
[drm] ib test on ring 5 succeeded
[drm] Connector HDMI-A-1: get mode from tunables:
[drm]   - kern.vt.fb.modes.HDMI-A-1
[drm]   - kern.vt.fb.default_mode
[drm] Connector DVI-I-1: get mode from tunables:
[drm]   - kern.vt.fb.modes.DVI-I-1
[drm]   - kern.vt.fb.default_mode
[drm] Connector VGA-1: get mode from tunables:
[drm]   - kern.vt.fb.modes.VGA-1
[drm]   - kern.vt.fb.default_mode
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   HDMI-A-1
[drm]   HPD2
[drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY1
[drm] Connector 1:
[drm]   DVI-I-1
[drm]   HPD4
[drm]   DDC: 0x6450 0x6450 0x6454 0x6454 0x6458 0x6458 0x645c 0x645c
[drm]   Encoders:
[drm]     DFP2: INTERNAL_UNIPHY
[drm] Connector 2:
[drm]   VGA-1
[drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[drm]   Encoders:
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
[drm] fb mappable at 0xC0363000
[drm] vram apper at 0xC0000000
[drm] size 8294400
[drm] fb depth is 24
[drm]    pitch is 7680
WARNING: Device "fb" is Giant locked and may be deleted before FreeBSD 14.0.
VT: Replacing driver "vga" with new "fb".
taskqueue_drain with the following non-sleepable locks held:
exclusive sleep mutex vtdev (vtdev) r = 0 (0xffffffff80c8a750) locked @ /usr/src/sys/dev/vt/vt_core.c:3061
stack backtrace:
#0 0xffffffff80690415 at witness_debugger+0x65
#1 0xffffffff80691579 at witness_warn+0x3e9
#2 0xffffffff80683363 at taskqueue_drain+0x33
#3 0xffffffff818b6383 at vt_kms_postswitch+0x73
#4 0xffffffff804d5d04 at vt_fb_init+0xf4
#5 0xffffffff804dcb89 at vt_replace_backend+0x109
#6 0xffffffff804d5e13 at vt_fb_attach+0x13
#7 0xffffffff818b6e80 at linux_register_framebuffer+0x510
#8 0xffffffff818be1f9 at __drm_fb_helper_initial_config_and_unlock+0x459
#9 0xffffffff817b7f5e at radeon_fbdev_init+0xde
#10 0xffffffff817b3cb6 at radeon_modeset_init+0x8d6
#11 0xffffffff817c095a at radeon_driver_load_kms+0x16a
#12 0xffffffff8188e5f6 at drm_dev_register+0xe6
#13 0xffffffff817b7160 at radeon_pci_probe+0x230
#14 0xffffffff8083e151 at linux_pci_attach_device+0x431
#15 0xffffffff8065cb71 at device_attach+0x3c1
#16 0xffffffff8065e820 at bus_generic_driver_added+0x90
#17 0xffffffff8065a3d9 at devclass_driver_added+0x39
Sleeping on "tq_drain" with the following non-sleepable locks held:
exclusive sleep mutex vtdev (vtdev) r = 0 (0xffffffff80c8a750) locked @ /usr/src/sys/dev/vt/vt_core.c:3061
stack backtrace:
#0 0xffffffff80690415 at witness_debugger+0x65
#1 0xffffffff80691579 at witness_warn+0x3e9
#2 0xffffffff80632494 at _sleep+0x54
#3 0xffffffff8068342b at taskqueue_drain+0xfb
#4 0xffffffff818b6383 at vt_kms_postswitch+0x73
#5 0xffffffff804d5d04 at vt_fb_init+0xf4
#6 0xffffffff804dcb89 at vt_replace_backend+0x109
#7 0xffffffff804d5e13 at vt_fb_attach+0x13
#8 0xffffffff818b6e80 at linux_register_framebuffer+0x510
#9 0xffffffff818be1f9 at __drm_fb_helper_initial_config_and_unlock+0x459
#10 0xffffffff817b7f5e at radeon_fbdev_init+0xde
#11 0xffffffff817b3cb6 at radeon_modeset_init+0x8d6
#12 0xffffffff817c095a at radeon_driver_load_kms+0x16a
#13 0xffffffff8188e5f6 at drm_dev_register+0xe6
#14 0xffffffff817b7160 at radeon_pci_probe+0x230
#15 0xffffffff8083e151 at linux_pci_attach_device+0x431
#16 0xffffffff8065cb71 at device_attach+0x3c1
#17 0xffffffff8065e820 at bus_generic_driver_added+0x90
lock order reversal: (Giant after non-sleepable)
 1st 0xffffffff80c8a750 vtdev (vtdev, sleep mutex) @ /usr/src/sys/dev/vt/vt_core.c:3061
 2nd 0xffffffff80c02840 Giant (Giant, sleep mutex) @ /usr/src/sys/kern/kern_synch.c:232
lock order Giant -> vtdev established at:
#0 0xffffffff8068f733 at witness_checkorder+0x323
#1 0xffffffff80609689 at __mtx_lock_flags+0x99
#2 0xffffffff804dc186 at vt_upgrade+0x2d6
#3 0xffffffff805b98a3 at mi_startup+0x123
#4 0xffffffff802ca3b2 at btext+0x22
lock order vtdev -> Giant attempted at:
#0 0xffffffff8068ffdb at witness_checkorder+0xbcb
#1 0xffffffff80609689 at __mtx_lock_flags+0x99
#2 0xffffffff8063277d at _sleep+0x33d
#3 0xffffffff8068342b at taskqueue_drain+0xfb
#4 0xffffffff818b6383 at vt_kms_postswitch+0x73
#5 0xffffffff804d5d04 at vt_fb_init+0xf4
#6 0xffffffff804dcb89 at vt_replace_backend+0x109
#7 0xffffffff804d5e13 at vt_fb_attach+0x13
#8 0xffffffff818b6e80 at linux_register_framebuffer+0x510
#9 0xffffffff818be1f9 at __drm_fb_helper_initial_config_and_unlock+0x459
#10 0xffffffff817b7f5e at radeon_fbdev_init+0xde
#11 0xffffffff817b3cb6 at radeon_modeset_init+0x8d6
#12 0xffffffff817c095a at radeon_driver_load_kms+0x16a
#13 0xffffffff8188e5f6 at drm_dev_register+0xe6
#14 0xffffffff817b7160 at radeon_pci_probe+0x230
#15 0xffffffff8083e151 at linux_pci_attach_device+0x431
#16 0xffffffff8065cb71 at device_attach+0x3c1
#17 0xffffffff8065e820 at bus_generic_driver_added+0x90
start FB_INFO:
type=11 height=1080 width=1920 depth=32
cmsize=16 size=8294400
pbase=0xc0363000 vbase=0xfffff800c0363000
name=drmn0 flags=0x0 stride=7680 bpp=32
cmap[0]=0 cmap[1]=7f0000 cmap[2]=7f00 cmap[3]=c4a000
end FB_INFO
drmn0: fb0: radeondrmfb frame buffer device
[drm] Initialized radeon 2.50.0 20080528 for drmn0 on minor 0

-- 
steve


