Date:      Fri, 27 Nov 2020 15:07:10 -0800
From:      Bakul Shah <bakul@iitbombay.org>
To:        Rebecca Cran <rebecca@bsdio.com>
Cc:        Hans Petter Selasky <hps@selasky.org>, freebsd-current@freebsd.org, kib@freebsd.org
Subject:   Re: panic shortly after boot when amdgpu.ko is loaded (fpu?)
Message-ID:  <916B4D57-6C8A-4510-AE29-5E289717CBCA@iitbombay.org>
In-Reply-To: <0075A3F0-C106-4970-B840-0DFAEA29DBC9@iitbombay.org>
References:  <2a0f9031-a96d-2989-4d6c-a7691c451b74@bsdio.com> <d19ff5d6-65a8-251a-693b-3ff42b60a252@selasky.org> <40ac5686-aa96-f9e4-7c9c-5dbe628af49a@bsdio.com> <0075A3F0-C106-4970-B840-0DFAEA29DBC9@iitbombay.org>


> On Nov 27, 2020, at 1:47 PM, Bakul Shah <bakul@iitbombay.org> wrote:
>
>> On Nov 27, 2020, at 9:09 AM, Rebecca Cran <rebecca@bsdio.com> wrote:
>>
>> On 11/27/20 4:29 AM, Hans Petter Selasky wrote:
>>>
>>> Is the problem always triggered by hald? If you disable hald in
>>> rc.conf, does the system run for a longer period of time?
>>
>> It turns out that disabling ntpd let the system run for a longer
>> period of time - until I ran "sysctl sys" at which point I got a panic.
>>
>> And this time the panic actually implicates amdgpu.ko, which is an
>> improvement:
>>
>>
>> #9  0x0000000000000000 in ?? ()
>> #10 0xffffffff82a14c4e in amdgpu_device_get_pcie_replay_count ()
>>   from /boot/modules/amdgpu.ko
>> #11 0xffffffff82a14b80 in sysctl_handle_attr () from /boot/modules/amdgpu.ko
>>
>> #12 0xffffffff80c06cc1 in sysctl_root_handler_locked (oid=0xfffffe02133ff000,
>>    arg1=0xfffffe016e360980, arg2=-8724518803888, req=0xfffffe016e360980,
>>    tracker=0xfffff81099af6280) at /usr/src/sys/kern/kern_sysctl.c:184
>> #13 0xffffffff80c0610c in sysctl_root (oidp=<optimized out>,
>>    arg1=0xfffff810aa27e650, arg2=-2100190360, req=0xfffffe016e360980)
>>    at /usr/src/sys/kern/kern_sysctl.c:2211
>>
>>
>> Since it _is_ a problem in amdgpu, I'll stop this thread and re-post
>> on freebsd-x11.
>
> FWIW, I am using amdgpu on a Ryzen 5 3500U system on a couple days old
> -current (r368025). "sysctl sys" complains about "unknown oid 'sys'".
> I am running hald & ntpd.  I had a few amdgpu-related panics initially
> but they vanished once I added
> 	PORTS_MODULES=graphics/drm-devel-kmod
> to /etc/src.conf to compile it along with the kernel. I am running
> GENERIC-NODEBUG. The machine gets rebooted when I install a new kernel
> (usually once a week).
>
> My guess is some weird interaction rather than something in amdgpu.

To get "sysctl sys" working I compiled a GENERIC kernel from today's
revision (r368108) and so far there are no problems.

$ sysctl sys.device.drmn0.pcie_replay_count
sys.device.drmn0.pcie_replay_count: 0

sysctl -a also works.
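
Since the sys tree was missing on my GENERIC-NODEBUG kernel ("unknown oid")
but present on GENERIC, a script that wants this counter probably should not
assume the OID exists. Below is a minimal sketch of a guarded read. The
function name read_oid is made up for illustration; the only real interface
it relies on is "sysctl -n". Note that it only covers the missing-OID case.
It cannot protect against a panic inside the sysctl handler itself, as in the
backtrace above.

```shell
#!/bin/sh
# read_oid is a hypothetical helper, not part of FreeBSD or drm-kmod.
# It prints "<oid>: <value>" when the OID can be read, and returns 1
# with a message on stderr when sysctl cannot find or read it.
read_oid() {
    oid=$1
    # sysctl -n prints only the value; a non-zero exit status means
    # the OID is unknown (or sysctl itself is unavailable).
    if value=$(sysctl -n "$oid" 2>/dev/null); then
        printf '%s: %s\n' "$oid" "$value"
    else
        printf '%s: not present on this kernel\n' "$oid" >&2
        return 1
    fi
}

# Example: the counter discussed in this thread.
read_oid sys.device.drmn0.pcie_replay_count || true
```

On a kernel that exposes the OID this should print the same line as the plain
sysctl call above; on one that does not, it reports that on stderr and keeps
going.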

Last commit log on drm-devel-kmod (the last item may be what you're
running into):
Author: manu <manu@FreeBSD.org>
Date:   Mon Nov 9 13:37:12 2020 +0000

    drm-current-kmod/drm-devel-kmod: Update to latest version

    - Use acpi code from base (thanks to wulf@)
    - Add radeon/i386 patches (thanks to tilj@)
    - Translate O_ flags for linuxulator (thanks to Greg V)
    - Lot of linuxkpi cleanup
    - Hack for amdgpu when the IP isn't init properly, this happens
      on one of my laptop with a dGPU. We still don't support it but
      we don't panic when we load amdgpu
