From: Bakul Shah
Subject: Re: panic shortly after boot when amdgpu.ko is loaded (fpu?)
Date: Fri, 27 Nov 2020 15:07:10 -0800
To: Rebecca Cran
Cc: Hans Petter Selasky, freebsd-current@freebsd.org, kib@freebsd.org
Message-Id: <916B4D57-6C8A-4510-AE29-5E289717CBCA@iitbombay.org>
In-Reply-To: <0075A3F0-C106-4970-B840-0DFAEA29DBC9@iitbombay.org>
References: <2a0f9031-a96d-2989-4d6c-a7691c451b74@bsdio.com> <40ac5686-aa96-f9e4-7c9c-5dbe628af49a@bsdio.com> <0075A3F0-C106-4970-B840-0DFAEA29DBC9@iitbombay.org>
List-Id: Discussions about the use of FreeBSD-current

> On Nov 27, 2020, at 1:47 PM, Bakul Shah wrote:
>
>> On Nov 27, 2020, at 9:09 AM, Rebecca Cran wrote:
>>
>> On 11/27/20 4:29 AM, Hans Petter Selasky wrote:
>>>
>>> Is the problem always triggered by hald? If you disable hald in
>>> rc.conf, does the system run for a longer period of time?
>>
>> It turns out that disabling ntpd let the system run for a longer
>> period of time - until I ran "sysctl sys" at which point I got a panic.
>>
>> And this time the panic actually implicates amdgpu.ko, which is an
>> improvement:
>>
>> #9  0x0000000000000000 in ?? ()
>> #10 0xffffffff82a14c4e in amdgpu_device_get_pcie_replay_count ()
>>     from /boot/modules/amdgpu.ko
>> #11 0xffffffff82a14b80 in sysctl_handle_attr () from /boot/modules/amdgpu.ko
>> #12 0xffffffff80c06cc1 in sysctl_root_handler_locked (oid=0xfffffe02133ff000,
>>     arg1=0xfffffe016e360980, arg2=-8724518803888, req=0xfffffe016e360980,
>>     tracker=0xfffff81099af6280) at /usr/src/sys/kern/kern_sysctl.c:184
>> #13 0xffffffff80c0610c in sysctl_root (oidp=,
>>     arg1=0xfffff810aa27e650, arg2=-2100190360, req=0xfffffe016e360980)
>>     at /usr/src/sys/kern/kern_sysctl.c:2211
>>
>> Since it _is_ a problem in amdgpu, I'll stop this thread and re-post
>> on freebsd-x11.
>
> FWIW, I am using amdgpu on a Ryzen 5 3500U system on a couple days old
> -current (r368025). "sysctl sys" complains about "unknown oid 'sys'".
> I am running hald & ntpd. I had a few amdgpu related panics initially
> but they vanished once I added
> PORTS_MODULES=graphics/drm-devel-kmod
> to /etc/src.conf to compile it along with the kernel. I am running
> GENERIC-NODEBUG. The machine gets rebooted when I install a new kernel
> (usually once a week).
>
> My guess is some weird interaction rather than something in amdgpu.

To get sysctl sys working I compiled a GENERIC kernel from today's
368108 revision and so far there are no problems.

$ sysctl sys.device.drmn0.pcie_replay_count
sys.device.drmn0.pcie_replay_count: 0

sysctl -a also works.

Last commit log on drm-devel-kmod (the last item may be what you're
running into):

Author: manu
Date: Mon Nov 9 13:37:12 2020 +0000

    drm-current-kmod/drm-devel-kmod: Update to latest version

    - Use acpi code from base (thanks to wulf@)
    - Add radeon/i386 patches (thanks to tilj@)
    - Translate O_ flags for linuxulator (thanks to Greg V)
    - Lot of linuxkpi cleanup
    - Hack for amdgpu when the IP isn't init properly, this happens on one
      of my laptop with a dGPU. We still don't support it but we don't panic
      when we load amdgpu