Subject: Re: atomic changes break drm-next-kmod?
To: Pete Wright, John Baldwin, Niclas Zeising, "O. Hartmann", FreeBSD Current
From: Hans Petter Selasky
Date: Thu, 5 Jul 2018 20:59:04 +0200
Message-ID: <29ce4eab-6667-d2ca-b5d8-3deeef28f142@selasky.org>
List-Id: Discussions about the use of FreeBSD-current

On 07/05/18 19:48, Pete Wright wrote:
>
>
> On 07/05/2018 10:10, John Baldwin wrote:
>> On 7/3/18 5:10 PM, Pete Wright wrote:
>>>
>>> On 07/03/2018 15:56, John Baldwin wrote:
>>>> On 7/3/18 3:34 PM, Pete Wright wrote:
>>>>> On 07/03/2018 15:29, John Baldwin wrote:
>>>>>> That seems like kgdb is looking at the wrong CPU.  Can you use
>>>>>> 'info threads' and look for threads not stopped in 'sched_switch'
>>>>>> and get their backtraces?  You could also just do 'thread apply
>>>>>> all bt' and put that file at a URL if that is easiest.
>>>>>>
>>>>> sure thing John - here's a gist of "thread apply all bt"
>>>>>
>>>>> https://gist.github.com/gem-pete/d8d7ab220dc8781f0827f965f09d43ed
>>>> That doesn't look right at all.  Are you sure the kernel matches the
>>>> vmcore?  Also, which kgdb version are you using?
>>>>
>>> yea i agree that doesn't look right at all.  here is my setup:
>>>
>>> $ which kgdb
>>> /usr/bin/kgdb
>>> $ kgdb
>>> GNU gdb 6.1.1 [FreeBSD]
>>> $ ls -lh /var/crash/vmcore.1
>>> -rw-------  1 root  wheel   1.6G Jul  3 15:03 /var/crash/vmcore.1
>>> $ ls -l /usr/lib/debug/boot/kernel/kernel.debug
>>> -r-xr-xr-x  1 root  wheel  87840496 Jul  3 13:54
>>> /usr/lib/debug/boot/kernel/kernel.debug
>>>
>>> and i invoke kgdb like so:
>>> $ sudo kgdb /usr/lib/debug/boot/kernel/kernel.debug /var/crash/vmcore.1
>>>
>>> here's a gist of my full gdb session:
>>> http://termbin.com/krsn
>>>
>>> dunno - maybe i have a bad core dump?  regardless, more than happy to
>>> help so let me know if i should try anything else or patches etc..
>> Can you try installing gdb from ports and using /usr/local/bin/kgdb?
>>
>
> that seems to have done the trick, at least the output looks more
> encouraging.
>
>  --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
>
> __curthread () at ./machine/pcpu.h:231
> 231        __asm("movq %%gs:%1,%0" : "=r" (td)
>
>
> here's my full kgdb session:
> http://termbin.com/qa4f
>
> i don't see any threads not in "sched_switch" though :(

Hi,

The problem may be that the patch which enabled inlining of all the
atomic macros forgot to define SMP. Modules are built with _KERNEL
defined, so if SMP is not defined at all for KLDs, the condition below
is false and all atomic usage outside the kernel proper is compiled
with MPLOCKED empty!

/*
 * For userland, always use lock prefixes so that the binaries will run
 * on both SMP and !SMP systems.
 */
#if defined(SMP) || !defined(_KERNEL)
#define MPLOCKED "lock ; "
#else
#define MPLOCKED
#endif

Can you try recompiling the LinuxKPI module (/sys/modules/linuxkpi) with
DEBUG_FLAGS="-DSMP", and likewise the drm-next package?

--HPS
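
The effect of the MPLOCKED selection quoted above can be sketched in a
standalone C file. This is only an illustration under stated assumptions:
the instruction template "addl %1,%0" and the names userland_add,
kld_no_smp_add, kld_with_smp_add, has_lock_prefix and print_cases are made
up for the example; only the #if logic is taken from the quoted header.
_KERNEL and SMP are defined by hand, after the includes, to imitate the
three build configurations in one translation unit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Start from a userland-like state regardless of compiler flags. */
#undef SMP
#undef _KERNEL

/* Case 1: userland build, neither _KERNEL nor SMP defined. */
#if defined(SMP) || !defined(_KERNEL)
#define MPLOCKED "lock ; "
#else
#define MPLOCKED
#endif
const char userland_add[] = MPLOCKED "addl %1,%0";
#undef MPLOCKED

/* Case 2: module build where _KERNEL is defined but SMP was forgotten. */
#define _KERNEL
#if defined(SMP) || !defined(_KERNEL)
#define MPLOCKED "lock ; "
#else
#define MPLOCKED
#endif
const char kld_no_smp_add[] = MPLOCKED "addl %1,%0";
#undef MPLOCKED

/* Case 3: the same module build with -DSMP added. */
#define SMP
#if defined(SMP) || !defined(_KERNEL)
#define MPLOCKED "lock ; "
#else
#define MPLOCKED
#endif
const char kld_with_smp_add[] = MPLOCKED "addl %1,%0";
#undef MPLOCKED

/* Leave no hand-made configuration macros behind. */
#undef SMP
#undef _KERNEL

/* Returns 1 if the instruction template starts with a lock prefix. */
int
has_lock_prefix(const char *insn)
{
	assert(insn != NULL);
	return (strncmp(insn, "lock", 4) == 0);
}

/* Prints the instruction template each configuration would assemble. */
void
print_cases(void)
{
	printf("userland:          \"%s\" locked=%d\n",
	    userland_add, has_lock_prefix(userland_add));
	printf("KLD without -DSMP: \"%s\" locked=%d\n",
	    kld_no_smp_add, has_lock_prefix(kld_no_smp_add));
	printf("KLD with -DSMP:    \"%s\" locked=%d\n",
	    kld_with_smp_add, has_lock_prefix(kld_with_smp_add));
}
```

Calling print_cases() from a main() shows that only the middle case loses
the lock prefix, which is the situation the DEBUG_FLAGS="-DSMP" rebuild is
meant to rule out.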