Date:      Thu, 16 Feb 2023 00:09:26 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        "kd@freebsd.org" <kd@FreeBSD.org>, "wma@freebsd.org" <wma@FreeBSD.org>, dev-commits-src-main@freebsd.org
Cc:        Warner Losh <imp@bsdimp.com>
Subject:   Re: git: 6926e2699ae5 - main - arm: Add support for using VFP in kernel [td == curthread failed form of panic for bt in gdb]
Message-ID:  <4F9A3687-9577-4419-AE1B-D02A4C9212ED@yahoo.com>
In-Reply-To: <782B252E-60AC-4036-BD74-46B95A31B337@yahoo.com>
References:  <3A143148-895F-472B-9AFB-5F1AA0FD1FA0@yahoo.com> <782B252E-60AC-4036-BD74-46B95A31B337@yahoo.com>

[A very simple program reproduces the failure under gdb
or lldb when example breakpoints are used.]

On Feb 15, 2023, at 20:29, Mark Millard <marklmi@yahoo.com> wrote:

> On Feb 15, 2023, at 16:08, Mark Millard <marklmi@yahoo.com> wrote:
>
>> Kornel Dulęba <kd_at_FreeBSD.org> wrote on
>> Date: Sat, 04 Feb 2023 19:22:23 UTC :
>>
>>> The branch main has been updated by kd:
>>>
>>> URL: https://cgit.FreeBSD.org/src/commit/?id=6926e2699ae55080f860488895a2a9aa6e6d9b4d
>>>
>>> commit 6926e2699ae55080f860488895a2a9aa6e6d9b4d
>>> Author: Kornel Dulęba <kd@FreeBSD.org>
>>> AuthorDate: 2023-02-04 12:59:30 +0000
>>> Commit: Kornel Dulęba <kd@FreeBSD.org>
>>> CommitDate: 2023-02-04 19:21:43 +0000
>>>
>>> arm: Add support for using VFP in kernel
>>>
>>> Add missing logic to allow in-kernel VFP usage for ARMv7 NEON.
>>> The implementation is strongly based on arm64 code.
>>> It introduces a family of fpu_kern_* functions to enable the usage
>>> of VFP instructions in kernel.
>>> Apart from that the existing armv7 VFP logic was modified,
>>> taking into account that the state of the VFP registers can now
>>> be modified in the kernel.
>>>
>>> Co-developed by: Wojciech Macek <wma@FreeBSD.org>
>>> Sponsored by: Stormshield
>>> Obtained from: Semihalf
>>> Reviewed by: andrew
>>> Differential Revision: https://reviews.freebsd.org/D37419
>>> ---
>>> lib/libthread_db/arch/arm/libpthread_md.c | 21 ++--
>>> sys/arm/arm/exec_machdep.c | 49 ++++----
>>> sys/arm/arm/machdep.c | 1 +
>>> sys/arm/arm/machdep_kdb.c | 31 ++++-
>>> sys/arm/arm/swtch-v6.S | 8 +-
>>> sys/arm/arm/swtch.S | 8 +-
>>> sys/arm/arm/vfp.c | 182 +++++++++++++++++++++++++++++-
>>> sys/arm/arm/vm_machdep.c | 6 +-
>>> sys/arm/include/fpu.h | 7 ++
>>> sys/arm/include/pcb.h | 5 +
>>> sys/arm/include/reg.h | 12 +-
>>> sys/arm/include/vfp.h | 17 +++
>>> 12 files changed, 293 insertions(+), 54 deletions(-)
>>
>> [This is a somewhat adjusted version of a note replying
>> to a Warner note about a panic someone got during a
>> process coredump that was happening.]
>>
>> Just a possible point, given recent kernel floating
>> point work:
>>
>> I tried to do a typical build and test of some
>> benchmark programs that I sometimes use that involve
>> floating point in some of the programs, some of them
>> with multithreading involved. (As FreeBSD and g++ progress
>> I tend to do this once in a while, not as often on
>> armv7 as on aarch64.)
>>
>> On armv7, I now usually get a message about a failure
>> of an internal cross-check, which also leads to the
>> program being stopped early. The reported failure
>> varies from run to run, but the runs should
>> not vary and should not fail the cross-checks -- and
>> previously did not, including when I last tried armv7.
>> (Not recently.)
>>
>> For the specific example failures, the initial serial
>> (single thread) test with float involved works but the
>> following multi-thread test in the same program fails
>> and causes the program to stop when it notices there
>> is a problem. (On occasion the cross-check does
>> not detect a problem.)
>>
>> The programs that do not test floating point do not
>> fail. (Same algorithm on integral types.) These can
>> involve floating point outside the algorithm
>> benchmarked, but with no multi-threading involved for
>> such and no floating point based cross-checks involved.
>>
>> At this point it is far from obvious to me how I
>> would track down the specifics of what leads to the
>> failed cross-checks. But the above is suggestive of
>> there being problems for armv7 handling of saving
>> and restoring floating point context for
>> multi-threading in a process, at least. I've no clue
>> if such are strictly limited to the floating point
>> values that show up vs. if there might be wider
>> memory handling problems that result in the process.
>>
>
> Further runs of the benchmark program show that I also
> get cross-check failures for the single-threaded test
> (the first way it tests).
>
> But it turns out that, even for single-threaded execution
> of the algorithm benchmarked, it is not run on the
> process's initial thread but instead on a created thread.
>
> Turns out that for a debug armv7 kernel (debug is not
> what I normally run) attempting a bt in gdb can lead to
> a kernel panic (td == curthread failed) related to
> floating point handling:
>
> . . .
> (gdb) br serial_kernel_runner
> Breakpoint 1 at 0x1db34: serial_kernel_runner. (6 locations)
> (gdb) br parallel_kernel_runner
> Breakpoint 2 at 0x1b43c: parallel_kernel_runner. (6 locations)
> (gdb) run
> Starting program: /root/acpphint/acpphint_kernelsamplers_main-OPi+2E-2048MiB-threads_4-ILP32-FreeBSD_main_n260797_dc1b8c9a846e_32bit-g++_12_O3lto-libc++-cpulockdown
> . . .
>
> Breakpoint 1, serial_kernel_runner<float, unsigned short> (clock_info=..., laps=3, memry=2, ki=...) at acpphint_kernelrunners.cpp:69
> 69      static auto serial_kernel_runner
> (gdb) bt
> #0  serial_panic: Assertion td == curthread failed at /usr/main-src/sys/arm/arm/exec_machdep.c:103
> cpuid = 3
> time = 1676519530
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
>         pc = 0xc05f04a0  lr = 0xc007ab0c (db_trace_self_wrapper+0x30)
>         sp = 0xe28ea960  fp = 0xe28eaa78
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>         pc = 0xc007ab0c  lr = 0xc02ddc44 (vpanic+0x140)
>         sp = 0xe28eaa80  fp = 0xe28eaaa0
>         r4 = 0x00000100  r5 = 0x00000000
>         r6 = 0xc0790bb4  r7 = 0xc0b1b930
> vpanic() at vpanic+0x140
>         pc = 0xc02ddc44  lr = 0xc02dda28 (dump_savectx)
>         sp = 0xe28eaaa8  fp = 0xe28eaaac
>         r4 = 0xe28eaad0  r5 = 0xbfbfe150
>         r6 = 0xe28eaad0  r7 = 0xc076a096
>         r8 = 0xdb8a47f4  r9 = 0x00000016
>        r10 = 0x00000040
> dump_savectx() at dump_savectx
>         pc = 0xc02dda28  lr = 0xc05f3354 (get_vfpcontext+0xb8)
>         sp = 0xe28eaab4  fp = 0xe28eaac8
> get_vfpcontext() at get_vfpcontext+0xb8
>         pc = 0xc05f3354  lr = 0xc0611148 (cpu_ptrace+0x38)
>         sp = 0xe28eaad0  fp = 0xe28eabe8
>         r4 = 0xdb75cba0  r5 = 0xbfbfe150
> cpu_ptrace() at cpu_ptrace+0x38
>         pc = 0xc0611148  lr = 0xc0360f4c (kern_ptrace+0x810)
>         sp = 0xe28eabf0  fp = 0xe28eac70
>         r4 = 0xe583dba0  r5 = 0x00000000
>         r6 = 0xdb8a47a8 r10 = 0x00000040
> kern_ptrace() at kern_ptrace+0x810
>         pc = 0xc0360f4c  lr = 0xc0360550 (sys_ptrace+0x1cc)
>         sp = 0xe28eac78  fp = 0xe28eadc0
>         r4 = 0xe583de5c  r5 = 0xe583dba0
>         r6 = 0xbfbfe150  r7 = 0x00000000
>         r8 = 0x00000000  r9 = 0xe583de50
>        r10 = 0xdb756730
> sys_ptrace() at sys_ptrace+0x1cc
>         pc = 0xc0360550  lr = 0xc0613b48 (swi_handler+0x170)
>         sp = 0xe28eadc8  fp = 0xe28eae38
>         r4 = 0xe583dba0  r5 = 0x00000001
>         r6 = 0xc090b220  r7 = 0x00000000
>         r8 = 0x00000000  r9 = 0xe583de50
> swi_handler() at swi_handler+0x170
>         pc = 0xc0613b48  lr = 0xc05f2d90 (swi_exit)
>         sp = 0xe28eae40  fp = 0xbfbfe128
>         r4 = 0x00000042  r5 = 0x22e61c20
>         r6 = 0xbfbfe150  r7 = 0x0000001a
>         r8 = 0x00424124  r9 = 0x00000108
>        r10 = 0x00000040
> swi_exit() at swi_exit
>         pc = 0xc05f2d90  lr = 0xc05f2d90 (swi_exit)
>         sp = 0xe28eae40  fp = 0xbfbfe128
> KDB: enter: panic
> [ thread pid 5438 tid 106943 ]
> Stopped at      kdb_enter+0x54: ldrb    r15, [r15, r15, ror r15]!
>
> Note: the code was built via g++12 but using libc++,
> not libstdc++.
>
> So I tried the program variant that does not try to
> lock down which CPUs are used by the threads (a completely
> C++20 standard program variant, with no FreeBSD-specific
> source code). Failure again . . .
>
> (gdb) br serial_kernel_runner
> Breakpoint 1 at 0x1c1bc: serial_kernel_runner. (6 locations)
> (gdb) br parallel_kernel_runner
> Breakpoint 2 at 0x19ac8: parallel_kernel_runner. (6 locations)
> (gdb) run
> Starting program: /root/acpphint/acpphint_kernelsamplers_main-OPi+2E-2048MiB-threads_4-ILP32-FreeBSD_main_n260797_dc1b8c9a846e_32bit-g++_12_O3lto-libc++
> . . .
> Breakpoint 1, serial_kernel_runner<float, unsigned short> (clock_info=..., laps=3, memry=2, ki=...) at acpphint_kernelrunners.cpp:69
> 69      static auto serial_kernel_runner
> (gdb) bt
> #0  serial_kernel_runner<float, unsigned short> (clock_info=...,panic: Assertion td == curthread failed at /usr/main-src/sys/arm/arm/exec_machdep.c:103
> cpuid = 0
> time = 1676520400
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
>         pc = 0xc05f04a0  lr = 0xc007ab0c (db_trace_self_wrapper+0x30)
>         sp = 0xe2964960  fp = 0xe2964a78
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>         pc = 0xc007ab0c  lr = 0xc02ddc44 (vpanic+0x140)
>         sp = 0xe2964a80  fp = 0xe2964aa0
>         r4 = 0x00000100  r5 = 0x00000000
>         r6 = 0xc0790bb4  r7 = 0xc0b1b930
> vpanic() at vpanic+0x140
>         pc = 0xc02ddc44  lr = 0xc02dda28 (dump_savectx)
>         sp = 0xe2964aa8  fp = 0xe2964aac
>         r4 = 0xe2964ad0  r5 = 0xbfbfe158
>         r6 = 0xe2964ad0  r7 = 0xc076a096
>         r8 = 0xdb7a511c  r9 = 0x00000016
>        r10 = 0x00000040
> dump_savectx() at dump_savectx
>         pc = 0xc02dda28  lr = 0xc05f3354 (get_vfpcontext+0xb8)
>         sp = 0xe2964ab4  fp = 0xe2964ac8
> get_vfpcontext() at get_vfpcontext+0xb8
>         pc = 0xc05f3354  lr = 0xc0611148 (cpu_ptrace+0x38)
>         sp = 0xe2964ad0  fp = 0xe2964be8
>         r4 = 0xdb7ca3e0  r5 = 0xbfbfe158
> cpu_ptrace() at cpu_ptrace+0x38
>         pc = 0xc0611148  lr = 0xc0360f4c (kern_ptrace+0x810)
>         sp = 0xe2964bf0  fp = 0xe2964c70
>         r4 = 0xdb76fba0  r5 = 0x00000000
>         r6 = 0xdb7a50d0 r10 = 0x00000040
> kern_ptrace() at kern_ptrace+0x810
>         pc = 0xc0360f4c  lr = 0xc0360550 (sys_ptrace+0x1cc)
>         sp = 0xe2964c78  fp = 0xe2964dc0
>         r4 = 0xdb76fe5c  r5 = 0xdb76fba0
>         r6 = 0xbfbfe158  r7 = 0x00000000
>         r8 = 0x00000000  r9 = 0xdb76fe50
>        r10 = 0xdb754000
> sys_ptrace() at sys_ptrace+0x1cc
>         pc = 0xc0360550  lr = 0xc0613b48 (swi_handler+0x170)
>         sp = 0xe2964dc8  fp = 0xe2964e38
>         r4 = 0xdb76fba0  r5 = 0x00000001
>         r6 = 0xc090b220  r7 = 0x00000000
>         r8 = 0x00000000  r9 = 0xdb76fe50
> swi_handler() at swi_handler+0x170
>         pc = 0xc0613b48  lr = 0xc05f2d90 (swi_exit)
>         sp = 0xe2964e40  fp = 0xbfbfe130
>         r4 = 0x00000042  r5 = 0x22e61c20
>         r6 = 0xbfbfe158  r7 = 0x0000001a
>         r8 = 0x00424124  r9 = 0x00000108
>        r10 = 0x00000040
> swi_exit() at swi_exit
>         pc = 0xc05f2d90  lr = 0xc05f2d90 (swi_exit)
>         sp = 0xe2964e40  fp = 0xbfbfe130
> KDB: enter: panic
> [ thread pid 1107 tid 100140 ]
> Stopped at      kdb_enter+0x54: ldrb    r15, [r15, r15, ror r15]!
>
> For reference (whitespace may not have
> been preserved):
>
> void
> get_vfpcontext(struct thread *td, mcontext_vfp_t *vfp)
> {
>        struct pcb *pcb;
>
>        MPASS(td == curthread);
>
>        pcb = td->td_pcb;
>        if ((pcb->pcb_fpflags & PCB_FP_STARTED) != 0) {
>                critical_enter();
>                vfp_store(&pcb->pcb_vfpstate, false);
>                critical_exit();
>        }
>        KASSERT(pcb->pcb_vfpsaved == &pcb->pcb_vfpstate,
>                ("Called get_vfpcontext while the kernel is using the VFP"));
>        memcpy(vfp->mcv_reg, pcb->pcb_vfpstate.reg,
>                sizeof(vfp->mcv_reg));
>        vfp->mcv_fpscr = pcb->pcb_vfpstate.fpscr;
> }
>
> Unfortunately the benchmark program is far from being a
> minimalist/simple example.
>
> I'm not sure what FreeBSD might have around that would
> have floating point in use but be simple, and possibly
> standardly available, to see if a simpler context is
> available for analogous testing.
>

Below are the program, an example of how to build it so
that it can lead to crashes, and 2 ways to get the
FreeBSD crash with it (native armv7 context):

// # cc -std=c17 -pedantic -g -O3 simple_dbl.c
//
// # gdb a.out
// (gdb) br test
// (gdb) run
// FreeBSD CRASHES
//
// # lldb a.out
// (lldb) br set -F test
// FreeBSD CRASHES

#include <stdlib.h>

_Bool test(double v) {
    return v<0.5;
}

int main(void) {
    return test(drand48());
}


===
Mark Millard
marklmi at yahoo.com



