Date:      Tue, 14 Jul 2020 20:12:41 +0100
From:      Andrew Turner <andrew@fubar.geek.nz>
To:        Glen Barber <gjb@FreeBSD.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: arm64 panic: reaper-related?
Message-ID:  <E91C43D1-FFC2-4C35-B1FA-25ED12574458@fubar.geek.nz>
In-Reply-To: <20200713140538.GA46078@FreeBSD.org>
References:  <20200713135821.GS61041@FreeBSD.org> <20200713140538.GA46078@FreeBSD.org>

> On 13 Jul 2020, at 15:05, Glen Barber <gjb@FreeBSD.org> wrote:
>
> On Mon, Jul 13, 2020 at 01:58:21PM +0000, Glen Barber wrote:
>> Hi,
>>
>> This morning, one of our arm64 build machines panicked.  It looks like
>> it is somehow reaper-related, but I am not entirely sure.  Backtrace
>> follows.  Any thoughts?  I'm not quite sure where to go from here...
>> Thanks in advance for any input.
>>
>> db> set $lines 0
>> db> bt
>> Tracing pid 11 tid 100003 td 0xfffffd0001634000
>> db_trace_self() at db_stack_trace+0xf8
>>         pc = 0xffff00000075fdac  lr = 0xffff000000103e78
>>         sp = 0xffff00011eca89b0  fp = 0xffff00011eca89e0
>>
>> db_stack_trace() at db_command+0x228
>>         pc = 0xffff000000103e78  lr = 0xffff000000103af0
>>         sp = 0xffff00011eca89f0  fp = 0xffff00011eca8ad0
>>
>> db_command() at db_command_loop+0x58
>>         pc = 0xffff000000103af0  lr = 0xffff000000103898
>>         sp = 0xffff00011eca8ae0  fp = 0xffff00011eca8b00
>>
>> db_command_loop() at db_trap+0xf4
>>         pc = 0xffff000000103898  lr = 0xffff000000106c0c
>>         sp = 0xffff00011eca8b10  fp = 0xffff00011eca8d30
>>
>> db_trap() at kdb
>>         pc = 0xffff000000106c0c  lr = 0xffff000000463b0c
>>         sp = 0xffff00011eca8d40  fp = 0xffff00011eca8df0
>>
>> kdb_trap() at do_el1h_sync+0xf4
>>         pc = 0xffff000000463b0c  lr = 0xffff00000077b448
>>         sp = 0xffff00011eca8e00  fp = 0xffff00011eca8e30
>>
>> do_el1h_sync() at handle_el1h_sync+0x78
>>         pc = 0xffff00000077b448  lr = 0xffff000000762878
>>         sp = 0xffff00011eca8e40  fp = 0xffff00011eca8f50
>>
>> handle_el1h_sync() at kdb_enter+0x34
>>         pc = 0xffff000000762878  lr = 0xffff000000463168
>>         sp = 0xffff00011eca8f60  fp = 0xffff00011eca8ff0
>>
>> kdb_enter() at vpanic+0x1b0
>>         pc = 0xffff000000463168  lr = 0xffff000000417a74
>>         sp = 0xffff00011eca9000  fp = 0xffff00011eca90b0
>>
>> vpanic() at panic+0x44
>>         pc = 0xffff000000417a74  lr = 0xffff0000004178c0
>>         sp = 0xffff00011eca90c0  fp = 0xffff00011eca9140
>>
>> panic() at __stack_chk_fail+0x10
>>         pc = 0xffff0000004178c0  lr = 0xffff00000044ab6c
>>         sp = 0xffff00011eca9150  fp = 0xffff00011eca9150
>>
>> __stack_chk_fail() at putchar+0x2bc
>>         pc = 0xffff00000044ab6c  lr = 0xffff000000469ce8
>>         sp = 0xffff00011eca9160  fp = 0xffff00011eca91e0
>>
>> putchar() at 0x106
>>         pc = 0xffff000000469ce8  lr = 0x0000000000000106
>>         sp = 0xffff00011eca91f0  fp = 0x0000000000000000
>>
>> db> show proc 11
>> Process 11 (idle) at 0xfffffd0001630000:
>> state: NORMAL
>> uid: 0  gids: 0
>> parent: pid 0 at 0xffff0000010fae40
>> ABI: null
>> reaper: 0xffff0000010fae40 reapsubtree: 11
>> sigparent: 20
>> vmspace: 0xffff000001109200
>>   (map 0xffff000001109200)
>>   (map.pmap 0xffff0000011092c0)
>>   (pmap 0xffff000001109320)
>> threads: 48
>> 100003                   Run     CPU -1                      [idle: cpu0]
>> 100004                   Run     CPU 1                       [idle: cpu1]
>> 100005                   Run     CPU 2                       [idle: cpu2]
>> 100006                   Run     CPU 3                       [idle: cpu3]
>> 100007                   Run     CPU 4                       [idle: cpu4]
>> 100008                   Run     CPU 5                       [idle: cpu5]
>> 100009                   Run     CPU 6                       [idle: cpu6]
>> 100010                   Run     CPU 7                       [idle: cpu7]
>> 100011                   Run     CPU 8                       [idle: cpu8]
>> 100012                   CanRun                              [idle: cpu9]
>> 100013                   Run     CPU 10                      [idle: cpu10]
>> 100014                   Run     CPU 11                      [idle: cpu11]
>> 100015                   Run     CPU 12                      [idle: cpu12]
>> 100016                   Run     CPU 13                      [idle: cpu13]
>> 100017                   Run     CPU 14                      [idle: cpu14]
>> 100018                   Run     CPU 15                      [idle: cpu15]
>> 100019                   Run     CPU 16                      [idle: cpu16]
>> 100020                   Run     CPU 17                      [idle: cpu17]
>> 100021                   Run     CPU 18                      [idle: cpu18]
>> 100022                   Run     CPU 19                      [idle: cpu19]
>> 100023                   Run     CPU 20                      [idle: cpu20]
>> 100024                   Run     CPU 21                      [idle: cpu21]
>> 100025                   Run     CPU 22                      [idle: cpu22]
>> 100026                   Run     CPU 23                      [idle: cpu23]
>> 100027                   Run     CPU 24                      [idle: cpu24]
>> 100028                   Run     CPU 25                      [idle: cpu25]
>> 100029                   Run     CPU 26                      [idle: cpu26]
>> 100030                   CanRun                              [idle: cpu27]
>> 100031                   Run     CPU 28                      [idle: cpu28]
>> 100032                   Run     CPU 29                      [idle: cpu29]
>> 100033                   Run     CPU 30                      [idle: cpu30]
>> 100034                   Run     CPU 31                      [idle: cpu31]
>> 100035                   Run     CPU 32                      [idle: cpu32]
>> 100036                   Run     CPU 33                      [idle: cpu33]
>> 100037                   Run     CPU 34                      [idle: cpu34]
>> 100038                   Run     CPU 35                      [idle: cpu35]
>> 100039                   Run     CPU 36                      [idle: cpu36]
>> 100040                   Run     CPU 37                      [idle: cpu37]
>> 100041                   Run     CPU 38                      [idle: cpu38]
>> 100042                   Run     CPU 39                      [idle: cpu39]
>> 100043                   Run     CPU 40                      [idle: cpu40]
>> 100044                   Run     CPU 41                      [idle: cpu41]
>> 100045                   Run     CPU 42                      [idle: cpu42]
>> 100046                   Run     CPU 43                      [idle: cpu43]
>> 100047                   Run     CPU 44                      [idle: cpu44]
>> 100048                   Run     CPU 45                      [idle: cpu45]
>> 100049                   Run     CPU 46                      [idle: cpu46]
>> 100050                   Run     CPU 47                      [idle: cpu47]
>>
>>
>
> I should have included this as well...
>
> db> show panic
> panic: Misaligned access from kernel space!

How reproducible is this? The backtrace and panic messages don’t line
up, but that may be related to __stack_chk_fail being in the trace.
This is called when a stack overflow is detected.
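
For readers unfamiliar with the check behind __stack_chk_fail, the idea
can be sketched in userland C. This is only an illustration, not the
kernel's code: in a real build the compiler's stack protector emits the
guard check itself. The names CANARY, struct frame, and copy_and_check
are hypothetical, and the "frame" is an ordinary struct so the
overwrite stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical guard value; real stack protectors use a random one. */
#define CANARY 0x5a5aa5a5c3c33c3cULL

/* Simulated stack frame: a local buffer followed by the guard word. */
struct frame {
    char     buf[16];
    uint64_t guard;
};

/*
 * Copy len bytes into the simulated frame, then verify the guard.
 * Returns 1 if the guard is intact, 0 if the copy ran past buf and
 * clobbered it (the point at which a real build would end up in
 * __stack_chk_fail).
 */
static int copy_and_check(const char *src, size_t len)
{
    struct frame f;

    /* Stay inside the struct so the demonstration is defined behaviour. */
    assert(len <= sizeof(f));

    f.guard = CANARY;
    memcpy(&f, src, len);   /* anything beyond 16 bytes reaches the guard */
    return f.guard == CANARY;
}
```

Copying 16 bytes leaves the guard untouched, while copying 24 bytes
overwrites it and the check fails. Because the check only fires after
the frame has already been damaged, a trace taken at that point may not
be reliable.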

I added more diagnostics to the kernel in r363191. Is it possible to try
upgrading the kernel to that?

Andrew