From: Andrew Turner
Subject: Re: arm64 panic: reaper-related?
Date: Tue, 14 Jul 2020 20:12:41 +0100
In-Reply-To: <20200713140538.GA46078@FreeBSD.org>
References: <20200713135821.GS61041@FreeBSD.org> <20200713140538.GA46078@FreeBSD.org>
To: Glen Barber
Cc: freebsd-current@freebsd.org

> On 13 Jul 2020, at 15:05, Glen Barber wrote:
>
> On Mon, Jul 13, 2020 at 01:58:21PM +0000, Glen Barber wrote:
>> Hi,
>>
>> This morning, one of our arm64 build machines panicked.
>> It looks like it is somehow reaper-related, but I am not entirely sure.
>> Backtrace follows. Any thoughts? I'm not quite sure where to go from here...
>> Thanks in advance for any input.
>>
>> db> set $lines 0
>> db> bt
>> Tracing pid 11 tid 100003 td 0xfffffd0001634000
>> db_trace_self() at db_stack_trace+0xf8
>>          pc = 0xffff00000075fdac  lr = 0xffff000000103e78
>>          sp = 0xffff00011eca89b0  fp = 0xffff00011eca89e0
>>
>> db_stack_trace() at db_command+0x228
>>          pc = 0xffff000000103e78  lr = 0xffff000000103af0
>>          sp = 0xffff00011eca89f0  fp = 0xffff00011eca8ad0
>>
>> db_command() at db_command_loop+0x58
>>          pc = 0xffff000000103af0  lr = 0xffff000000103898
>>          sp = 0xffff00011eca8ae0  fp = 0xffff00011eca8b00
>>
>> db_command_loop() at db_trap+0xf4
>>          pc = 0xffff000000103898  lr = 0xffff000000106c0c
>>          sp = 0xffff00011eca8b10  fp = 0xffff00011eca8d30
>>
>> db_trap() at kdb     pc = 0xffff000000106c0c  lr = 0xffff000000463b0c
>>          sp = 0xffff00011eca8d40  fp = 0xffff00011eca8df0
>>
>> kdb_trap() at do_el1h_sync+0xf4
>>          pc = 0xffff000000463b0c  lr = 0xffff00000077b448
>>          sp = 0xffff00011eca8e00  fp = 0xffff00011eca8e30
>>
>> do_el1h_sync() at handle_el1h_sync+0x78
>>          pc = 0xffff00000077b448  lr = 0xffff000000762878
>>          sp = 0xffff00011eca8e40  fp = 0xffff00011eca8f50
>>
>> handle_el1h_sync() at kdb_enter+0x34
>>          pc = 0xffff000000762878  lr = 0xffff000000463168
>>          sp = 0xffff00011eca8f60  fp = 0xffff00011eca8ff0
>>
>> kdb_enter() at vpanic+0x1b0
>>          pc = 0xffff000000463168  lr = 0xffff000000417a74
>>          sp = 0xffff00011eca9000  fp = 0xffff00011eca90b0
>>
>> vpanic() at panic+0x44
>>          pc = 0xffff000000417a74  lr = 0xffff0000004178c0
>>          sp = 0xffff00011eca90c0  fp = 0xffff00011eca9140
>>
>> panic() at __stack_chk_fail+0x10
>>          pc = 0xffff0000004178c0  lr = 0xffff00000044ab6c
>>          sp = 0xffff00011eca9150  fp = 0xffff00011eca9150
>>
>> __stack_chk_fail() at putchar+0x2bc
>>          pc = 0xffff00000044ab6c  lr = 0xffff000000469ce8
>>          sp = 0xffff00011eca9160  fp = 0xffff00011eca91e0
>>
>> putchar() at 0x106
>>          pc = 0xffff000000469ce8  lr = 0x0000000000000106
>>          sp = 0xffff00011eca91f0  fp = 0x0000000000000000
>>
>> db> show proc 11
>> Process 11 (idle) at 0xfffffd0001630000:
>>  state: NORMAL
>>  uid: 0  gids: 0
>>  parent: pid 0 at 0xffff0000010fae40
>>  ABI: null
>>  reaper: 0xffff0000010fae40 reapsubtree: 11
>>  sigparent: 20
>>  vmspace: 0xffff000001109200
>>    (map 0xffff000001109200)
>>    (map.pmap 0xffff0000011092c0)
>>    (pmap 0xffff000001109320)
>>  threads: 48
>> 100003  Run     CPU -1  [idle: cpu0]
>> 100004  Run     CPU 1   [idle: cpu1]
>> 100005  Run     CPU 2   [idle: cpu2]
>> 100006  Run     CPU 3   [idle: cpu3]
>> 100007  Run     CPU 4   [idle: cpu4]
>> 100008  Run     CPU 5   [idle: cpu5]
>> 100009  Run     CPU 6   [idle: cpu6]
>> 100010  Run     CPU 7   [idle: cpu7]
>> 100011  Run     CPU 8   [idle: cpu8]
>> 100012  CanRun          [idle: cpu9]
>> 100013  Run     CPU 10  [idle: cpu10]
>> 100014  Run     CPU 11  [idle: cpu11]
>> 100015  Run     CPU 12  [idle: cpu12]
>> 100016  Run     CPU 13  [idle: cpu13]
>> 100017  Run     CPU 14  [idle: cpu14]
>> 100018  Run     CPU 15  [idle: cpu15]
>> 100019  Run     CPU 16  [idle: cpu16]
>> 100020  Run     CPU 17  [idle: cpu17]
>> 100021  Run     CPU 18  [idle: cpu18]
>> 100022  Run     CPU 19  [idle: cpu19]
>> 100023  Run     CPU 20  [idle: cpu20]
>> 100024  Run     CPU 21  [idle: cpu21]
>> 100025  Run     CPU 22  [idle: cpu22]
>> 100026  Run     CPU 23  [idle: cpu23]
>> 100027  Run     CPU 24  [idle: cpu24]
>> 100028  Run     CPU 25  [idle: cpu25]
>> 100029  Run     CPU 26  [idle: cpu26]
>> 100030  CanRun          [idle: cpu27]
>> 100031  Run     CPU 28  [idle: cpu28]
>> 100032  Run     CPU 29  [idle: cpu29]
>> 100033  Run     CPU 30  [idle: cpu30]
>> 100034  Run     CPU 31  [idle: cpu31]
>> 100035  Run     CPU 32  [idle: cpu32]
>> 100036  Run     CPU 33  [idle: cpu33]
>> 100037  Run     CPU 34  [idle: cpu34]
>> 100038  Run     CPU 35  [idle: cpu35]
>> 100039  Run     CPU 36  [idle: cpu36]
>> 100040  Run     CPU 37  [idle: cpu37]
>> 100041  Run     CPU 38  [idle: cpu38]
>> 100042  Run     CPU 39  [idle: cpu39]
>> 100043  Run     CPU 40  [idle: cpu40]
>> 100044  Run     CPU 41  [idle: cpu41]
>> 100045  Run     CPU 42  [idle: cpu42]
>> 100046  Run     CPU 43  [idle: cpu43]
>> 100047  Run     CPU 44  [idle: cpu44]
>> 100048  Run     CPU 45  [idle: cpu45]
>> 100049  Run     CPU 46  [idle: cpu46]
>> 100050  Run     CPU 47  [idle: cpu47]
>>
>>
>
> I should have included this as well...
>
> db> show panic
> panic: Misaligned access from kernel space!

How reproducible is this?

The backtrace and panic messages don’t line up, but that may be related to
__stack_chk_fail being in the trace. This is called when a stack buffer
overflow is detected.

I added more diagnostics to the kernel in r363191. Is it possible to try
upgrading the kernel to that?

Andrew
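
For anyone following the thread who has not met __stack_chk_fail before, here is a rough userspace sketch of the idea behind it. It is not the kernel's or the compiler's actual code: the struct, the guard value and the function name are all made up for illustration. The real stack protector has the compiler place a guard word next to a function's local buffers and check it before returning; this sketch models that layout with an explicit struct so the overwrite stays well-defined C.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical guard value; a real stack protector uses a random one. */
#define GUARD_VALUE 0xdeadc0dedeadc0deULL

/* Models a stack frame: a local buffer followed by the guard word. */
struct guarded_frame {
	char     buf[16];
	uint64_t guard;
};

/*
 * Copy len bytes into the frame, starting at buf, without checking len
 * against sizeof(buf). The copy is clamped to the whole struct so the
 * write stays inside the object. Returns 1 if the guard word was
 * overwritten (the point where real code would call __stack_chk_fail),
 * 0 if it is still intact.
 */
static int
guard_clobbered_after_copy(const char *src, size_t len)
{
	struct guarded_frame f;

	f.guard = GUARD_VALUE;
	if (len > sizeof(f))
		len = sizeof(f);
	memcpy(&f, src, len);
	return (f.guard != GUARD_VALUE);
}
```

A copy of 16 bytes or fewer leaves the guard alone; anything longer runs past buf into the guard word and the check trips. That is also why a trace passing through __stack_chk_fail may not be trustworthy: by the time the check fires, the saved frame data next to the guard may already have been overwritten.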