From: Mark Millard
Date: Sun, 19 Mar 2017 12:40:16 -0700
To: Otacílio
Cc: freebsd-arm@freebsd.org
Subject: Re: A potential fix for arm64's: sh`forkshell child-process path after fork sometimes has a bad stack pointer value
Message-Id: <33D7B40E-9383-4C65-A212-A0F218CB3CFB@dsl-only.net>

On 2017-Mar-19, at 7:18 AM, Otacílio wrote:

> On 14/02/2017 13:35, Mark Millard wrote:
>> The following change has let my test run for 8.5 hours so far without a
>> fork-failure in sh`forkshell :
>>
>> # svnlite diff /usr/src/sys/arm64/arm64/swtch.S
>> Index: /usr/src/sys/arm64/arm64/swtch.S
>> ===================================================================
>> --- /usr/src/sys/arm64/arm64/swtch.S (revision 312982)
>> +++ /usr/src/sys/arm64/arm64/swtch.S (working copy)
>> @@ -241,6 +241,12 @@
>>  	mov	fp, #0	/* Stack traceback stops here. */
>>  	bl	_C_LABEL(fork_exit)
>>
>> +	/*
>> +	 * Disable interrupts to avoid
>> +	 * overwriting sp_el0 and spsr_el1 by an IRQ exception.
>> +	 */
>> +	msr	daifset, #2
>> +
>>  	/* Restore sp and lr */
>>  	ldp	x0, x1, [sp]
>>  	msr	sp_el0, x0
>> @@ -263,12 +269,6 @@
>>  	ldp	x28, x29, [sp, #TF_X + 28 * 8]
>>  	/* Skip x30 as it was restored above as lr */
>>
>> -	/*
>> -	 * Disable interrupts to avoid
>> -	 * overwriting spsr_el1 by an IRQ exception.
>> -	 */
>> -	msr	daifset, #2
>> -
>>  	/* Restore elr and spsr */
>>  	ldp	x0, x1, [sp, #16]
>>  	msr	elr_el1, x0
>>
>> I'm going to switch to attempting a self-hosted buildworld
>> buildkernel again.
>
> This patch or some other about this bug was committed to HEAD?

Yes, "some other" in -r313772 (2017-Feb-15).
See:

https://lists.freebsd.org/pipermail/svn-src-head/2017-February/097004.html

which in part says:

Author: andrew
Date: Wed Feb 15 14:56:47 2017
New Revision: 313772
URL: https://svnweb.freebsd.org/changeset/base/313772

Log:
  Load the new sp_el0 with interrupts disabled in fork_trampoline. If an
  interrupt arrives in fork_trampoline after sp_el0 was written we may then
  switch to a new thread, enter userland so change this stack pointer, then
  return to this code with the wrong value. This fixes this case by moving
  the load of sp_el0 until after interrupts have been disabled.

  Reported by: Mark Millard (markmi at dsl-only.net)
  Sponsored by: ABT Systems Ltd
  Differential Revision: https://reviews.freebsd.org/D9593

Modified:
  head/sys/arm64/arm64/swtch.S

===
Mark Millard
markmi at dsl-only.net
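
Illustration only, not part of the commit or of the patch quoted above: the race described in the commit log can be sketched as a toy Python model. Everything in it is an assumption made for the sketch: the `Cpu` class, the function names, and the stack pointer values are hypothetical, and a single attribute stands in for the per-CPU sp_el0 register. It only shows why the order of "mask interrupts" and "load sp_el0" matters.

```python
# Toy model of the fork_trampoline race; this is NOT kernel code.
# All names and values here are hypothetical, chosen for the sketch.

class Cpu:
    def __init__(self):
        self.sp_el0 = None        # stands in for the per-CPU sp_el0 register
        self.irq_masked = False   # stands in for the effect of "msr daifset, #2"


def other_thread_runs(cpu):
    """Another thread gets scheduled and enters userland with its own sp."""
    cpu.sp_el0 = 0xBAD0


def maybe_interrupt(cpu, fire):
    """Deliver an IRQ only if one is pending and interrupts are not masked."""
    if fire and not cpu.irq_masked:
        other_thread_runs(cpu)


def fork_trampoline(cpu, child_sp, mask_first, irq_after_load):
    """Return the sp_el0 value the child would carry back to userland."""
    if mask_first:
        cpu.irq_masked = True     # fixed order: mask first, then load sp_el0
    cpu.sp_el0 = child_sp         # stands in for "msr sp_el0, x0"
    maybe_interrupt(cpu, irq_after_load)  # window between the load and eret
    cpu.irq_masked = True         # the original order masked only at this point
    return cpu.sp_el0


CHILD_SP = 0x7FFF0000
old = fork_trampoline(Cpu(), CHILD_SP, mask_first=False, irq_after_load=True)
new = fork_trampoline(Cpu(), CHILD_SP, mask_first=True, irq_after_load=True)
print(hex(old), hex(new))
```

With the original ordering the modeled interrupt lands in the window after sp_el0 is written, so the child ends up with the other thread's value; with interrupts masked first the window is closed and the child keeps its own stack pointer.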