From owner-freebsd-hackers@freebsd.org Sun Feb 18 22:24:59 2018
Date: Sun, 18 Feb 2018 23:24:51 +0100 (CET)
From: Trond Endrestøl
To: FreeBSD Current, FreeBSD Hackers
Subject: Re: amd64 head -r329465 (non-debug build, but with symbols): "panic: spin lock held too long" during make check-old, reported during a sys_vfork

On Sun, 18 Feb 2018 22:33+0100, Mateusz Guzik wrote:

> On Sun, Feb 18, 2018 at 9:38 PM, Trond Endrestøl <Trond.Endrestol@fagskolen.gjovik.no> wrote:
>
> > On Sun, 18 Feb 2018 11:51-0800, Mark Millard wrote:
> >
> > > Note: -r329448 was reverted in -r329461 : racy.
> >
> > True. I got a crash when compiling r329451 while running r329449.
> > I've now booted the r329422 ZFS BE and I'm attempting to build
> > r329529.
> >
>
> Looking around strongly suggests r329448 is the culprit. If you can verify
> 329447 works fine we are mostly done here.

I noticed no errors in r329447.

When r329529 is built and installed, I'll try to incrementally build
and install r329531.

> Note the revision got reverted and different variant got in in r329531.
>
> That said, if r329447 works then the issue should be already fixed and in
> particular fresh head should work fine.

-- 
Trond.

From owner-freebsd-hackers@freebsd.org Sun Feb 18 22:25:15 2018
Date: Sun, 18 Feb 2018 23:25:13 +0100
From: Mateusz Guzik
To: Mark Millard
Cc: Trond Endrestøl, FreeBSD Hackers, FreeBSD Current
Subject: Re: amd64 head -r329465 (non-debug build, but with symbols): "panic: spin lock held too long" during make check-old, reported during a sys_vfork

On Sun, Feb 18, 2018 at 10:50 PM, Mark Millard wrote:

> On 2018-Feb-18, at 1:46 PM, Mark Millard wrote:
>
> > On 2018-Feb-18, at 1:33 PM, Mateusz Guzik wrote:
> >
> >> On Sun, Feb 18, 2018 at 9:38 PM, Trond Endrestøl <Trond.Endrestol@fagskolen.gjovik.no> wrote:
> >>
> >>> On Sun, 18 Feb 2018 11:51-0800, Mark Millard wrote:
> >>>
> >>>> Note: -r329448 was reverted in -r329461 : racy.
> >>>
> >>> True. I got a crash when compiling r329451 while running r329449.
> >>> I've now booted the r329422 ZFS BE and I'm attempting to build
> >>> r329529.
> >>>
> >>
> >> Looking around strongly suggests r329448 is the culprit. If you can verify
> >> 329447 works fine we are mostly done here.
> >>
> >> Note the revision got reverted and different variant got in in r329531.
> >>
> >> That said, if r329447 works then the issue should be already fixed and in
> >> particular fresh head should work fine.
> >
> > My initial problem was with -r329465, which is after -r329461 reverted
> > -r329488 . Trond reported in one note that he had problems with
> > -r329464 , also after -r329488 was reverted. Trond has also reported
> > -r329449 failed.
>
> Dumb typos above: I meant -r329448 instead of -r329488 both times.
>

Ok, I think I see the bug:

exit1 does:

    PROC_SLOCK(p);
    p->p_state = PRS_ZOMBIE;
    /* work continues */

pre-patch proc_to_reap does an equivalent of:

    if (p->p_state == PRS_ZOMBIE) {
        PROC_SLOCK(p);
        PROC_SUNLOCK(p);
        ....
        reap;
    }

It is possible the exiting thread will be caught just after setting the
state to PRS_ZOMBIE. With the slock/sunlock cycle we guarantee the reaping
thread will wait for it to finish. Without the cycle we can end up reaping
the still exiting thread.

I'll fix it soon(tm).

-- 
Mateusz Guzik
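
The race described above can be mimicked in user space. What follows is only a minimal sketch under stated assumptions, not the kernel code: a pthread mutex stands in for the process spin lock (PROC_SLOCK/PROC_SUNLOCK), and struct fake_proc, exit_finished, exiting_thread, reap_sees_finished_exit and run_once are hypothetical names invented for the illustration. The exiting thread publishes PRS_ZOMBIE while holding the lock and keeps working; the reaper either performs the empty lock/unlock cycle or skips it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of this is FreeBSD kernel code. */
enum { PRS_NORMAL, PRS_ZOMBIE };

struct fake_proc {
	pthread_mutex_t slock;     /* plays the role of the proc spin lock */
	atomic_int p_state;
	atomic_int exit_finished;  /* set at the very end of the exit path */
};

/* Analogue of exit1: mark the process a zombie, then keep working. */
static void *
exiting_thread(void *arg)
{
	struct fake_proc *p = arg;

	pthread_mutex_lock(&p->slock);
	atomic_store(&p->p_state, PRS_ZOMBIE);
	/* "work continues": widen the window in which the race can hit. */
	for (int i = 0; i < 1000; i++)
		sched_yield();
	atomic_store(&p->exit_finished, 1);
	pthread_mutex_unlock(&p->slock);
	return (NULL);
}

/*
 * Analogue of the reaper: wait until the state reads PRS_ZOMBIE, then
 * optionally do the empty lock/unlock cycle. Returns 1 if the exit path
 * had finished by the time the reaper would start reaping.
 */
static int
reap_sees_finished_exit(struct fake_proc *p, bool use_cycle)
{
	while (atomic_load(&p->p_state) != PRS_ZOMBIE)
		sched_yield();
	if (use_cycle) {
		/* Blocks until the exiting thread drops the lock. */
		pthread_mutex_lock(&p->slock);
		pthread_mutex_unlock(&p->slock);
	}
	return (atomic_load(&p->exit_finished));
}

/* Run one exiting thread against one reaper and report what it saw. */
int
run_once(bool use_cycle)
{
	struct fake_proc p;
	pthread_t td;
	int error, seen;

	pthread_mutex_init(&p.slock, NULL);
	atomic_init(&p.p_state, PRS_NORMAL);
	atomic_init(&p.exit_finished, 0);
	error = pthread_create(&td, NULL, exiting_thread, &p);
	assert(error == 0);
	(void)error;
	seen = reap_sees_finished_exit(&p, use_cycle);
	pthread_join(td, NULL);
	pthread_mutex_destroy(&p.slock);
	return (seen);
}
```

With the cycle, run_once(true) returns 1 every time, because the reaper cannot acquire the lock until the exiting thread has released it. Without the cycle, run_once(false) may return 0, which corresponds to reaping a thread that is still exiting.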