From owner-svn-src-all@freebsd.org Mon Dec 16 19:39:55 2019 Return-Path: Delivered-To: svn-src-all@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id B083F1CD087; Mon, 16 Dec 2019 19:39:55 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47cBSB5mLFz4FSB; Mon, 16 Dec 2019 19:39:54 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wr1-x441.google.com with SMTP id c9so8737690wrw.8; Mon, 16 Dec 2019 11:39:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=lm02+9y8gY0tFXLCOtsh3zsT3YpTQ/7ULo+wLLenKZU=; b=tEO8Pb0SyaE4mRy7tS7OXK4TaM/6+1IKkwPRJeFCi6g1Z4vy+j56kJ+JH8VBsWaaGN gqTSAVbhgkaxeAvyOg+nN2ttZSv8pYlaj7KRsYmzYrX95yIxvK52L5PtkR2zWK2ObkLU eUGSBtMQClmYI1LCaUArJD74DOvQTIZn5JvteK1SQkYz2AF7CmAur2y4gNIcOONILWYL scZnk5Hmg080WoHHJ2R9EK9D30it3O+2ZtuAFHJqVLPosi8XHurH71SbYXTT++Q170c6 t5+RCNPYDZoo5ac86cQ7ePZCqTZl1zrgjkrWT27JYBvxcPrngzp5JP//QYwVanAllxhl 5EAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=lm02+9y8gY0tFXLCOtsh3zsT3YpTQ/7ULo+wLLenKZU=; b=PKHjDPXdt12TMtvGoIxNS0XI12hGK0mZpJWcjok8ytKNbBV88wzoRM+ozLbg2N758C SjrER2nKxo9707Llkm3LkzXqMikhCRaP0H1kYeVCOISmz7rJC4ROkTxQzy0PjibfbgWI x86UiTa9aIkHdh7S/XLpD9TGDoITYuWFl9E2yxiskcR24NRjkVhD+qmSDe3ReBQMRR00 KeIQnqOZ+NOZza0xMwajmEZcdYSjDOHlib+5KTf9FGC3n5OuknpGMUFRnWPJvRWJoPg1 wKOMzZutS1Jh/NgKXI9incbrJLJVpdMNREfarhV46rmLcE3oE8rHjxhWv+vSVIwk4PLI 9vBA== X-Gm-Message-State: APjAAAW5qnnYqJxIcpPrv7HIfvNXLLqx10V89euyb6qTwZIIF4NVkrIo RdHBZy5yMNEw39hOMYZTifABdKv/WUJkcWBUKreF8Q== X-Google-Smtp-Source: APXvYqyWC85tSvbSsmJAfGBu5szooxAFn2I8XyQImJ6ET8sdbdc+M9WKUQQYPWoiXYZ6CiJQE2Ds/D/z0KuyX5EY0nU= X-Received: by 2002:a5d:4d8d:: with SMTP id b13mr33861558wru.6.1576525193269; Mon, 16 Dec 2019 11:39:53 -0800 (PST) MIME-Version: 1.0 Received: by 2002:a5d:6b02:0:0:0:0:0 with HTTP; Mon, 16 Dec 2019 11:39:52 -0800 (PST) In-Reply-To: References: <201912152126.xBFLQoJ7037440@repo.freebsd.org> From: Mateusz Guzik Date: Mon, 16 Dec 2019 20:39:52 +0100 Message-ID: Subject: Re: svn commit: r355784 - in head/sys: compat/linuxkpi/common/src dev/dpaa kern mips/nlm sys To: Jeff Roberson Cc: Ryan Libby , Ed Maste , Jeff Roberson , src-committers , svn-src-all , svn-src-head , Mark Johnston , Konstantin Belousov Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 47cBSB5mLFz4FSB X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=tEO8Pb0S; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of mjguzik@gmail.com designates 2a00:1450:4864:20::441 as permitted sender) smtp.mailfrom=mjguzik@gmail.com X-Spamd-Result: default: False [-3.00 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2a00:1450:4000::/36]; FREEMAIL_FROM(0.00)[gmail.com]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(0.00)[ip: (2.38), ipnet: 2a00:1450::/32(-2.67), asn: 15169(-1.90), country: US(-0.05)]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; IP_SCORE_FREEMAIL(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; RCVD_IN_DNSWL_NONE(0.00)[1.4.4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.0.4.6.8.4.0.5.4.1.0.0.a.2.list.dnswl.org : 127.0.5.0]; RCPT_COUNT_SEVEN(0.00)[9]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com.dwl.dnswl.org : 127.0.5.0] X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 16 Dec 2019 19:39:55 -0000 The entire thing reads like a bug -- for each platform critical_exit can venture into mi_switch with interrupts disabled. On amd64 it just happens to do it after the count was updated, but still before they got-reenabled. Other platform also do critical enter/exit dance on each call instead of just on the 0<->1 transition of md_spinlock_count. iow each platform should be probably doing something like this instead: diff --git a/sys/amd64/amd64/machdep.c b/sys/amd64/amd64/machdep.c index 7aadd86e3cd6..c90fe8ef961d 100644 --- a/sys/amd64/amd64/machdep.c +++ b/sys/amd64/amd64/machdep.c @@ -1991,10 +1991,10 @@ spinlock_enter(void) td = curthread; if (td->td_md.md_spinlock_count == 0) { + critical_enter(); flags = intr_disable(); td->td_md.md_spinlock_count = 1; td->td_md.md_saved_flags = flags; - critical_enter(); } else td->td_md.md_spinlock_count++; } @@ -2009,8 +2009,8 @@ spinlock_exit(void) flags = td->td_md.md_saved_flags; td->td_md.md_spinlock_count--; if (td->td_md.md_spinlock_count == 0) { - critical_exit(); intr_restore(flags); + critical_exit(); } } On 12/16/19, Jeff Roberson wrote: > On Mon, 16 Dec 2019, Ryan Libby wrote: > >> On Mon, Dec 16, 2019 at 7:30 AM Ed Maste wrote: >>> >>> On Sun, 15 Dec 2019 at 16:27, Jeff Roberson wrote: >>>> >>>> Author: jeff >>>> Date: Sun Dec 15 21:26:50 2019 >>>> New Revision: 355784 >>>> URL: https://svnweb.freebsd.org/changeset/base/355784 >>>> >>>> Log: >>>> schedlock 4/4 >>> >>> FYI i386, arm, arm64, riscv fail to boot now, with "panic: invalid count >>> 2" >>> >>> Boot logs: >>> i386: https://ci.freebsd.org/job/FreeBSD-head-i386-test/7797/console >>> arm: >>> https://ci.freebsd.org/hwlab/job/FreeBSD-device-head-beaglebone-test/1317/artifact/device_tests/beaglebone.boot.log >>> arm64: >>> https://ci.freebsd.org/hwlab/job/FreeBSD-device-head-pinea64-test/1194/artifact/device_tests/pinea64.boot.log >>> riscv: >>> https://ci.freebsd.org/hwlab/job/FreeBSD-device-head-pinea64-test/1194/artifact/device_tests/pinea64.boot.log >>> >>> arm64 is: >>> >>> panic: invalid count 2 >>> cpuid = 0 >>> time = 1 >>> KDB: stack backtrace: >>> db_trace_self() at db_trace_self_wrapper+0x28 >>> pc = 0xffff0000007359ec lr = 0xffff000000106744 >>> sp = 0xffff000056b063c0 fp = 0xffff000056b065d0 >>> >>> db_trace_self_wrapper() at vpanic+0x18c >>> pc = 0xffff000000106744 lr = 0xffff000000408128 >>> sp = 0xffff000056b065e0 fp = 0xffff000056b06690 >>> >>> vpanic() at panic+0x44 >>> pc = 0xffff000000408128 lr = 0xffff000000407ed8 >>> sp = 0xffff000056b066a0 fp = 0xffff000056b06720 >>> >>> panic() at sched_switch+0x81c >>> pc = 0xffff000000407ed8 lr = 0xffff000000434264 >>> sp = 0xffff000056b06730 fp = 0xffff000056b06810 >>> >>> sched_switch() at mi_switch+0x170 >>> pc = 0xffff000000434264 lr = 0xffff000000413690 >>> sp = 0xffff000056b06820 fp = 0xffff000056b06840 >>> >>> mi_switch() at cpu_idle+0xc8 >>> pc = 0xffff000000413690 lr = 0xffff0000007400a0 >>> sp = 0xffff000056b06850 fp = 0xffff000056b06860 >>> >>> cpu_idle() at sched_idletd+0x380 >>> pc = 0xffff0000007400a0 lr = 0xffff000000436a90 >>> sp = 0xffff000056b06870 fp = 0xffff000056b06940 >>> >>> sched_idletd() at fork_exit+0x7c >>> pc = 0xffff000000436a90 lr = 0xffff0000003c7ba4 >>> sp = 0xffff000056b06950 fp = 0xffff000056b06980 >>> >>> fork_exit() at fork_trampoline+0x10 >>> pc = 0xffff0000003c7ba4 lr = 0xffff0000007521ac >>> sp = 0xffff000056b06990 fp = 0x0000000000000000 >>> >>> KDB: enter: panic >>> [ thread pid 11 tid 100003 ] >>> Stopped at 0 >>> db> >> >> It looks like amd64 vs i386, riscv, etc are using different motifs in >> spinlock_exit(). Perhaps we just need to rearrange them to drop the >> spinlock count before critical_exit(), like in amd64. > > It took me a moment to see why but I believe you are right. Interrupts > being disabled would prevent a local preemption with the flags out of sync > but critical_exit() might have owepreempt set so we will switch before > updating the count. > > Jeff > >> >> Ryan >> > -- Mateusz Guzik