From nobody Thu Mar 6 18:22:59 2025
Date: Thu, 6 Mar 2025 18:22:59 GMT
Message-Id: <202503061822.526IMxQT045508@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: John Baldwin
Subject: git: ecb3a7d43dd6 - main - netmap: Disable a buggy and unsafe test (sync_kloop_conflict)
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: jhb
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: ecb3a7d43dd67809037f9066e7716a05c41d8d63
Auto-Submitted: auto-generated

The branch main has been updated by jhb:

URL: https://cgit.FreeBSD.org/src/commit/?id=ecb3a7d43dd67809037f9066e7716a05c41d8d63

commit ecb3a7d43dd67809037f9066e7716a05c41d8d63
Author:     John Baldwin
AuthorDate: 2025-03-06 18:22:25 +0000
Commit:     John Baldwin
CommitDate: 2025-03-06 18:22:25 +0000

    netmap: Disable a buggy and unsafe test (sync_kloop_conflict)

    This test starts two threads to verify that two concurrent threads
    cannot enter the kernel loop on the same netmap context.  The test
    even has a comment about a potential race condition where the first
    thread enters the loop and is stopped before the second thread tries
    to enter the loop.  It claims it is fixed by the use of a semaphore.
    Unfortunately, the semaphore doesn't close the race.

    In the CI setup for CHERI, we run the testsuite once a week against
    various architectures using single CPU QEMU instances.  Across
    multiple recent runs of the plain "aarch64" test the job ran for an
    entire day before QEMU was killed by a timeout.  The last messages
    logged were from this test:

    734.881045 [1182] generic_netmap_attach Emulated adapter for tap3312 created (prev was NULL)
    734.882340 [ 321] generic_netmap_register Emulated adapter for tap3312 activated
    734.882675 [2224] netmap_csb_validate csb_init for kring tap3312 RX0: head 0, cur 0, hwcur 0, hwtail 0
    734.883042 [2224] netmap_csb_validate csb_init for kring tap3312 TX0: head 0, cur 0, hwcur 0, hwtail 1023
    734.915397 [ 820] netmap_sync_kloop kloop busy_wait 1, direct_tx 0, direct_rx 0, na_could_sleep 0
    736.901945 [ 820] netmap_sync_kloop kloop busy_wait 1, direct_tx 0, direct_rx 0, na_could_sleep 0

    From the timestamps, the synchronous kloop was entered twice 2 seconds apart.
    This corresponds to the 2 second timeout on the semaphore in the test.
    What appears to have happened is that th1 started and entered the
    kernel where it spun in an endless busy loop.  This starves th2 so it
    _never_ runs.  Once the semaphore times out, th1 is preempted to run
    the main thread which invokes the ioctl to stop the busy loop.  th1
    then exits the loop and returns to userland to exit.  Only after this
    point does th2 actually run and execute the ioctl to enter the kernel.
    Since th1 has already exited, th2 doesn't error and enters its own
    happy spin loop.  The main thread hangs forever in pthread_join, and
    the process is unkillable (the busy loop in the kernel doesn't check
    for any pending signals so kill -9 is ignored and ineffective).

    I don't see a way to fix this test, so I've just disabled it.  There
    is no good way to ensure concurrency on a single CPU system when one
    thread wants to sit in a spin loop.  Someone should fix the netmap
    kloop to respond to kill -9 in which case kyua could perhaps at least
    time out the individual test process and kill it.

    Reviewed by:    vmaffione
    Obtained from:  CheriBSD
    Sponsored by:   AFRL, DARPA
    Differential Revision:  https://reviews.freebsd.org/D49220
---
 tests/sys/netmap/ctrl-api-test.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tests/sys/netmap/ctrl-api-test.c b/tests/sys/netmap/ctrl-api-test.c
index 8d33b4c58d2a..6b45dbb1cfea 100644
--- a/tests/sys/netmap/ctrl-api-test.c
+++ b/tests/sys/netmap/ctrl-api-test.c
@@ -1596,6 +1596,7 @@ sync_kloop_csb_enable(struct TestContext *ctx)
 	return sync_kloop_start_stop(ctx);
 }
 
+#if 0
 static int
 sync_kloop_conflict(struct TestContext *ctx)
 {
@@ -1640,6 +1641,14 @@ sync_kloop_conflict(struct TestContext *ctx)
 	/* Wait for one of the two threads to fail to start the kloop, to
 	 * avoid a race condition where th1 starts the loop and stops,
 	 * and after that th2 starts the loop successfully. */
+	/*
+	 * XXX: This doesn't fully close the race.  th2 might fail to
+	 * start executing since th1 can enter the kernel and hog the
+	 * CPU on a single-CPU system until the semaphore timeout
+	 * awakens this thread and it calls sync_kloop_stop.  Once th1
+	 * exits the kernel, th2 can finally run and will then loop
+	 * forever in the ioctl handler.
+	 */
 	clock_gettime(CLOCK_REALTIME, &to);
 	to.tv_sec += 2;
 	ret = sem_timedwait(&sem, &to);
@@ -1674,6 +1683,7 @@ sync_kloop_conflict(struct TestContext *ctx)
 		       ? 0
 		       : -1;
 }
+#endif
 
 static int
 sync_kloop_eventfds_mismatch(struct TestContext *ctx)
@@ -2079,7 +2089,9 @@ static struct mytest tests[] = {
 	decltest(sync_kloop_eventfds_all_direct_rx),
 	decltest(sync_kloop_nocsb),
 	decltest(sync_kloop_csb_enable),
+#if 0
 	decltest(sync_kloop_conflict),
+#endif
 	decltest(sync_kloop_eventfds_mismatch),
 	decltest(null_port),
 	decltest(null_port_all_zero),