From: John Baldwin
To: Gleb Smirnoff
Cc: "freebsd-current@freebsd.org", x11@FreeBSD.org
Date: Mon, 13 Dec 2021 07:45:07 -0800
Subject: smr inp breaks some jail use cases and panics with i915kms don't switch to the console anymore
Message-ID: <1db0942e-0e66-4337-ce2f-4e1005107435@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

This weekend I upgraded my FreeBSD laptop and kicked off a poudriere build of the packages I use. My laptop kept "freezing" during the package builds however. Initially due to messages in /var/log/messages I thought it was running out of swap and killing the display server.
After poking at it off and on over the weekend I finally narrowed it down to building the devel/apr1 port, built it on the console (rather than in X), and was greeted with the following panic:

panic: malloc(M_WAITOK) with sleeping prohibited
cpuid = 7
time = 1639374072
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001e5b55b0
vpanic() at vpanic+0x17f/frame 0xfffffe001e5b5600
panic() at panic+0x43/frame 0xfffffe001e5b5660
malloc_dbg() at malloc_dbg+0xd4/frame 0xfffffe001e5b5680
malloc() at malloc+0x2d/frame 0xfffffe001e5b56c0
intel_atomic_state_alloc() at intel_atomic_state_alloc+0x20/frame 0xfffffe001e5b56e0
drm_client_modeset_commit_atomic() at drm_client_modeset_commit_atomic+0x30/frame 0xfffffe001e5b5750
drm_client_modeset_commit_force() at drm_client_modeset_commit_force+0x6f/frame 0xfffffe001e5b5790
drm_fb_helper_restore_fbdev_mode_unlocked() at drm_fb_helper_restore_fbdev_mode_unlocked+0x82/frame 0xfffffe001e5b57c0
vt_kms_postswitch() at vt_kms_postswitch+0x18b/frame 0xfffffe001e5b57f0
vt_window_switch() at vt_window_switch+0x261/frame 0xfffffe001e5b5830
vtterm_cngrab() at vtterm_cngrab+0x4f/frame 0xfffffe001e5b5850
cngrab() at cngrab+0x26/frame 0xfffffe001e5b5870
vpanic() at vpanic+0xee/frame 0xfffffe001e5b58c0
panic() at panic+0x43/frame 0xfffffe001e5b5920
witness_checkorder() at witness_checkorder+0xd1c/frame 0xfffffe001e5b5ae0
__mtx_lock_flags() at __mtx_lock_flags+0x94/frame 0xfffffe001e5b5b30
prison_check_ip4() at prison_check_ip4+0x51/frame 0xfffffe001e5b5b60
in_pcblookup_hash_locked() at in_pcblookup_hash_locked+0x2b6/frame 0xfffffe001e5b5bc0
in_pcblookup_mbuf() at in_pcblookup_mbuf+0x84/frame 0xfffffe001e5b5c00
tcp_input_with_port() at tcp_input_with_port+0x635/frame 0xfffffe001e5b5d50
tcp_input() at tcp_input+0xb/frame 0xfffffe001e5b5d60
ip_input() at ip_input+0x25e/frame 0xfffffe001e5b5de0
swi_net() at swi_net+0x1a1/frame 0xfffffe001e5b5e60
ithread_loop() at ithread_loop+0x279/frame 0xfffffe001e5b5ef0
fork_exit() at fork_exit+0x80/frame 0xfffffe001e5b5f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001e5b5f30
--- trap 0x61e8cb8b, rip = 0x8b48000000f890ff, rsp = 0x52ff38244c8d4800, rbp = 0x245c8948ccc35f20 ---

So there are two things here. The root issue is that the devel/apr1 port runs a configure test for TCP_NODELAY being inherited by accepted sockets. This test panics because prison_check_ip4() tries to lock a prison mutex to walk the IPs assigned to a jail, but the caller (in_pcblookup_hash()) has done an smr_enter(), which is a critical_enter():

(kgdb) p panicstr
$1 = 0xffffffff81ea90b0 "acquiring blockable sleep lock with spinlock or critical section held (sleep mutex) jail mutex @ /usr/src/sys/netinet/in_jail.c:418"
(kgdb) frame 39
#39 0xffffffff80dbcf71 in prison_check_ip4 (cred=, ia=ia@entry=0xfffffe001e5b5b90)
    at /usr/src/sys/netinet/in_jail.c:418
418		mtx_lock(&pr->pr_mtx);
(kgdb) l
413		KASSERT(ia != NULL, ("%s: ia is NULL", __func__));
414
415		pr = cred->cr_prison;
416		if (!(pr->pr_flags & PR_IP4))
417			return (0);
418		mtx_lock(&pr->pr_mtx);
419		if (!(pr->pr_flags & PR_IP4)) {
420			mtx_unlock(&pr->pr_mtx);
421			return (0);
422		}
(kgdb) up
#41 0xffffffff80dc5cb4 in in_pcblookup_hash (pcbinfo=0xfffffe0022db7748, faddr=..., fport=2166892021, laddr=..., lport=0, lookupflags=, numa_domain=56 '8', ifp=)
    at /usr/src/sys/netinet/in_pcb.c:2387
2387		inp = in_pcblookup_hash_locked(pcbinfo, faddr, fport, laddr, lport,
(kgdb) l
2382	    struct ifnet *ifp, uint8_t numa_domain)
2383	{
2384		struct inpcb *inp;
2385
2386		smr_enter(pcbinfo->ipi_smr);
2387		inp = in_pcblookup_hash_locked(pcbinfo, faddr, fport, laddr, lport,
2388		    lookupflags & INPLOOKUP_WILDCARD, ifp, numa_domain);
2389		if (inp != NULL) {
2390			if (__predict_false(inp_smr_lock(inp,
2391			    (lookupflags & INPLOOKUP_LOCKMASK)) == false))

However, it was a bit harder to see this originally, as the i915kms driver tries to do a malloc(M_WAITOK) from cngrab() when entering DDB, which recursively panics (even a
malloc(M_NOWAIT) from cngrab() is probably a bad idea). When it panicked in X, the result was that the screen just froze on whatever it had most recently drawn and the machine looked hung. (Having sysbeep off probably didn't help either: I couldn't tell whether typing commands was doing anything or just emitting errors, which made it harder to diagnose the hang as "sitting in ddb" initially, though I don't know if DDB itself emits a beep for invalid commands, etc.)

--
John Baldwin
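
As a rough illustration of the rule behind the first panic (no blockable lock may be taken while a critical section is held), here is a minimal userspace sketch in C. It is only a model under stated assumptions: every name prefixed with model_ is hypothetical and is not a kernel API, the nesting counter merely stands in for whatever state the kernel tracks, and returning false stands in for the point where the witness check panicked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_thread {
	int crit_nesting;	/* how many critical sections are open */
};

struct model_mtx {
	bool held;		/* true while the model mutex is locked */
};

static void
model_critical_enter(struct model_thread *td)
{
	td->crit_nesting++;
}

static void
model_critical_exit(struct model_thread *td)
{
	assert(td->crit_nesting > 0);
	td->crit_nesting--;
}

/*
 * Try to take a blockable lock.  Returns false, without locking, when the
 * caller is inside a critical section; this is the case the real kernel
 * reported as a panic.
 */
static bool
model_mtx_lock(struct model_thread *td, struct model_mtx *m)
{
	if (td->crit_nesting > 0)
		return (false);
	assert(!m->held);
	m->held = true;
	return (true);
}

static void
model_mtx_unlock(struct model_mtx *m)
{
	assert(m->held);
	m->held = false;
}

/*
 * Models the shape of the lookup path: open a critical section (standing in
 * for smr_enter()), then have the callee try the mutex (standing in for the
 * lock attempt in prison_check_ip4()).
 */
static bool
model_lookup(struct model_thread *td, struct model_mtx *prison_mtx)
{
	bool ok;

	model_critical_enter(td);
	ok = model_mtx_lock(td, prison_mtx);
	if (ok)
		model_mtx_unlock(prison_mtx);
	model_critical_exit(td);
	return (ok);
}
```

Calling model_mtx_lock() on its own succeeds, while model_lookup() always reports failure because the lock attempt happens inside the critical section. The model says nothing about how the real code should be fixed.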