From: Ryan Stone
Date: Wed, 21 Oct 2020 18:04:52 -0400
Subject: Panic in in6_joingroup_locked
To: freebsd-net

Today at $WORK we saw a panic due to a race between in6_joingroup_locked and if_detach_internal. This happened on a branch that's about 2 years behind head, but the relevant code in head does not appear to have changed. The backtrace of the panic was this:

panic: Fatal trap 9: general protection fault while in kernel mode
Stack: --------------------------------------------------
kernel:trap_fatal+0x96
kernel:trap+0x76
kernel:in6_joingroup_locked+0x2c7
kernel:in6_joingroup+0x46
kernel:in6_update_ifa+0x18e5
kernel:in6_ifattach+0x4d0
kernel:in6_if_up+0x99
kernel:if_up+0x7d
kernel:ifhwioctl+0xcea
kernel:ifioctl+0x2c9
kernel:kern_ioctl+0x29b
kernel:sys_ioctl+0x16d
kernel:amd64_syscall+0x327

We panic'ed here, because the memory pointed to by ifma has been freed and filled with 0xdeadc0de:

https://svnweb.freebsd.org/base/head/sys/netinet6/in6_mcast.c?revision=365071&view=markup#l421

Another thread was in the process of trying to destroy the same interface.
It had the following backtrace at the time of the panic:

#0 sched_switch (td=0xfffffea654845aa0, newtd=0xfffffea266fa9aa0, flags=) at /b/mnt/src/sys/kern/sched_ule.c:2423
#1 0xffffffff80643071 in mi_switch (flags=, newtd=0x0) at /b/mnt/src/sys/kern/kern_synch.c:605
#2 0xffffffff80693234 in sleepq_switch (wchan=0xffffffff8139cc90 , pri=0) at /b/mnt/src/sys/kern/subr_sleepqueue.c:612
#3 0xffffffff806930c3 in sleepq_wait (wchan=0xffffffff8139cc90 , pri=0) at /b/mnt/src/sys/kern/subr_sleepqueue.c:691
#4 0xffffffff8063fcb3 in _sx_xlock_hard (sx=, x=, opts=0, timo=0, file=, line=) at /b/mnt/src/sys/kern/kern_sx.c:936
#5 0xffffffff8063f313 in _sx_xlock (sx=0xffffffff8139cc90 , opts=0, timo=, file=0xffffffff80ba6d2a "/b/mnt/src/sys/net/if_vlan.c", line=668) at /b/mnt/src/sys/kern/kern_sx.c:352
#6 0xffffffff807558b2 in vlan_ifdetach (arg=, ifp=0xfffff8049b2ce000) at /b/mnt/src/sys/net/if_vlan.c:668
#7 0xffffffff80747676 in if_detach_internal (vmove=0, ifp=, ifcp=) at /b/mnt/src/sys/net/if.c:1203
#8 if_detach (ifp=0xfffff8049b2ce000) at /b/mnt/src/sys/net/if.c:1060
#9 0xffffffff80756521 in vlan_clone_destroy (ifc=0xfffff802f29dbe80, ifp=0xfffff8049b2ce000) at /b/mnt/src/sys/net/if_vlan.c:1102
#10 0xffffffff8074dc57 in if_clone_destroyif (ifc=0xfffff802f29dbe80, ifp=0xfffff8049b2ce000) at /b/mnt/src/sys/net/if_clone.c:330
#11 0xffffffff8074dafe in if_clone_destroy (name=) at /b/mnt/src/sys/net/if_clone.c:288
#12 0xffffffff8074b2fd in ifioctl (so=0xfffffea6363806d0, cmd=2149607801, data=, td=0xfffffea654845aa0) at /b/mnt/src/sys/net/if.c:3077
#13 0xffffffff806aab1c in fo_ioctl (fp=, com=, active_cred=, td=, data= ) at /b/mnt/src/sys/sys/file.h:396
#14 kern_ioctl (td=0xfffffea654845aa0, fd=4, com=, data=) at /b/mnt/src/sys/kern/sys_generic.c:938
#15 0xffffffff806aa7fe in sys_ioctl (td=0xfffffea654845aa0, uap=0xfffffea653441b30) at /b/mnt/src/sys/kern/sys_generic.c:846
#16 0xffffffff809ceab8 in syscallenter (td=) at /b/mnt/src/sys/amd64/amd64/../../kern/subr_syscall.c:187
#17 amd64_syscall (td=0xfffffea654845aa0, traced=0) at /b/mnt/src/sys/amd64/amd64/trap.c:1196
#18 fast_syscall_common () at /b/mnt/src/sys/amd64/amd64/exception.S:505

Frame 7 was at this point in if_detach_internal:

https://svnweb.freebsd.org/base/head/sys/net/if.c?revision=366230&view=markup#l1206

As you can see, a couple of lines up if_purgemaddrs() was called and freed all multicast addresses assigned to the interface, which destroyed the multicast address being added out from under in6_joingroup_locked.

I see two potential paths forward: either the wacky locking in in6_getmulti() gets fixed so that we don't have to do the "drop the lock to call a function that acquires that lock again" dance that opens up this race condition, or we fix if_addmulti so that it adds an additional reference to the address if retifma is non-NULL. The second option would be a KPI change that would have a nasty side effect of leaking the address if an existing caller wasn't fixed, but on the other hand the current interface seems pretty useless if it can't actually guarantee that the address you asked for will exist when you get around to trying to manipulate it.

Does anybody have any thoughts on this?
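
The second option (if_addmulti taking an extra reference when retifma is non-NULL) can be illustrated with a rough userspace sketch. This is not kernel code and not the real KPI: struct maddr, struct iface, iface_addmulti, iface_purge and maddr_release are all hypothetical names made up for the illustration. The interface is reduced to a single address slot, and no locking is modelled, so it only shows the reference hand-off and not how the real code would have to serialize it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a multicast address record. */
struct maddr {
	int	refcount;	/* number of holders, including the interface */
	bool	linked;		/* still attached to the interface? */
	int	group;		/* placeholder for the group address */
};

/* Hypothetical interface with a single address slot, for brevity. */
struct iface {
	struct maddr *addr;
};

/* Drop one reference; returns true if this call freed the record. */
bool
maddr_release(struct maddr *m)
{
	assert(m->refcount > 0);
	if (--m->refcount > 0)
		return (false);
	free(m);
	return (true);
}

/*
 * Add (or find) the address. The interface holds one reference.
 * If retp is non-NULL the caller is handed its own extra reference,
 * which it must later drop with maddr_release().
 */
int
iface_addmulti(struct iface *ifp, int group, struct maddr **retp)
{
	struct maddr *m = ifp->addr;

	if (m != NULL && m->group != group)
		return (-1);		/* single slot is already taken */
	if (m == NULL) {
		m = calloc(1, sizeof(*m));
		if (m == NULL)
			return (-1);
		m->refcount = 1;	/* the interface's reference */
		m->linked = true;
		m->group = group;
		ifp->addr = m;
	}
	if (retp != NULL) {
		m->refcount++;		/* the caller's reference */
		*retp = m;
	}
	return (0);
}

/* Detach path: unlink the address and drop only the interface's ref. */
void
iface_purge(struct iface *ifp)
{
	struct maddr *m = ifp->addr;

	if (m == NULL)
		return;
	ifp->addr = NULL;
	m->linked = false;
	(void)maddr_release(m);
}
```

In this model a joiner that calls iface_addmulti() with a non-NULL retp can still safely look at the record after a concurrent iface_purge(): it sees linked == false, backs out, and calls maddr_release() to drop its own reference. It also shows the downside mentioned above: a caller that passes retp but never calls maddr_release() leaks the record.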