From: John-Mark Gurney
To: "Bjoern A. Zeeb"
Cc: freebsd-net@freebsd.org, freebsd-current@freebsd.org
Date: Wed, 22 Jul 2020 15:15:09 -0700
Subject: Re: somewhat reproducable vimage panic
Message-ID: <20200722221509.GI4213@funkthat.com>

Bjoern A. Zeeb wrote this message on Wed, Jul 22, 2020 at 20:43 +0000:
> On 22 Jul 2020, at 19:34, John-Mark Gurney wrote:
>
> > John-Mark Gurney wrote this message on Tue, Jul 21, 2020 at 23:05 -0700:
> >> Peter Libassi wrote this message on Wed, Jul 22, 2020 at 06:54 +0200:
> >>> Is this related to
> >>>
> >>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234985
> >>> and
> >>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=238326
> >>>
> >>
> >> Definitely not 234985.. I'm using ue interfaces, and so they don't
> >> get destroyed while the jail is going away...
> >>
> >> I don't think it's 238326 either.  This is 100% reliable and it's in
> >> the IP multicast code..  It looks like in_multi isn't holding an
> >> interface or address lock waiting for things to free up...
> >
> > Did a little more poking, and it looks like the vnet is free'd before
> > the ifnet is free'd causing this problem:
> > (kgdb) print inm->inm_ifp[0].if_refcount
> > $5 = 1
> > (kgdb) print inm->inm_ifp[0].if_vnet[0]
> > $6 = {vnet_le = {le_next = 0xdeadc0dedeadc0de, le_prev = 0xdeadc0dedeadc0de},
> >   vnet_magic_n = 3735929054, vnet_ifcnt = 3735929054,
> >   vnet_sockcnt = 3735929054, vnet_state = 3735929054,
> >   vnet_data_mem = 0xdeadc0dedeadc0de, vnet_data_base = 16045693110842147038,
> >   vnet_shutdown = 222}
> >
> > So the multicast code is fine, it holds and releases a reference to
> > ifnet..
> >
> > The issue is that the reference to the ifnet doesn't involve a
> > reference to the vnet/prison.
>
> Does it need to?  The ifnet cannot go away while something holds a
> reference to it, right?

It's the other way around that's the problem.. the ifnet is holding an
invalid vnet pointer that got free'd.
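
A quick way to sanity-check that reading of the kgdb output: every field
of the vnet looks like the 0xdeadc0de fill that, as far as I know, a debug
(INVARIANTS) kernel writes over freed memory.  The sketch below is plain
Python, nothing from the kernel tree.  It rebuilds that pattern for 1-, 4-
and 8-byte fields and compares it against the values printed above.  The
field widths are an assumption for amd64 (pointers 8 bytes, u_int 4,
vnet_shutdown 1), and junk_pattern, is_junk and poisoned_fields are
made-up helper names.

```python
# Assumption: 0xdeadc0de is the 32-bit junk word a debug kernel writes
# over freed memory. The values below are copied from the kgdb output;
# the field widths are assumed for amd64, not taken from the thread.
JUNK32 = 0xDEADC0DE


def junk_pattern(width_bytes: int) -> int:
    """Value a little-endian field holds when its memory is filled with JUNK32.

    Assumes the field starts on a 4-byte boundary of the fill.
    """
    word = JUNK32.to_bytes(4, "little")
    repeats = (width_bytes + 3) // 4
    return int.from_bytes((word * repeats)[:width_bytes], "little")


def is_junk(value: int, width_bytes: int) -> bool:
    """True if the value is exactly the junk fill for a field of this width."""
    return value == junk_pattern(width_bytes)


# name -> (value printed by kgdb, assumed field width in bytes)
VNET_FIELDS = {
    "vnet_le.le_next": (0xDEADC0DEDEADC0DE, 8),
    "vnet_le.le_prev": (0xDEADC0DEDEADC0DE, 8),
    "vnet_magic_n": (3735929054, 4),
    "vnet_ifcnt": (3735929054, 4),
    "vnet_sockcnt": (3735929054, 4),
    "vnet_state": (3735929054, 4),
    "vnet_data_mem": (0xDEADC0DEDEADC0DE, 8),
    "vnet_data_base": (16045693110842147038, 8),
    "vnet_shutdown": (222, 1),
}


def poisoned_fields(fields):
    """Names of the fields whose value is the junk fill."""
    return [name for name, (value, width) in fields.items()
            if is_junk(value, width)]


if __name__ == "__main__":
    hits = poisoned_fields(VNET_FIELDS)
    print(f"{len(hits)} of {len(VNET_FIELDS)} vnet fields match the junk pattern")
    # if_refcount was printed as 1, i.e. the ifnet itself still looks live.
    print("if_refcount looks freed:", is_junk(1, 4))
```

All of the vnet fields match the fill, while if_refcount = 1 on the ifnet
does not, which is consistent with a live ifnet pointing at a freed vnet.
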
Maybe the problem isn't the tear down, but that the vnet pointer isn't
changed/restored before the free?

> Sounds more like the teardown order is wrong (again)?
>
> There should be no more multicast when IP etc. is gone.  That means MC
> doesn't properly cleanup itself.

Don't know, just know that it's easy to trigger right now...  I haven't
tested on prior releases, but if you'd like me to, it isn't too hard for
me to test...

> I guess I should go back now and re-read your original problem statement
> on how you trigger this..

So, it's pretty easy to trigger: attach a couple of USB ethernet adapters
(in my case they were ure, but likely any two spare ethernet interfaces
will work) and wire them back to back.  Run the script attached earlier
in the thread, providing it the names of the two interfaces as arguments,
and run it a few times.  You might get failures or not; it shouldn't
matter.  After a few runs, it'll panic...

I just tested this (to make sure my ure changes weren't causing additional
problems) using
FreeBSD-13.0-CURRENT-amd64-20200625-r362596-memstick.img.xz, so it's
reproducible on a stock image.

Thanks for looking into this!

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."