From owner-svn-src-head@freebsd.org Mon May 21 13:54:51 2018 Return-Path: Delivered-To: svn-src-head@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B6966EF1A48; Mon, 21 May 2018 13:54:50 +0000 (UTC) (envelope-from jonlooney@gmail.com) Received: from mail-wm0-f52.google.com (mail-wm0-f52.google.com [74.125.82.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1F5137EA15; Mon, 21 May 2018 13:54:49 +0000 (UTC) (envelope-from jonlooney@gmail.com) Received: by mail-wm0-f52.google.com with SMTP id a67-v6so25352552wmf.3; Mon, 21 May 2018 06:54:49 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=8id7mhg7Rh873030XhOfVYKAuW/VRt2gt4DFQfv6Ew4=; b=kUWDr3RQJfk5b8XsNOyOQlFTaNc8uehCZH/es7CvaoEurfPWsrWjN1+oq4Ibzr8vXh ORfDq7wUEMkHxorqXM8GoFJQwcEyGuIi/CN0396MLUq1LqIZYMsVX9IbYh0zvT0vQ8hl 1xCxIz7whODWxz5ESoASkNb/FzhK4k2v3tBRxeO8q0bofteTaFkwpUemmxzMsARW0YyW kymLvIIyGBxxqPDnznpuDe+aXOgCdwhHmB90qr/qJBs9GE1dUu3TuUpAIm8vuP02nbya whcVFbjEVc0aGBpaDLa8hn7XHKAHGFCrbxMHlsa6Ee13uXiwFVbVtUHOXL3HIOwox33+ +zGg== X-Gm-Message-State: ALKqPwe5X68hMbauRo5dYkCjx0NPwABZivVHdZsqZzvWwHFQ/MmRSj3N l2FERFGnIFyezWf67EakVpgS2nFC X-Google-Smtp-Source: AB8JxZrZvLzz9teyu7f1yerqbOC4rwls03y24eeNH5qHDKDGY2G8G4EBe/aTwNFllof5KkbH0nx8Ew== X-Received: by 2002:a50:9ea3:: with SMTP id a32-v6mr24500798edf.78.1526910882695; Mon, 21 May 2018 06:54:42 -0700 (PDT) Received: from mail-wm0-f41.google.com (mail-wm0-f41.google.com. [74.125.82.41]) by smtp.gmail.com with ESMTPSA id c10-v6sm7265331edq.13.2018.05.21.06.54.42 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 21 May 2018 06:54:42 -0700 (PDT) Received: by mail-wm0-f41.google.com with SMTP id o78-v6so26856957wmg.0; Mon, 21 May 2018 06:54:42 -0700 (PDT) X-Received: by 2002:a1c:618b:: with SMTP id v133-v6mr10883675wmb.75.1526910882377; Mon, 21 May 2018 06:54:42 -0700 (PDT) MIME-Version: 1.0 Received: by 10.223.197.4 with HTTP; Mon, 21 May 2018 06:54:41 -0700 (PDT) In-Reply-To: <201805200438.w4K4c4BQ073120@repo.freebsd.org> References: <201805200438.w4K4c4BQ073120@repo.freebsd.org> From: "Jonathan T. Looney" Date: Mon, 21 May 2018 09:54:41 -0400 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: svn commit: r333915 - head/sys/netinet To: Matt Macy Cc: src-committers , svn-src-all@freebsd.org, svn-src-head@freebsd.org Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.26 X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 May 2018 13:54:51 -0000 Summary: I'm not sure this addresses all the subtleties of INPCB destruction. I think it would benefit from wider review. I suggest it be reverted in the meantime. On Sun, May 20, 2018 at 12:38 AM, Matt Macy wrote: > + > +/* > + * Unconditionally schedule an inpcb to be freed by decrementing its > + * reference count, which should occur only after the inpcb has been > detached > + * from its socket. If another thread holds a temporary reference > (acquired > + * using in_pcbref()) then the free is deferred until that reference is > + * released using in_pcbrele(), but the inpcb is still unlocked. Almost > all > + * work, including removal from global lists, is done in this context, > where > + * the pcbinfo lock is held. > + */ > +void > +in_pcbfree(struct inpcb *inp) > +{ > + struct inpcbinfo *pcbinfo = inp->inp_pcbinfo; > + > + KASSERT(inp->inp_socket == NULL, ("%s: inp_socket != NULL", > __func__)); > + KASSERT((inp->inp_flags2 & INP_FREED) == 0, > + ("%s: called twice for pcb %p", __func__, inp)); > + if (inp->inp_flags2 & INP_FREED) { > INP_WUNLOCK(inp); > + return; > + } > This check no longer serves its purpose; however, I believe it points to a deeper problem: INP_FREED is not set until the deferred epoch context. So, other threads may not know that another thread has already called in_pcbfree. This changes things, such as the behavior of in_pcbrele_[rw]locked(). In the worst case, if someone calls in_pcbfree() twice, the two calls to in_pcbremlists() will result in running `LIST_REMOVE(inp, inp_list)` twice, resulting in memory corruption. I haven't undertaken to audit all the code. However, the fact that in_pcbrele_[rw]locked() will no longer immediately return 1 may change logic such that this is now possible. > + > +#ifdef INVARIANTS > + if (pcbinfo == &V_tcbinfo) { > + INP_INFO_LOCK_ASSERT(pcbinfo); > + } else { > + INP_INFO_WLOCK_ASSERT(pcbinfo); > + } > +#endif > + INP_WLOCK_ASSERT(inp); > + /* Remove first from list */ > + INP_LIST_WLOCK(pcbinfo); > + inp->inp_gencnt = ++pcbinfo->ipi_gencnt; > + in_pcbremlists(inp); > in_pcbremlists() also increments inp->inp_gencnt(); therefore, it is not necessary to increment it above. > + INP_LIST_WUNLOCK(pcbinfo); > + RO_INVALIDATE_CACHE(&inp->inp_route); > + INP_WUNLOCK(inp); > + epoch_call(net_epoch_preempt, &inp->inp_epoch_ctx, > in_pcbfree_deferred); > } > > /* > > Modified: head/sys/netinet/in_pcb.h > ============================================================ > ================== > --- head/sys/netinet/in_pcb.h Sun May 20 04:32:48 2018 (r333914) > +++ head/sys/netinet/in_pcb.h Sun May 20 04:38:04 2018 (r333915) > @@ -328,6 +328,7 @@ struct inpcb { > LIST_ENTRY(inpcb) inp_list; /* (p/l) list for all PCBs for > proto */ > /* (p[w]) for list iteration */ > /* (p[r]/l) for addition/removal */ > + struct epoch_context inp_epoch_ctx; > }; > #endif /* _KERNEL */ > > >