From owner-svn-src-stable@freebsd.org Tue Mar 6 16:39:35 2018 Return-Path: Delivered-To: svn-src-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4CC28F410FA; Tue, 6 Mar 2018 16:39:35 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from mail.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A921B7F01E; Tue, 6 Mar 2018 16:39:34 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from ralph.baldwin.cx (ralph.baldwin.cx [66.234.199.215]) by mail.baldwin.cx (Postfix) with ESMTPSA id 3D30E10AC13; Tue, 6 Mar 2018 11:39:33 -0500 (EST) From: John Baldwin To: Eitan Adler Cc: src-committers , svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: Re: svn commit: r330451 - in stable/11/sys: dev/iwm dev/otus dev/usb/wlan net80211 Date: Tue, 06 Mar 2018 08:26:43 -0800 Message-ID: <6465173.s2nWvWCLOs@ralph.baldwin.cx> User-Agent: KMail/4.14.10 (FreeBSD/11.1-STABLE; KDE/4.14.30; amd64; ; ) In-Reply-To: References: <201803050754.w257swAE001435@repo.freebsd.org> <1861296.ksaTdANMae@ralph.baldwin.cx> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (mail.baldwin.cx); Tue, 06 Mar 2018 11:39:33 -0500 (EST) X-Virus-Scanned: clamav-milter 0.99.2 at mail.baldwin.cx X-Virus-Status: Clean X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Mar 2018 16:39:35 -0000 On Monday, March 05, 2018 07:13:59 PM Eitan Adler wrote: > On 5 March 2018 at 10:08, John Baldwin wrote: > > On Monday, March 05, 2018 07:54:58 AM Eitan Adler wrote: > >> Author: eadler > >> Date: Mon Mar 5 07:54:57 2018 > >> New Revision: 330451 > >> URL: https://svnweb.freebsd.org/changeset/base/330451 > >> > >> Log: > >> MFC r306837: > >> > >> [net80211] extend the ieee80211_rx_stats struct to include more information. > > > > Have you thought about the KBI implications of this change and some of the > > other changes you've merged? > > I do have a copy of the modules from 11.1 and have loaded them at > various points in time after merging. That said, I am not perfect. > Unfortunately, my -STABLE box did not have fully functioning drivers > before these changes so its difficult to test anything beyond "loads > and does not panic.". I havn't done so today yet, but that will happen > soon. > > Ensuring these things work through code inspection is certainly > possible and I've skipped over several changes as a result. Loading a module doesn't alone doesn't actually test for breakage. That only breaks if you remove symbols. Changing the semantics of how functions work or the layout of a structure that is passed between structures is an ABI change even if it doesn't change the function signature. In this case, this change inserts new fields and changes the size of 'struct ieee80211_rx_stats'. Thus if a otus(4) or iwm(4) driver built against the prior revision (330450) is used with a kernel built with this revision (330451), the driver will pass in an 'rxs' structure in a call to ieee80211_input_mimo() that is smaller, and in the first few lines of ieeee80211_input_mimo() these statements will overflow that pointer and read stack garbage: 85 int 86 ieee80211_input_mimo(struct ieee80211_node *ni, struct mbuf *m, 87 struct ieee80211_rx_stats *rx) 88 { 89 struct ieee80211_rx_stats rxs; 90 91 if (rx) { 92 memcpy(&rxs, rx, sizeof(*rx)); 93 } else { Furthermore, the now garbage 'rxs' (since fields in *rx are in different places so even the bits of 'rxs' that isn't stack garbage will be wrong) will now be used by other routines in net80211. Your next MFC commit then causes another ABI breakage as it removes the 'rx' parameter from the function. You can't do that. If you don't know how to recognize those types of ABI breakages, then you probably shouldn't be MFC'ing changes without getting some sort of review. It also tends to be a lot harder to find these breakages in code one didn't write. The fact that there are back-to-back ABI breakages also suggests that it is much better to consolidate MFCs into larger commits because you limit the amount of compatibilty ABI shims you have to provide. I think for net80211 you need to generate a diff of all of the wireless related changes you have MFC'd to stable/11 and then review that for ABI changes and come up with a plan for how to restore the ABI and get re@'s approval. (kib@ is a pretty good resource for devising ABI shims.) I think that going forward you shouldn't MFC changes if you aren't certain about the ABI implications until you have had someone review them. Batching up changes into a single diff is also helpful since if an API changes back and forth in HEAD multiple times, collapsing them means that for stable you may only need a single compat shim rather than several. -- John Baldwin