From owner-svn-src-all@FreeBSD.ORG Thu Apr 2 13:59:05 2015 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id EE5AEAAB; Thu, 2 Apr 2015 13:59:04 +0000 (UTC) Received: from relay.mailchannels.net (nov-007-i588.relay.mailchannels.net [46.232.183.142]) by mx1.freebsd.org (Postfix) with ESMTP id CED7C8D3; Thu, 2 Apr 2015 13:58:59 +0000 (UTC) X-Sender-Id: duocircle|x-authuser|hippie Received: from smtp4.ore.mailhop.org (ip-10-33-12-218.us-west-2.compute.internal [10.33.12.218]) by relay.mailchannels.net (Postfix) with ESMTPA id DE49D48DA; Thu, 2 Apr 2015 13:58:47 +0000 (UTC) X-Sender-Id: duocircle|x-authuser|hippie Received: from smtp4.ore.mailhop.org ([TEMPUNAVAIL]. [10.83.15.107]) (using TLSv1 with cipher DHE-RSA-AES256-SHA) by 0.0.0.0:2500 (trex/5.4.8); Thu, 02 Apr 2015 13:58:48 +0000 X-MC-Relay: Neutral X-MailChannels-SenderId: duocircle|x-authuser|hippie X-MailChannels-Auth-Id: duocircle X-MC-Loop-Signature: 1427983128063:2896112750 X-MC-Ingress-Time: 1427983128062 Received: from c-73-34-117-227.hsd1.co.comcast.net ([73.34.117.227] helo=ilsoft.org) by smtp4.ore.mailhop.org with esmtpsa (TLSv1.2:DHE-RSA-AES256-GCM-SHA384:256) (Exim 4.82) (envelope-from ) id 1Ydfdn-0001nk-99; Thu, 02 Apr 2015 13:58:39 +0000 Received: from revolution.hippie.lan (revolution.hippie.lan [172.22.42.240]) by ilsoft.org (8.14.9/8.14.9) with ESMTP id t32DwTRK063916; Thu, 2 Apr 2015 07:58:29 -0600 (MDT) (envelope-from ian@freebsd.org) X-Mail-Handler: DuoCircle Outbound SMTP X-Originating-IP: 73.34.117.227 X-Report-Abuse-To: abuse@duocircle.com (see https://support.duocircle.com/support/solutions/articles/5000540958-duocircle-standard-smtp-abuse-information for abuse reporting information) X-MHO-User: U2FsdGVkX1/blXF53oipMzf/nUqSsYyG Message-ID: <1427983109.82583.115.camel@freebsd.org> Subject: Re: svn commit: r280971 - in head: contrib/ipfilter/tools share/man/man4 sys/contrib/ipfilter/netinet sys/netinet sys/netipsec sys/netpfil/pf From: Ian Lepore To: Mateusz Guzik Date: Thu, 02 Apr 2015 07:58:29 -0600 In-Reply-To: <20150402135157.GB549@dft-labs.eu> References: <201504012226.t31MQedN044443@svn.freebsd.org> <1427929676.82583.103.camel@freebsd.org> <20150402123522.GC64665@FreeBSD.org> <20150402133751.GA549@dft-labs.eu> <20150402134217.GG64665@FreeBSD.org> <20150402135157.GB549@dft-labs.eu> Content-Type: text/plain; charset="us-ascii" X-Mailer: Evolution 3.12.10 FreeBSD GNOME Team Port Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-AuthUser: hippie Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, Gleb Smirnoff , src-committers@freebsd.org X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Apr 2015 13:59:05 -0000 On Thu, 2015-04-02 at 15:51 +0200, Mateusz Guzik wrote: > On Thu, Apr 02, 2015 at 04:42:17PM +0300, Gleb Smirnoff wrote: > > On Thu, Apr 02, 2015 at 03:37:51PM +0200, Mateusz Guzik wrote: > > M> On Thu, Apr 02, 2015 at 03:35:22PM +0300, Gleb Smirnoff wrote: > > M> > On Wed, Apr 01, 2015 at 05:07:56PM -0600, Ian Lepore wrote: > > M> > I> > Author: glebius > > M> > I> > Date: Wed Apr 1 22:26:39 2015 > > M> > I> > New Revision: 280971 > > M> > I> > URL: https://svnweb.freebsd.org/changeset/base/280971 > > M> > I> > > > M> > I> > Log: > > M> > I> > o Use new function ip_fillid() in all places throughout the kernel, > > M> > I> > where we want to create a new IP datagram. > > M> > I> > o Add support for RFC6864, which allows to set IP ID for atomic IP > > M> > I> > datagrams to any value, to improve performance. The behaviour is > > M> > I> > controlled by net.inet.ip.rfc6864 sysctl knob, which is enabled by > > M> > I> > default. > > M> > I> > o In case if we generate IP ID, use counter(9) to improve performance. > > M> > I> > o Gather all code related to IP ID into ip_id.c. > > M> > I> > > > M> > I> > Differential Revision: https://reviews.freebsd.org/D2177 > > M> > I> > Reviewed by: adrian, cy, rpaulo > > M> > I> > Tested by: Emeric POUPON > > M> > I> > Sponsored by: Netflix > > M> > I> > Sponsored by: Nginx, Inc. > > M> > I> > Relnotes: yes > > M> > I> > > > M> > I> [...] > > M> > I> > +void > > M> > I> > +ip_fillid(struct ip *ip) > > M> > I> > +{ > > M> > I> > + > > M> > I> > + /* > > M> > I> > + * Per RFC6864 Section 4 > > M> > I> > + * > > M> > I> > + * o Atomic datagrams: (DF==1) && (MF==0) && (frag_offset==0) > > M> > I> > + * o Non-atomic datagrams: (DF==0) || (MF==1) || (frag_offset>0) > > M> > I> > + */ > > M> > I> > + if (V_ip_rfc6864 && (ip->ip_off & htons(IP_DF)) == htons(IP_DF)) > > M> > I> > + ip->ip_id = 0; > > M> > I> > + else if (V_ip_do_randomid) > > M> > I> > + ip->ip_id = ip_randomid(); > > M> > I> > + else { > > M> > I> > + counter_u64_add(V_ip_id, 1); > > M> > I> > + ip->ip_id = htons((*(uint64_t *)zpcpu_get(V_ip_id)) & 0xffff); > > M> > I> > + } > > M> > I> > +} > > M> > I> > + > > M> > I> > > M> > I> This is completely bogus. It's a big opacity violation (it relies on > > M> > I> what should be opaque private internal implementation details of > > M> > I> counter(9)). The fact that the counter api doesn't provide a function > > M> > I> for retrieving one cpu's counter value should be a big clue there -- the > > M> > I> fact that you know the internals doesn't make it okay to reach behind > > M> > I> the counter and grab a value like that. It may not even be safe to do > > M> > I> so on any given architecture; it certainly isn't safe on arm, and that > > M> > I> line of code above will work only by accident because you're throwing > > M> > I> way all but 16 bits. > > M> > > > M> > I though about providing that API, but since it isn't safe in general, > > M> > I decided to not do that. > > M> > > > M> > I> But even more importantly, this WILL result in multiple threads using > > M> > I> the same value at the same time... > > M> > I> > > M> > I> - Thread A on CPU 1 and thread B on CPU 2 both begin executing here at > > M> > I> the same time, and both get through counter_u64_add(). > > M> > I> - Thread A keeps running and uses CPU 1's new value, call it 27. > > M> > I> - Thread B gets prempted between counter_u64_add() and zpcpu_get(). > > M> > I> When it resumes it's now on CPU 1, so it retrieves value 27 as well. > > M> > > > M> > This was already discussed in this thread: > > M> > > > M> > https://lists.freebsd.org/pipermail/svn-src-head/2015-March/069864.html > > M> > > > M> > > M> For this particular use-case you never care what CPU you are executing > > M> on, you only want to obtain a unique number. > > M> > > M> per-cpu counters can serve this purpose no problem, just provide an > > M> operation which guarantees to return the new value of the counter it > > M> incremented. Should be easily achieved with e.g. just pinning curthread > > M> to the cpu it executes on for the duration of inc + fetch. > > > > I'd ask to pay attention to this particular email: > > > > https://lists.freebsd.org/pipermail/svn-src-head/2015-March/069966.html > > > > Just to justify probabilities, risks and countermeasures. > > > > For those, who don't believe in theory and prefers practice: > > > > https://lists.freebsd.org/pipermail/svn-src-head/2015-March/070091.html > > > > Note that Emeric was the one who observed collisions for the ip_id++ > > code, that we used before. > > > > Well in that case this at least deserves a comment in the code. Everyone > spotting that counter_u64_add + zpcpu_get will think it's a bug. > Because it IS a bug. That isn't changed by the fact that it works reliably on one platform due to what should be an opque implementation detail, and works by accident on other platforms (at least until the details of their implementations change in the future). As soon as somebody sees this code, thinks it's a good way to do things, and cut and pastes it into another venue and removes the & 0xffff, it just turns into a bug on every platform except amd64. -- Ian