From owner-freebsd-hackers@FreeBSD.ORG Fri Nov 9 14:03:14 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A5F4AB6E; Fri, 9 Nov 2012 14:03:14 +0000 (UTC) (envelope-from fabien.thomas@netasq.com) Received: from work.netasq.com (gwlille.netasq.com [91.212.116.1]) by mx1.freebsd.org (Postfix) with ESMTP id 39F128FC0A; Fri, 9 Nov 2012 14:03:13 +0000 (UTC) Received: from [10.2.1.1] (unknown [10.2.1.1]) by work.netasq.com (Postfix) with ESMTPSA id 364CC27052C5; Fri, 9 Nov 2012 15:03:07 +0100 (CET) Subject: Re: [patch] reducing arp locking Mime-Version: 1.0 (Apple Message framework v1283) Content-Type: text/plain; charset=iso-8859-1 From: Fabien Thomas In-Reply-To: <509CE6A2.2040200@FreeBSD.org> Date: Fri, 9 Nov 2012 15:03:06 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <8169CD67-E444-46FC-A7C8-DD6FB59091E1@netasq.com> References: <509AEDAC.10002@FreeBSD.org> <509B884F.7040106@networx.ch> <509B88B1.3070905@FreeBSD.org> <49EE4F42-6162-40F4-9DE0-1ACA1289B225@netasq.com> <509CC776.9010200@FreeBSD.org> <37E1F76F-D951-4B36-AF00-039DA1CC5CF3@netasq.com> <509CE6A2.2040200@FreeBSD.org> To: Alexander V. Chernikov X-Mailer: Apple Mail (2.1283) Cc: freebsd-net@FreeBSD.org, Andre Oppermann , freebsd-hackers@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Nov 2012 14:03:14 -0000 Le 9 nov. 2012 =E0 12:18, Alexander V. Chernikov a =E9crit : > On 09.11.2012 13:59, Fabien Thomas wrote: >>=20 >> Le 9 nov. 2012 =E0 10:05, Alexander V. Chernikov a =E9crit : >>=20 >>> On 09.11.2012 12:51, Fabien Thomas wrote: >>>>=20 >>>> Le 8 nov. 2012 =E0 11:25, Alexander V. Chernikov a =E9crit : >>>>=20 >>>>> On 08.11.2012 14:24, Andre Oppermann wrote: >>>>>> On 08.11.2012 00:24, Alexander V. Chernikov wrote: >>>>>>> Hello list! >>>>>>>=20 >>>>>>> Currently we need to acquire 2 read locks to perform simple = 6-byte >>>>>>> copying from arp record to packet >>>>>>> ethernet header. >>>>>>>=20 >>>>>>> It seems that acquiring lle lock for fast path (main traffic = flow) is >>>>>>> not necessary even with >>>>>>> current code. >>>>>>>=20 >>>>>>> My tests shows ~10% improvement with this patch applied. >>>>>>>=20 >>>>>>> If nobody objects I plan to commit this change at the end of = next week. >>>>>>=20 >>>>>> This is risky and prone to race conditions. The copy of the MAC = address >>>>>> should be done while the table read lock is held to protect = against the >>>>> It is done exactly as you say: table read lock is held. >>>>=20 >>>> How do you protect from entry update if i've a ref to the entry ? >>>> You can end up doing bcopy of a partial mac address. >>> I see no problems in copying incorrect mac address in that case: >>> if host mac address id updated, this is, most likely, another host, = and several packets being lost changes nothing. >>=20 >> Sending packet to a bogus mac address is not really nothing :) >>=20 >>>=20 >>> However, there can be some realistic scenario where this can be the = case (L2 load balancing/failover). I'll update in_arpinput() to do lle = removal/insertion in that case. >>>=20 >>>> la_preempt modification is also write access to an unlocked = structure. >>> This one changes nothing: >>> current code does this under _read_ lock. >>=20 >> Under the table lock not the entry lock ? > lle entry is read-locked while la_preempt is modified. >=20 >> Table lock is here to protect the table if I've understood the code = correctly. > Yes. >> If i get an exclusive reference to the entry you will end up reading = and writing to the entry without any lock. > Yes. And the only single drawback in worst case can be sending a bit = more packets to right (but probably expired) MAC address. Or partial copy of the new mac. >=20 > I'm talking about the following: > ARP stack is just IP -> 6 bytes mapping, there is no reason to make it = unnecessary complicated like rte, with references being held by upper = layer stack. It does not contain interface pointer, etc.. >=20 > We may need to r/w lock entry, but for 'control plane' code only. > If one acquired exclusive lock and wants to change STATIC flag to = non-static or change lle address - this is simply wrong and has to be = handled by acquiring table wlock. >=20 > Current ARP code has some flaws like handling arp expiration, but this = patch doesn't change much here.. In in_arpinput only exclusive access to the entry is taken during the = update no IF_AFDATA_LOCK that's why i was surprised. ; >=20 >>=20 >>>=20 >>>>=20 >>>>=20 >>>>>=20 >>>>>> entry going away. You can either return with table lock held and = drop >>>>>> it after the copy, or you could a modified lookup function that = takes a >>>>>> pointer for the copy destination, do the copy with the read lock, = and then >>>>>> return. If no entry is found an error is returned and obviously = no copy >>>>>> is done. >>>>>>=20 >>>>>=20 >>>>>=20 >>>>> -- >>>>> WBR, Alexander >>>>>=20 >>>>>=20 >>>>> _______________________________________________ >>>>> freebsd-hackers@freebsd.org mailing list >>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers >>>>> To unsubscribe, send any mail to = "freebsd-hackers-unsubscribe@freebsd.org" >>>>=20 >>>>=20 >>>=20 >>=20 >>=20 >=20 >=20 > --=20 > WBR, Alexander >=20 >=20