Message-ID: <57C1C4D3.8060604@gmail.com>
Date: Sat, 27 Aug 2016 21:20:27 +0430
From: Hooman Fazaeli
To: "Alexander V. Chernikov"
CC: FreeBSD Net, freebsd-hackers@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: projects/routing announcement/status
In-Reply-To: <6151261453419663@web14j.yandex.ru>

On 2016-01-22 03:11, Alexander V. Chernikov wrote:
> I would like to introduce the routing rework which started as the projects/routing SVN branch.
> It has been around for quite a long time and some of the code has made its way to HEAD, but there haven't been any public announcements.
>
> So, what is projects/routing about?
>
> First, it is about bringing more scalability by solving the most annoying problems on the packet output path.
> To be more specific, it eliminates 2 out of 4 locks, converts the other 2 to rmlock(9) and adds infrastructure to reduce locking to a single rmlock for certain traffic types.
> With these changes, the OS is able to forward 12 Mpps on a 16-core box for both IPv4 and IPv6, which is 6-10 times better than stock HEAD.
>
> Second, it eases hacking by avoiding direct access to route/lltable internals and providing a higher-level API instead.
>
> Third, it is about bringing advanced features like route multipath, and even more speed by adding a modular lookup API that permits different route lookup algorithms depending on the server role.
>
> Description with graphs and links is available at: http://wiki.freebsd.org/ProjectsRoutingProposal
> The API used is described in http://wiki.freebsd.org/ProjectsRoutingProposal/API
> Current status is available at http://wiki.freebsd.org/ProjectsRoutingProposal/ConversionStatus
>
> It is probably much more convenient to read the project details on the wiki; however, I’ll try to summarise the most important things here (wiki readers can skip to the end).
>
> A typical packet processing path (forwarding for a router, or output for a web server) consists of:
>
> - doing a routing lookup (radix read rwlock + routing entry (rte) mutex lock)
> - (optionally) interface address (ifa) atomic refcount acquire/release
> - doing a link-level entry (lle, llentry) lookup (afdata read rwlock + llentry read (or write) lock)
>
> The most annoying one is the rtentry mutex. The only goal of this mutex is to provide rtentry refcounting so that consumer code can use the rtentry without the risk of it being deleted.
> We solve this by saving all needed data into an optimised on-stack structure instead of refcounting.
> Additionally, we are trying to pre-calculate the data we need to pass by using special next-hop structures instead of route entries.
> Several different functions for retrieving routing data are provided; they differ in the information returned and in relative overhead.
> Most of the consumers have already been switched to the new KPI. The actual output/forward paths are not converted yet.
>
> It should be noted that, since individual rtentries are not returned, it is not possible to do per-ifa output packet accounting (can be observed in netstat -s).
>
> The route table lock is switched to an ipfw-like dual-locking mode (read rmlock for the data path, rwlock for config changes, route export, etc.).
> The reasons for having the rwlock are to 1) provide serialization for things in the control plane not directly used by the data path and 2) avoid acquiring contested/sleeping locks for rmlock. See projects/routing r287078 for an example.
>
> Lltable entry locks were eliminated in r291853 and r292155.
>
> The lltable lock is also planned to be converted to the dual-locking model, with similar reasoning.
> However, instead of (ab)using the AFDATA lock, it needs to be converted to a per-lltable set of locks.
>
> Open problems:
> SCTP and flowtable reference rtentries directly. It is not possible to convert the ip[6]_output() path without dealing with that.
>
> Brief merge plan:
> - Discuss/merge new routing KPI for data path
> - Discuss/merge lltable dual-lock (WIP)
> - Discuss/merge explicit nexthop changes
> - Discuss/merge IPv4/IPv6 output path (along with converted sctp/flowtable)
> - Discuss/merge route table dual-lock
>
> Current outstanding reviews (I encourage you to take a look at these):
>
> - D5009 (IPv4 fast forwarding conversion)
> - D5010 (IPv6 forwarding conversion)
> - D4794 (Deal with per-ifa output counters)
> - D4962 (new LLE lookup functions, no sockaddrs in lltable data path)
> - D4751 (move all lltable code to separate files)

First, thanks for the effort. I personally very much appreciate any improvements made to the network-related stuff.

Second, have you considered replacing the existing radix tree with a faster data structure, especially Luigi's DXR tables?
(http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf)

I apologize if the question is not very relevant to your work.

--
Best regards
Hooman Fazaeli
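
For readers unfamiliar with the idea behind the question above, the following is a rough C sketch of a two-stage IPv4 lookup loosely in the spirit of DXR: a direct table indexed by the top 16 address bits, followed by a binary search over sorted range fragments for the low 16 bits. It is a simplified illustration under my own assumptions, not the layout or code from the paper; `range_frag`, `direct_entry`, `dxr_table` and `dxr_lookup` are hypothetical names, and building the tables from a routing table is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIRECT_BITS 16
#define DIRECT_SIZE (1u << DIRECT_BITS)

/* One fragment covers the low-16-bit interval from 'start' up to the
 * next fragment's start (or 0xFFFF for the last fragment). */
struct range_frag {
    uint16_t start;
    uint16_t nexthop;
};

/* One slot per value of the top 16 address bits. */
struct direct_entry {
    uint32_t base;     /* index of the first fragment in 'ranges' */
    uint16_t count;    /* number of fragments; 0 means 'nexthop' is final */
    uint16_t nexthop;  /* used only when count == 0 */
};

struct dxr_table {
    struct direct_entry direct[DIRECT_SIZE];
    const struct range_frag *ranges;
};

/* Returns the next-hop index for 'addr' (host byte order).
 * Precondition: for any slot with count > 0, its fragments are sorted
 * by 'start' and the first fragment has start == 0. */
uint16_t dxr_lookup(const struct dxr_table *t, uint32_t addr)
{
    const struct direct_entry *d = &t->direct[addr >> DIRECT_BITS];

    if (d->count == 0)
        return d->nexthop;           /* resolved by the direct table alone */

    uint16_t low = (uint16_t)(addr & 0xFFFFu);
    const struct range_frag *r = t->ranges + d->base;
    size_t lo = 0, hi = (size_t)d->count - 1;

    assert(r[0].start == 0);
    /* Binary search for the last fragment with start <= low. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;
        if (r[mid].start <= low)
            lo = mid;
        else
            hi = mid - 1;
    }
    return r[lo].nexthop;
}
```

With this layout the direct table takes about 512 KiB, and a lookup is one array index plus a short binary search only for the slots that contain prefixes longer than /16. Whether that beats the radix tree in a given setup is exactly the kind of thing the modular lookup API mentioned in the announcement would let one measure.
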