Date: Thu, 23 Jan 1997 01:32:31 +1100 (EST)
From: proff@suburbia.net
To: hackers@freebsd.org
Subject: Re: socket lookups: a proposal (fwd)
Message-ID: <19970122143231.16798.qmail@suburbia.net>
>From owner-netdev@roxanne.nuclecu.unam.mx Wed Jan 22 10:23:26 1997
Return-Path: <owner-netdev@roxanne.nuclecu.unam.mx>
Delivered-To: proff@suburbia.net
Received: (qmail 10048 invoked from network); 22 Jan 1997 10:23:13 -0000
Received: from roxanne.nuclecu.unam.mx (132.248.29.2)
  by suburbia.net with SMTP; 22 Jan 1997 10:23:13 -0000
Received: (from root@localhost) by roxanne.nuclecu.unam.mx (8.6.12/8.6.11)
  id DAA19506 for netdev-outgoing; Wed, 22 Jan 1997 03:55:53 -0600
Received: from zwei.siemens.at (zwei.siemens.at [193.81.246.12])
  by roxanne.nuclecu.unam.mx (8.6.12/8.6.11) with ESMTP id DAA19501
  for <netdev@roxanne.nuclecu.unam.mx>; Wed, 22 Jan 1997 03:55:45 -0600
Received: from pc5829.hil.siemens.at (root@[10.1.143.100])
  by zwei.siemens.at (8.7.5/8.7.3) with ESMTP id KAA01806;
  Wed, 22 Jan 1997 10:52:46 +0100 (MET)
Received: from localhost (mingo@localhost) by pc5829.hil.siemens.at
  (8.6.12/8.6.9) with SMTP id KAA11881; Wed, 22 Jan 1997 10:52:31 +0100
Date: Wed, 22 Jan 1997 10:52:31 +0100 (MET)
From: Ingo Molnar <mingo@pc5829.hil.siemens.at>
To: netdev@roxanne.nuclecu.unam.mx, Eric Schenk <erics@dna.lth.se>
cc: Pedro Roque <roque@di.fc.ul.pt>, "David S. Miller" <davem@caip.rutgers.edu>
Subject: Re: socket lookups: a proposal
In-Reply-To: <199701212336.AAA23162@regin>
Message-ID: <Pine.LNX.3.95.970122103441.11631B-100000@pc5829.hil.siemens.at>
Sender: owner-netdev@roxanne.nuclecu.unam.mx
Precedence: bulk
Reply-To: netdev@roxanne.nuclecu.unam.mx, Ingo Molnar <mingo@pc5829.hil.siemens.at>

On Wed, 22 Jan 1997, Eric Schenk wrote:

> >I don't really know a thing on how a processor cache operates (URLs are
> >welcomed) but common sense would say that you are right on that...
> >Unless you use a slab allocator to take your memory from a contiguous
> >memory pool...
>
> Hmm. I'm not up on how strong the guarantees provided by the slab
> allocator can be. David, you want to comment on this?

My problem with the SLAB allocator in this radix tree case is that it
doesn't 'compress' cache elements. I.e. we will end up with a typical
distance between node elements (even though we have fixed slab pools of
tightly packed node structures), due to the statistical pressure of
creation and deletion of sockets.

Say, any workload with a 'changing number of sockets', where we can have
any number of sockets between, say, N and 2*N [this is arbitrary, just
an example], and say N is ... 10000.

Now, in the above workload: if we hit a temporary peak of 20000 sockets,
the SLAB cache is tightly filled. When we drop back to 10000 sockets, we
still have almost the same number of slabs allocated [because
deallocation happens randomly, and a slab can only be deallocated when
the slab is >completely< free; slabs are static in the sense that you
cannot move cache elements].

So not only the total memory usage, but also the 'cache footprint' of
any workload is dominated by the peak socket usage. [In the above
example, N sockets will most probably touch 90% of the cache lines of
the 2*N case, because elements do not fill a whole cache line and
elements were deallocated randomly.]

Thus, to overcome this problem, I propose to always choose the order of
the radix tree in such a way that a whole cache line is filled by
exactly one node. Say:

	struct radix {
		struct radix  *leaf[N];
		struct socket *sock_addr;
	};

with N chosen in such a way that:

	sizeof(sock_addr) + sizeof(struct radix *) * N = cache_line_size

This leads to a bit more memory usage in the 'many leaf' case, but IMHO
utilizes caches much better.
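For concreteness, here is a rough, compilable sketch of that sizing rule;
the 32-byte cache line is only an assumed example value (typical for the
Pentium-class boxes of the day), nothing more:

	/* Sketch only: pick the radix order so that one node fills
	 * exactly one cache line.  CACHE_LINE_SIZE is an assumption. */
	#include <stdio.h>

	#define CACHE_LINE_SIZE 32	/* assumed L1 line size, bytes */

	struct socket;			/* opaque here */

	/* sizeof(sock_addr) + sizeof(struct radix *) * N = line size */
	#define RADIX_ORDER \
		((CACHE_LINE_SIZE - sizeof(struct socket *)) / \
		 sizeof(struct radix *))

	struct radix {
		struct radix  *leaf[RADIX_ORDER];	/* child slots */
		struct socket *sock_addr;	/* socket at this node */
	};

	int main(void)
	{
		/* With 4-byte pointers this prints order 7, size 32. */
		printf("order = %u, sizeof(struct radix) = %u\n",
		       (unsigned)RADIX_ORDER,
		       (unsigned)sizeof(struct radix));
		return 0;
	}

With 4-byte pointers that gives an order-7 node of exactly 32 bytes, i.e.
one cache line per node, which is the whole point of the exercise.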
Another thing: I do not like O(log N) for EVERY packet lookup when we can
have O(1) :). There aren't that many workloads in the world; let's
identify and use them ... maybe someone will find a cool hash function.

-- mingo
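For reference, one shape such a constant-time lookup could take is
sketched below; the 4-tuple mix, the multiplier and the table size are
illustrative assumptions only, not anything proposed in this thread:

	#include <stdint.h>
	#include <stddef.h>

	#define SOCK_HASH_BITS	10
	#define SOCK_HASH_SIZE	(1u << SOCK_HASH_BITS)

	struct sock_entry {
		uint32_t saddr, daddr;		/* connection 4-tuple */
		uint16_t sport, dport;
		struct sock_entry *next;	/* collision chain    */
		/* protocol control block would hang off here */
	};

	static struct sock_entry *sock_hash[SOCK_HASH_SIZE];

	/* Fold the 4-tuple into a table index: a couple of xors, one
	 * multiply and a shift, so the per-packet cost stays constant
	 * no matter how many sockets exist. */
	static unsigned int sock_hashfn(uint32_t saddr, uint16_t sport,
					uint32_t daddr, uint16_t dport)
	{
		uint32_t h = saddr ^ daddr ^
			     ((uint32_t)sport << 16) ^ dport;

		h *= 0x9e3779b1u;	/* golden-ratio multiplier */
		return h >> (32 - SOCK_HASH_BITS);
	}

	/* Expected O(1) demultiplex: hash, then walk a short chain. */
	static struct sock_entry *sock_lookup(uint32_t saddr, uint16_t sport,
					      uint32_t daddr, uint16_t dport)
	{
		struct sock_entry *s =
			sock_hash[sock_hashfn(saddr, sport, daddr, dport)];

		for (; s != NULL; s = s->next)
			if (s->saddr == saddr && s->sport == sport &&
			    s->daddr == daddr && s->dport == dport)
				return s;
		return NULL;
	}

Whether something like this actually beats the radix tree depends entirely
on how well the mix spreads real traffic, which is the 'cool hash
function' part.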