Date: Thu, 05 Dec 1996 21:35:29 -0600
From: Chris Csanady
To: Peter Wemm
cc: freebsd-smp@freebsd.org
Subject: Re: make locking more generic?
In-reply-to: Your message of Fri, 06 Dec 1996 11:02:54 +0800.
             <199612060302.LAA12276@spinner.DIALix.COM>
Message-Id: <199612060335.VAA00697@friley216.res.iastate.edu>

>Well, there are several things to consider before writing it off as
>"stupid", it's bloody quick compared to what something like STREAMS
>would be like, which is much more MP-safe.

I wouldn't call it stupid in comparison with that. :)  Just that it
could be done better..

>With the present "design", an mbuf is allocated in the network soft
>interrupt layer, and that mbuf is passed all the way up to the socket
>buffers, right through the ip and tcp layers without queueing it
>anywhere.  The netisr handler literally "runs" the upwards direction
>tcp/ip "engine".  It's pretty much the same in the other direction,
>but from memory one direction has one queueing stage, the other
>doesn't.
>
>STREAMS, on the other hand, has "modules" with an incoming and
>outgoing queue.  The network driver allocates an mblk/dblk and queues
>it in its upstream neighbor's queue.  The upstream module's service
>routine is run, which dequeues it, processes it, and enqueues it on
>the upstream side...  And so on right up to the stream head.  On the
>plus side, it's got a lot of potential for putting hooks in for
>fine-grain locking on each queue and data block, but the overheads
>are incredible.
>
>I'm not familiar with Linux's design; I think that it's a simplified
>(read: the crap has been stripped out and optimised) version of the
>mbuf cluster model, where large buffers are used and passed around.
>I do not know if they queue on entry to each logical "component" of
>the protocol stack, but I suspect not, since they are after speed.
>
>Calling it a simplified version of mbuf clusters is probably not
>going to be popular, but that's what a casual glance suggested to me.
>We have two major types of mbufs: "standard" small ones 128 bytes
>long, with about 106 bytes of data space or so, and "clusters" where
>the mbuf points to a 2K or 4K page of data.  I think the Linux model
>is like the latter, where there is either a separate header and data
>chunk, or the header is at the start of the data "page".  If this is
>the case, their system probably won't lend itself to multi-threading
>any better than ours.
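To make the comparison concrete, the two flavors Peter describes look
roughly like this.  It is only a sketch with his numbers plugged in,
not the real sys/mbuf.h:

#define MSIZE           128     /* a "standard" small mbuf */
#define MHLEN           106     /* about the data space left inside it */
#define MCLBYTES        2048    /* an external cluster page (2K here;
                                   4K on some configurations) */

struct mbuf {
        struct mbuf     *m_next;        /* next mbuf in this packet's chain */
        char            *m_data;        /* into m_dat, or into a cluster */
        int              m_len;         /* bytes of data in this mbuf */
        short            m_type;        /* header, data, control, ... */
        short            m_flags;       /* M_EXT: a cluster is attached */
        char             m_dat[MHLEN];  /* small payloads live here inline */
};

With M_EXT set, m_dat goes unused and m_data points at a separate
MCLBYTES-sized page--the cluster case.  If the Linux scheme really is
"header plus one big data block" per packet, it corresponds to that
second case only, which is why it shouldn't thread any better.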
>I suspect that we're going to be stuck with a giant "networking lock"
>to cover everything from the soft network interrupts through the mbuf
>code, through to the protocol engine and high level socket buffers.

Perhaps this would be fixed as a side effect of implementing some of
the things that Van Jacobson was talking about.  I believe the
structure he describes would lend itself fairly well to finer-grained
locking.  (I could be wrong..)  But anyway, for those who are
interested:

http://www.noao.edu/~rstevens/vanj.93sep07.txt
ftp://ftp.ee.lbl.gov/talks/vj-nkarch.ps.Z
ftp://ftp.ee.lbl.gov/talks/vj-nws93-1.ps.Z

This sounds like it would be a fun project, although currently I
don't know much about our net code.  I hope to read TCP/IP
Illustrated 1 & 2 over Christmas break though. :)

Chris

>There may be room to have a decoupling layer in between the network
>cards and the network "protocol engine" as such, and the same at the
>top end.  This may allow us to get away with running the soft net
>processing for the cards in parallel with the network "engine"
>itself.  This will require a locked queueing stage to get data from
>one to the other.
>
>Cheers,
>-Peter
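The locked queueing stage Peter mentions seems like the tractable
piece.  A minimal sketch of the idea--made-up names, and a pthread
mutex standing in for whatever in-kernel lock we would actually
use--so the soft interrupts and the protocol engine can run on
different processors and only serialize on the hand-off:

#include <pthread.h>
#include <stddef.h>

struct mbuf {
        struct mbuf     *m_nextpkt;     /* links packets on a queue */
        /* ...other fields omitted... */
};

/* One hand-off queue between the softint side and the protocol engine. */
struct pktq {
        struct mbuf     *head;
        struct mbuf     *tail;
        pthread_mutex_t  lock;
};

/* Soft interrupt side: append a packet and return immediately. */
void
pktq_enqueue(struct pktq *q, struct mbuf *m)
{
        m->m_nextpkt = NULL;
        pthread_mutex_lock(&q->lock);
        if (q->tail != NULL)
                q->tail->m_nextpkt = m;
        else
                q->head = m;
        q->tail = m;
        pthread_mutex_unlock(&q->lock);
}

/* Protocol engine side: take the next packet, or NULL if none queued. */
struct mbuf *
pktq_dequeue(struct pktq *q)
{
        struct mbuf *m;

        pthread_mutex_lock(&q->lock);
        m = q->head;
        if (m != NULL) {
                q->head = m->m_nextpkt;
                if (q->head == NULL)
                        q->tail = NULL;
        }
        pthread_mutex_unlock(&q->lock);
        return (m);
}

Only the queue itself is serialized; each side could then run under
its own coarser lock rather than one giant networking lock covering
everything.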