From owner-freebsd-net@FreeBSD.ORG  Sun Jun 12 12:43:15 2011
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CBB12106564A
	for <freebsd-net@freebsd.org>; Sun, 12 Jun 2011 12:43:15 +0000 (UTC)
	(envelope-from luigi@onelab2.iet.unipi.it)
Received: from onelab2.iet.unipi.it (onelab2.iet.unipi.it [131.114.59.238])
	by mx1.freebsd.org (Postfix) with ESMTP id 913B78FC16
	for <freebsd-net@freebsd.org>; Sun, 12 Jun 2011 12:43:15 +0000 (UTC)
Received: by onelab2.iet.unipi.it (Postfix, from userid 275)
	id 0505B7300A; Sun, 12 Jun 2011 14:59:30 +0200 (CEST)
Date: Sun, 12 Jun 2011 14:59:29 +0200
From: Luigi Rizzo <rizzo@iet.unipi.it>
To: freebsd-net@freebsd.org
Message-ID: <20110612125929.GA76345@onelab2.iet.unipi.it>
References: <BANLkTinuOS_yZYrqZ4cmU4cim+KFHNA=hQ@mail.gmail.com>
	<alpine.BSF.2.00.1106111645010.44950@fledge.watson.org>
	<20110611181352.GA67777@onelab2.iet.unipi.it>
	<BANLkTi=VspGAjP2W9ttLHpw+cH1SESyVFQ@mail.gmail.com>
	<20110612062211.GA31301@itx>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20110612062211.GA31301@itx>
User-Agent: Mutt/1.4.2.3i
Subject: virtual NIC rings (Re: FreeBSD I/OAT (QuickData now?) driver)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 12 Jun 2011 12:43:15 -0000

Two previous discussions on direct NIC access evolved into the
question of how to share one interface among multiple clients
(including the host stack), some of which can be less trusted than
others.

I have not addressed these problems yet in netmap, because
I see them as orthogonal to the API for accessing data.
Nor do I think that the number of virtual queues supported by
the hardware is fundamental in solving the problem.

As I see it, the important thing is to understand who does:

- replication (needed for multicast, broadcast, sniffing)
  The main reason to put the NIC in charge of replication is
  that it saves contention on locks. Other than that, there are
  only disadvantages, because the PCI-e bus has a lot less capacity
  than the memory bus, and (perhaps) a CPU-assisted replication
  could be based on shared readonly copies.

- packet filtering (which packet goes where);
  some NICs already support routing of packets to queues according to
  various fields. This is nice to have, but not essential.
  I'd argue that it mostly saves lock contention, but otherwise,
  with the table sizes available in the NIC, even filtering in software
  is not prohibitively expensive.

- "access control" (is client X allowed to send/receive certain traffic).
  Do we want an untrusted client to be able to send packets with
  someone else's source MAC or IP ?  Perhaps the right answer is
  "who cares, it can happen anyways on the PC next to you", and
  this saves a per-packet decision.  On the incoming side the problem
  is easier, as programming the packet filter/replicator is done
  once in a while and not on a per-packet basis.

- memory protection.
  Right now netmap uses a single buffer area for all rings on all
  cards. While this is extremely efficient at runtime (you can move
  packets around by just shuffling buffer indexes; validity checks
  require a single comparison and perhaps a lookup; you can do
  zero-copy forwarding, or modify buffers if needed), it is clearly
  vulnerable to data corruption from misbehaving processes. It will
  be interesting to see how we can insert some protection without
  killing the option of zero-copy operation.
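
To make the memory protection point concrete, here is a minimal sketch
of zero-copy forwarding by shuffling buffer indexes, with the
single-comparison validity check. This is not netmap's actual code:
struct slot, NUM_BUFS, buf_idx_valid() and forward_zero_copy() are
hypothetical names chosen for illustration, and the pool size is an
arbitrary assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size of the single buffer pool shared by all rings. */
#define NUM_BUFS 1024u

/* Hypothetical slot descriptor: it names a buffer by index, not by pointer. */
struct slot {
	uint32_t buf_idx;	/* index into the shared buffer pool */
	uint16_t len;		/* length of the packet in that buffer */
};

/* Validity check: one comparison against the pool size. */
static bool
buf_idx_valid(uint32_t idx)
{
	return idx < NUM_BUFS;
}

/*
 * Zero-copy forward: hand the received buffer to the tx slot and give
 * the tx slot's old buffer back to the rx slot. No packet data moves.
 * Returns false, leaving both slots untouched, if either index is
 * outside the pool.
 */
static bool
forward_zero_copy(struct slot *rx, struct slot *tx)
{
	uint32_t spare;

	if (!buf_idx_valid(rx->buf_idx) || !buf_idx_valid(tx->buf_idx))
		return false;

	spare = tx->buf_idx;
	tx->buf_idx = rx->buf_idx;
	tx->len = rx->len;
	rx->buf_idx = spare;
	rx->len = 0;
	return true;
}
```

Note that the check only proves the index falls inside the pool, not
that the buffer belongs to the caller. With one pool shared by all
rings, that gap is exactly where a misbehaving process can corrupt
somebody else's data.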

Once we have addressed these issues, of course any hardware
support will be welcome.

Just to refer to the one implementation I know, netmap creates one
virtual ring for each hardware ring, plus one ring pair for the
host stack.  I think the next step would be to add some additional
field (e.g. a MAC address) to drive the filtering and replication
engines (either on-NIC, or done in software).  Then one could e.g.
say "give me all traffic for MAC X," and have the kernel decide
whether this needs partial or full traffic replication, etc.
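
A rough sketch of how such a per-ring MAC field could drive a software
filter/replicator follows. Nothing here exists in netmap: struct
vring_filter, classify() and needs_replication() are hypothetical
names, and delivering every multicast/broadcast frame to all active
rings is a simplifying assumption (a real filter would track group
membership).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN 6

/* Hypothetical per-ring filter entry: "give me all traffic for MAC X". */
struct vring_filter {
	uint8_t mac[ETHER_ADDR_LEN];
	bool in_use;
};

/* Group (multicast/broadcast) addresses have the low bit of octet 0 set. */
static bool
mac_is_group(const uint8_t *mac)
{
	return (mac[0] & 0x01) != 0;
}

/*
 * Return a bitmask of the rings that should see a frame with the given
 * destination MAC; bit i corresponds to tbl[i]. The table must have at
 * most 32 entries. Assumption: group frames go to every active ring.
 */
static uint32_t
classify(const struct vring_filter *tbl, int n, const uint8_t *dst)
{
	uint32_t mask = 0;
	int i;

	for (i = 0; i < n; i++) {
		if (!tbl[i].in_use)
			continue;
		if (mac_is_group(dst) ||
		    memcmp(tbl[i].mac, dst, ETHER_ADDR_LEN) == 0)
			mask |= (uint32_t)1 << i;
	}
	return mask;
}

/* Replication is needed only when more than one ring matches. */
static bool
needs_replication(uint32_t mask)
{
	return (mask & (mask - 1)) != 0;
}
```

With the small tables discussed above, a linear scan like this is
probably acceptable; an empty mask means no client asked for the
frame, and a mask with more than one bit set is the case where
replication (by the NIC or by the CPU) comes into play.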

cheers
luigi

http://info.iet.unipi.it/~luigi/netmap/