Date:      Fri, 21 Sep 2001 10:46:37 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Andrew Gallatin <gallatin@cs.duke.edu>
Cc:        Luigi Rizzo <luigi@info.iet.unipi.it>, hackers@freebsd.org, wpaul@freebsd.org
Subject:   Re: any reason to use m_devget in the "dc" driver ?
Message-ID:  <3BAB7CFD.D5776F77@mindspring.com>
References:  <200109210703.JAA57756@info.iet.unipi.it> <15275.15530.678683.65377@grasshopper.cs.duke.edu>

Andrew Gallatin wrote:
> I imagine that this was done to follow alignment constraints on
> non-i386 platforms where having the ip header misaligned is fatal.
> (the tulip is not capable of byte granularity DMA, so you can't
> intentionally misalign the ethernet header & end up with an aligned IP
> header)

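This is the reason: the Ethernet header is 14 bytes, so
if the frame starts on a 4-byte boundary, the IP header
that follows it lands on a 2-byte boundary.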
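
The arithmetic can be sketched in C. This is only an illustration: the macro and function names are hypothetical and are not taken from the dc driver, and it assumes the receive buffer itself starts on an aligned boundary.

```c
#include <assert.h>
#include <stddef.h>

/* dst MAC (6) + src MAC (6) + ethertype (2); name is hypothetical */
#define ETHER_HDR_LEN_SKETCH 14

/*
 * How far past an `align`-byte boundary the IP header lands when the
 * frame is placed `pad` bytes into a buffer that begins on such a
 * boundary.  Returns 0 when the IP header is aligned.
 */
static size_t
ip_header_misalignment(size_t pad, size_t align)
{
	assert(align != 0);
	return (pad + ETHER_HDR_LEN_SKETCH) % align;
}
```

With no padding and a 4-byte requirement the result is 2. A 2-byte pad brings it to 0, which is the trick the quoted text says the tulip cannot use.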


> I imagine the i386 should be made an exception. See rev 1.17 of
> sys/dev/nge/if_nge.c

I disagree with this code; the elements in the header
are referenced multiple times.  If you are doing the
checksum check, you might as well be relocating the data
at the same time.  The change I would make would be to
integrate the checksum calculation with the m_devget(),
to ensure a single pass, in the case that m_devget() must
be used to get an aligned packet payload, and the checksum
has not been offloaded to hardware.
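
A minimal userland sketch of the single-pass idea, assuming a flat byte buffer rather than an mbuf chain. The function copy_and_cksum() is hypothetical; it is not m_devget() or any existing kernel routine, and it computes the usual 16-bit one's-complement Internet checksum while copying.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy len bytes from src to dst and return the 16-bit one's-complement
 * Internet checksum of those bytes, touching each byte only once.
 * Assumes packet-sized buffers (well under 128 KB), so the 32-bit
 * accumulator cannot overflow before the final fold.
 */
static uint16_t
copy_and_cksum(uint8_t *dst, const uint8_t *src, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(dst != NULL && src != NULL);

	/* Copy and sum one big-endian 16-bit word per iteration. */
	for (i = 0; i + 1 < len; i += 2) {
		dst[i] = src[i];
		dst[i + 1] = src[i + 1];
		sum += ((uint32_t)src[i] << 8) | src[i + 1];
	}
	/* An odd trailing byte is summed as if padded with a zero byte. */
	if (i < len) {
		dst[i] = src[i];
		sum += (uint32_t)src[i] << 8;
	}
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Because the copy loop already reads every byte, the checksum costs a few extra register operations per word and no second pass over memory.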

When Bill finishes the Tigon III driver, he will find out
that it does not have the firmware problem the Tigon II
has, and that he can actually leave the checksum offload
active, and still be able to use VLANs (something that you
cannot do with the Tigon II without serious changes to
the firmware).

IMO, in the vast majority of cases, it makes sense to do
the m_devget(), even though it looks like you can do each
unaligned access to the IP and TCP header elements in 2
bus cycles instead of 1, and still come out ahead.  The
unaligned approach could be made to pay off by reducing
the references to the elements so that each is extracted
only once, at which point the cost breaks even at a 32
byte or larger payload size, and it is better to do the
unaligned references than the copy (assuming checksum
offload, rather than an opportunistic copy at checksum
calculation time).  Reordering the code to do it this way
is an ...interesting... exercise, since you have to make
some assumptions that may not be valid (e.g. that it is
an IP payload) before you get to the ipinput code.
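
One way to picture "extract each element only once" is the sketch below. It is not the FreeBSD input path: the struct and function names are hypothetical, only a handful of IPv4 fields are pulled out, and byte-wise loads stand in for whatever unaligned access the hardware permits. The version check reflects the assumption that the payload is IP, which has to be tested early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for header fields, all in host byte order. */
struct ip_fields_sketch {
	uint8_t  hdr_len;	/* header length in bytes */
	uint16_t total_len;
	uint8_t  proto;
	uint32_t src;
	uint32_t dst;
};

/* Byte-wise big-endian loads are safe at any alignment. */
static uint16_t
load_be16(const uint8_t *p)
{
	return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

static uint32_t
load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | p[3];
}

/*
 * Read each field from the (possibly misaligned) header exactly once.
 * Returns 0 on success, -1 if the buffer is too short or not IPv4.
 */
static int
extract_ip_fields(const uint8_t *ip, size_t len, struct ip_fields_sketch *out)
{
	assert(ip != NULL && out != NULL);

	if (len < 20 || (ip[0] >> 4) != 4)
		return -1;
	out->hdr_len = (uint8_t)((ip[0] & 0x0f) * 4);
	if (out->hdr_len < 20 || len < out->hdr_len)
		return -1;
	out->total_len = load_be16(ip + 2);
	out->proto = ip[9];
	out->src = load_be32(ip + 12);
	out->dst = load_be32(ip + 16);
	return 0;
}
```

Later code would then work from the extracted copy instead of going back to the misaligned header.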

In any case, I think it would be useful to turn on the
CR0 alignment-mask (AM) bit that enables unaligned access
faults on the 486 and above Intel processors, as well
(this discussion has taken place before).

-- Terry




