Date:      Mon, 26 Jun 2006 17:05:26 +1000
From:      Michael Vince <mv@thebeastie.org>
To:        Nikolas Britton <nikolas.britton@gmail.com>
Cc:        performance@freebsd.org, freebsd-stable@freebsd.org, Sean Bryant <bryants@gmail.com>
Subject:   Re: Gigabit ethernet very slow.
Message-ID:  <449F8736.3080508@thebeastie.org>
In-Reply-To: <ef10de9a0606251523h4102e782m1fe2403c57c80e57@mail.gmail.com>
References:  <ef10de9a0606250157jce24553h52e67db7a9f76b03@mail.gmail.com>	<f6791cc60606250835p51c966e7xa12fb241c9aaab8d@mail.gmail.com>	<ef10de9a0606250930k6b655e2bkb81694905454bf58@mail.gmail.com> <ef10de9a0606251523h4102e782m1fe2403c57c80e57@mail.gmail.com>

Nikolas Britton wrote:

> On 6/25/06, Nikolas Britton <nikolas.britton@gmail.com> wrote:
>
>> On 6/25/06, Sean Bryant <bryants@gmail.com> wrote:
>> > /dev/zero is not exactly the best way to test sending data across the
>> > network, especially since you'll be reading in 8k chunks.
>> >
>> > I could be wrong, strong possibility that I am. I only got 408mb when
>> > doing a /dev/zero test. I've managed to saturate it, though, using
>> > other software that I wrote.
>> > On 6/25/06, Nikolas Britton <nikolas.britton@gmail.com> wrote:
>> > > What's up with my computer, it's only getting 30MB/s?
>> > >
>> > > hostB: nc -4kl port > /dev/null
>> > > hostA: nc host port < /dev/zero
>> > >
>>
>> 408MByte/s or 408Mbit/s, and what measuring stick are you using? I'm
>> trying to rule in/out problems with the disks; I'm only getting
>> ~25MB/s on a 6-disk RAID0 over the network... would it be better to
>> set up a memory-backed disk, md(4), to read from?
>>
>>
>
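Re the md(4) question: a malloc-backed memory disk would take the disks
out of the picture entirely. Untested sketch (size arbitrary; mdconfig
prints the new device name, e.g. md0):

mdconfig -a -t malloc -s 512m           # create a 512 MB malloc-backed md device
dd if=/dev/md0 bs=64k | nc host 3000    # stream from RAM instead of /dev/zero
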
> Now I'm getting 523.2Mbit/s (65.4MB/s) with netcat. I wiped out the
> FreeBSD 6.1/amd64 install and replaced it with FreeBSD 6.1/i386... and...
>
> After a kernel rebuild (recompiled nc too):
> CPUTYPE?=athlon-mp
> CFLAGS+= -mtune=athlon64
> COPTFLAGS+= -mtune=athlon64
>
> I'm up to 607.2Mbit/s (75.9MB/s). What else can I do to get that
> number higher, and how can I get interrupts lower?
>
> Before recompile:
> load averages:  0.94,  0.91,  0.66
> CPU states:  2.6% user,  0.0% nice, 21.5% system, 64.6% interrupt, 
> 11.3% idle
> -------------------
> After recompile:
> load averages:  0.99,  0.96,  0.76
> CPU states:  3.0% user,  0.0% nice, 33.7% system, 58.2% interrupt,  
> 5.1% idle
>
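On getting interrupts lower: rather than recompiling, it may be worth
looking at em(4)'s interrupt-moderation knobs. I haven't verified these
on 6.1, so treat the names as assumptions and check sysctl -a | grep em
on your box first:

sysctl dev.em.0.rx_int_delay        # receive interrupt delay
sysctl dev.em.0.rx_abs_int_delay    # absolute cap on the receive delay
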
Out of interest I tried the same test with nc, but with dd in the pipe
or by watching it with pftop.

According to pftop (with modulate state rules) I am able to get about
85 MB/sec when I don't have dd running. dd does indeed eat a fair
amount of CPU (40%) on the AMD64 6-stable machine.
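
(The dd overhead is no surprise: without a bs= argument, dd copies in
512-byte blocks (the record counts below work out to exactly that), so
a larger block size should cut the CPU cost, e.g.:

cat /dev/zero | dd bs=64k | nc host 3000
)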

With a dd pipe I am able to get roughly 70 MB/sec between 2 Dell
machines, one of them being AMD64 (I ran dd on this one as it has 2
CPUs). pftop confirms this figure as well.

cat /dev/zero | dd | nc host 3000
2955297+0 records in
2955297+0 records out
1513112064 bytes transferred in 20.733547 secs (72978930 bytes/sec)
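
(The receiving end in all of these tests is just a netcat listener
discarding to /dev/null, per Nikolas' example earlier in the thread:

host# nc -4kl 3000 > /dev/null
)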

These machines are also doing regular work and are not idle.

I tested on another remote network setup as well, with three FreeBSD
machines: one client, one FreeBSD gateway, and a third server
(host-A ---- host-B ---- host-C). Host-A is the only one running
6-stable; all the others are 6.1.
None of these machines use polling, and all have em interfaces (Dell servers).


Going from C to A (via B) gives 50 MB/sec:
host-C# cat /dev/zero | dd | nc host-A 3000
15000154+0 records in
15000153+0 records out
7680078336 bytes transferred in 152.320171 secs (50420626 bytes/sec)


Between them directly they all appear to give around 55-85 MB/sec.

The shocker I found was sending data from host-A to host-C, which
appears to give only 1 MB/sec:
host-A# cat /dev/zero | dd | nc host-C 3000
40135+0 records in
40134+0 records out
20548608 bytes transferred in 19.250176 secs (1067450 bytes/sec)

Host-A to Host-B: in fact, all tests sending data from outside to
anything past Host-B's internal network interface caused a massive drop
in performance, to around 800 kB/sec:
host-A# cat /dev/zero | dd | nc host-B(internal interface ip) 3000
58041+0 records in
58040+0 records out
29716480 bytes transferred in 36.137952 secs (822307 bytes/sec)

Going from Host-A to Host-B's external interface still gives fast
results, around 60 MB/sec:
host-A# cat /dev/zero | dd | nc host-B(external interface ip) 3000
4984545+0 records in
4984544+0 records out
2552086528 bytes transferred in 40.569696 secs (62906227 bytes/sec)

Speed from host-B (the gateway) to Host-A is still OK, at around 50 MB/sec:
host-B# cat /dev/zero | dd | nc host-A 3000
8826036+0 records in
8826035+0 records out
4518929920 bytes transferred in 80.471211 secs (56155858 bytes/sec)

Connecting from the internal server to the gateway's internal IP gives
a good speed, around 70 MB/sec:
host-C# cat /dev/zero | dd | nc host-B(internal interface ip) 3000
6176688+0 records in
6176688+0 records out
3162464256 bytes transferred in 42.100412 secs (75117181 bytes/sec)

Interestingly, connecting to the external interface of the gateway from
the internal machine still gave good speeds, around 70 MB/sec:
host-C# cat /dev/zero | dd | nc host-B(external interface ip) 3000
7107351+0 records in
7107351+0 records out
3638963712 bytes transferred in 49.451670 secs (73586265 bytes/sec)

I used to run the gateway with polling but ditched it when upgrading
from 6.0 to 6.1, since the improved em driver came in with 6.1.
Would anyone have an explanation for why incoming data from Host-A,
through B, to B's interface most distant from Host-A gives such poor
performance (1 MB/sec), while going the other way seems to be fine?
Obviously it's something going on inside the FreeBSD kernel, as
interface-to-interface tests are fine.
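
If anyone wants to dig further, the obvious things to check on Host-B
(interface names here are only examples):

host-B# netstat -ibn    # per-interface error and drop counters
host-B# ifconfig em1    # media type and duplex on the internal side
host-B# pfctl -si       # pf state-table counters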

It's a Dell 1850 running 6.1-RELEASE/amd64 with pf rules enabled. The
only special kernel change is FAST_IPSEC.
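
(For reference, FAST_IPSEC in a 6.x kernel config amounts to something
like the following on top of GENERIC; my exact config may differ:

options   FAST_IPSEC
device    crypto
)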
I tested with each of these sysctls set to both 0 and 1, and they made
no difference:
net.isr.direct=1
net.inet.ip.fastforwarding=1
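
(These were toggled at runtime with sysctl(8), e.g.:

sysctl net.isr.direct=0
sysctl net.inet.ip.fastforwarding=0
)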

Mike
