Date: Tue, 9 Oct 2007 19:38:24 +0200
From: "Petr Holub" <hopet@ics.muni.cz>
To: <performance@freebsd.org>
Cc: rdivacky@freebsd.org
Subject: Myrinet 10GE performance on 7.0-CURRENT
Message-ID: <036c01c80a9b$3145b640$5317fb93@KLOBOUCEK>

Dear all,

I've performed an initial set of experiments with FreeBSD 7.0-CURRENT
(built on Oct 8th) with Myrinet 10GE cards. The kernel is based on GENERIC
with the following options disabled:

#options INVARIANTS
#options INVARIANT_SUPPORT
#options WITNESS
#options WITNESS_SKIPSPIN

and SCHED_ULE instead of SCHED_4BSD. Userland is built using the production
malloc.c (MALLOC_PRODUCTION defined in lib/libc/stdlib/malloc.c).

dmesg output is available at the end of the email (basically, 2x dual-core
Intel Xeon 5160 @ 3.00 GHz, identical machines for both sending and
receiving, running identical systems). The two machines are connected
point to point using LR XFPs and about 4 m of fiber.

The following tunables have been set on both sender and receiver:

net.inet.tcp.sendspace: 8388608
net.inet.tcp.recvspace: 8388608
net.inet.udp.recvspace: 8388608
net.inet.raw.recvspace: 8388608
kern.ipc.maxsockbuf: 10000000

sender:

[root@synchro-brno ~]# iperf -c 192.168.1.1 -u -l 8500 -i 1 -t 15 -b 9G -w 2M
------------------------------------------------------------
Client connecting to 192.168.1.1, UDP port 5001
Sending 8500 byte datagrams
UDP buffer size: 2.00 MByte
------------------------------------------------------------
[  3] local 192.168.1.2 port 55844 connected with 192.168.1.1 port 5001
[  3]  0.0- 1.0 sec  1.07 GBytes  9.21 Gbits/sec
[  3]  1.0- 2.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3]  2.0- 3.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3]  3.0- 4.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3]  4.0- 5.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3]  5.0- 6.0 sec  1.07 GBytes  9.21 Gbits/sec
[  3]  6.0- 7.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3]  7.0- 8.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3]  8.0- 9.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3]  9.0-10.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3] 10.0-11.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3] 11.0-12.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3] 12.0-13.0 sec  1.07 GBytes  9.20 Gbits/sec
[  3] 13.0-14.0 sec  1.07 GBytes  9.21 Gbits/sec
[  3]  0.0-15.0 sec  16.1 GBytes  9.20 Gbits/sec
[  3] Sent 2030369 datagrams
[  3] Server Report:
[  3]  0.0-15.0 sec  16.1 GBytes  9.20 Gbits/sec  0.002 ms 1655/2030369 (0.082%)

receiver:

[root@synchro-plzen ~]# iperf -s -u -l 8500 -i 1
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 8500 byte datagrams
UDP buffer size: 8.00 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.1 port 5001 connected with 192.168.1.2 port 55844
[  3]  0.0- 1.0 sec  1.07 GBytes  9.21 Gbits/sec  0.004 ms    0/135463 (0%)
[  3]  1.0- 2.0 sec  1.07 GBytes  9.20 Gbits/sec  0.003 ms    0/135343 (0%)
[  3]  2.0- 3.0 sec  1.07 GBytes  9.20 Gbits/sec  0.002 ms    0/135363 (0%)
[  3]  3.0- 4.0 sec  1.07 GBytes  9.21 Gbits/sec  0.002 ms    0/135368 (0%)
[  3]  4.0- 5.0 sec  1.07 GBytes  9.20 Gbits/sec  0.003 ms    0/135337 (0%)
[  3]  5.0- 6.0 sec  1.07 GBytes  9.21 Gbits/sec  0.002 ms    0/135374 (0%)
[  3]  6.0- 7.0 sec  1.07 GBytes  9.20 Gbits/sec  0.002 ms    0/135336 (0%)
[  3]  7.0- 8.0 sec  1.07 GBytes  9.20 Gbits/sec  0.002 ms    0/135355 (0%)
[  3]  8.0- 9.0 sec  1.07 GBytes  9.20 Gbits/sec  0.002 ms    0/135306 (0%)
[  3]  9.0-10.0 sec  1.07 GBytes  9.20 Gbits/sec  0.002 ms    0/135355 (0%)
[  3] 10.0-11.0 sec  1.07 GBytes  9.20 Gbits/sec  0.003 ms    0/135329 (0%)
[  3] 11.0-12.0 sec  1.06 GBytes  9.09 Gbits/sec  0.003 ms 1655/135337 (1.2%)
[  3] 12.0-13.0 sec  1.07 GBytes  9.20 Gbits/sec  0.002 ms    0/135344 (0%)
[  3] 13.0-14.0 sec  1.07 GBytes  9.21 Gbits/sec  0.004 ms    0/135397 (0%)
[  3]  0.0-15.0 sec  16.1 GBytes  9.20 Gbits/sec  0.002 ms 1655/2030369 (0.082%)

CPU-wise, iperf takes 200% WCPU; about 36% is system time, 14% user time,
1.5% interrupt and 48.6% idle.

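The reported rates can be cross-checked against the datagram counts in the iperf output. Below is a minimal sketch, assuming that iperf counts only the UDP payload (8500 bytes per datagram, from -l 8500) and reports Gbits/sec in decimal units (10^9 bits); the function names are illustrative and not part of iperf.

```python
# Cross-check of iperf's reported rate and loss from datagram counts.
# Assumption: only the UDP payload is counted, and 1 Gbit = 10**9 bits.

PAYLOAD_BYTES = 8500  # datagram size given by "-l 8500"


def gbits_per_sec(datagrams, seconds, payload_bytes=PAYLOAD_BYTES):
    """Payload throughput in Gbit/s for a number of delivered datagrams."""
    return datagrams * payload_bytes * 8 / seconds / 1e9


def loss_percent(lost, total):
    """Share of datagrams lost, in percent."""
    return 100.0 * lost / total


if __name__ == "__main__":
    # A loss-free one-second interval from the receiver log (2.0-3.0 sec).
    print(f"{gbits_per_sec(135363, 1.0):.2f} Gbit/s")
    # The 11.0-12.0 sec interval, where 1655 of 135337 datagrams were lost.
    print(f"{gbits_per_sec(135337 - 1655, 1.0):.2f} Gbit/s")
    # Whole 15-second run.
    print(f"{gbits_per_sec(2030369, 15.0):.2f} Gbit/s")
    print(f"{loss_percent(1655, 2030369):.3f} % lost")
```

With the receiver's figures this gives about 9.20 Gbit/s for a loss-free interval, 9.09 Gbit/s for the interval with losses, and 0.082% loss over the whole run, which agrees with the server report.
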
Sometimes I observe that, after some time, performance drops from >9 Gbps
to about 8.7 Gbps, as shown below:

[root@synchro-brno ~]# iperf -c 192.168.1.1 -u -l 8500 -i 1 -t 60 -b 9900M -w 2M
------------------------------------------------------------
Client connecting to 192.168.1.1, UDP port 5001
Sending 8500 byte datagrams
UDP buffer size: 2.00 MByte
------------------------------------------------------------
[  3] local 192.168.1.2 port 60761 connected with 192.168.1.1 port 5001
[  3]  0.0- 1.0 sec  1.12 GBytes  9.64 Gbits/sec
[  3]  1.0- 2.0 sec  1.12 GBytes  9.63 Gbits/sec
[  3]  2.0- 3.0 sec  1.12 GBytes  9.63 Gbits/sec
[  3]  3.0- 4.0 sec  1.12 GBytes  9.63 Gbits/sec
[  3]  4.0- 5.0 sec  1.12 GBytes  9.63 Gbits/sec
[  3]  5.0- 6.0 sec  1.12 GBytes  9.64 Gbits/sec
[  3]  6.0- 7.0 sec  1.12 GBytes  9.64 Gbits/sec
[  3]  7.0- 8.0 sec  1.12 GBytes  9.64 Gbits/sec
[  3]  8.0- 9.0 sec  1.12 GBytes  9.63 Gbits/sec
[  3]  9.0-10.0 sec  1.12 GBytes  9.60 Gbits/sec
[  3] 10.0-11.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 11.0-12.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 12.0-13.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 13.0-14.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 14.0-15.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 15.0-16.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 16.0-17.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 17.0-18.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 18.0-19.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 19.0-20.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 20.0-21.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3] 21.0-22.0 sec  1.01 GBytes  8.71 Gbits/sec
[  3]  0.0-22.9 sec  24.3 GBytes  9.11 Gbits/sec
[  3] Sent 3063929 datagrams
[  3] Server Report:
[  3]  0.0-22.9 sec  24.3 GBytes  9.11 Gbits/sec  0.003 ms    0/3063928 (0%)
[  3]  0.0-22.9 sec  1 datagrams received out-of-order

Sometimes I can also get very close to wirespeed, 9.90 Gbps, while
systat -ifstat 1 reports about 9.97 Gbps on both sender and receiver
(without any packet loss!). However, as shown above, this is not stable;
over longer periods it seems to fluctuate between 8.7, 9.6, and 9.9 Gbps
(e.g. it can run at each speed for a couple of tens of seconds and then the
performance changes either upwards or downwards). As shown above, there
are also sometimes random packet losses.

BTW, with WITNESS and INVARIANTS enabled, I can do about 2.8 Gbps, and
iperf eats about 200% WCPU while about 50% of the time is spent in system.

I will do more testing tomorrow. If you have any ideas for further tuning
and experiments, let me know.

Petr

==================== Machine info follows ===================================

[root@synchro-brno ~]# dmesg
Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.0-CURRENT #1: Tue Oct 9 16:59:21 CEST 2007
    root@:/usr/obj/usr/src/sys/GENERIC
WARNING: WITNESS option enabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU 5160 @ 3.00GHz (3000.12-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x6f6  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4e3bd<SSE3,RSVD2,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  Cores per package: 2
usable memory = 4281200640 (4082 MB)
avail memory  = 4119842816 (3928 MB)
ACPI APIC Table: <PTLTD APIC >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 6
 cpu3 (AP): APIC ID: 7
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: <PTLTD RSDT> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: <ACPI CPU> on acpi0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
cpu1: <ACPI CPU> on acpi0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
cpu2: <ACPI CPU> on acpi0
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
cpu3: <ACPI CPU> on acpi0
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci2
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> irq 18 at device 2.0 on pci2
pci4: <ACPI PCI bus> on pcib4
em0: <Intel(R) PRO/1000 Network Connection Version - 6.5.3> port 0x2000-0x201f mem 0xd9200000-0xd921ffff irq 18 at device 0.0 on pci4
em0: Ethernet address: 00:30:48:33:86:5e
em0: [FILTER]
em1: <Intel(R) PRO/1000 Network Connection Version - 6.5.3> port 0x2020-0x203f mem 0xd9220000-0xd923ffff irq 19 at device 0.1 on pci4
em1: Ethernet address: 00:30:48:33:86:5f
em1: [FILTER]
pcib5: <ACPI PCI-PCI bridge> at device 0.3 on pci1
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci6: <ACPI PCI bus> on pcib6
pci6: <network, ethernet> at device 0.0 (no driver attached)
pcib7: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci7: <ACPI PCI bus> on pcib7
pci0: <base peripheral> at device 8.0 (no driver attached)
uhci0: <UHCI (generic) USB controller> port 0x1800-0x181f irq 17 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
uhci0: [ITHREAD]
usb0: <UHCI (generic) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb0
uhub0: 2 ports with 2 removable, self powered
uhci1: <UHCI (generic) USB controller> port 0x1820-0x183f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
uhci1: [ITHREAD]
usb1: <UHCI (generic) USB controller> on uhci1
usb1: USB revision 1.0
uhub1: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb1
uhub1: 2 ports with 2 removable, self powered
uhci2: <UHCI (generic) USB controller> port 0x1840-0x185f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
uhci2: [ITHREAD]
usb2: <UHCI (generic) USB controller> on uhci2
usb2: USB revision 1.0
uhub2: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb2
uhub2: 2 ports with 2 removable, self powered
uhci3: <UHCI (generic) USB controller> port 0x1860-0x187f irq 16 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
uhci3: [ITHREAD]
usb3: <UHCI (generic) USB controller> on uhci3
usb3: USB revision 1.0
uhub3: <Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1> on usb3
uhub3: 2 ports with 2 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xd9600400-0xd96007ff irq 17 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
ehci0: [ITHREAD]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: <EHCI (generic) USB 2.0 controller> on ehci0
usb4: USB revision 2.0
uhub4: <Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1> on usb4
uhub4: 8 ports with 8 removable, self powered
pcib8: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci8: <ACPI PCI bus> on pcib8
vgapci0: <VGA-compatible display> port 0x3000-0x30ff mem 0xd0000000-0xd7ffffff,0xd9300000-0xd930ffff irq 18 at device 1.0 on pci8
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel 63XXESB2 SATA300 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1890-0x189f at device 31.2 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio0: [FILTER]
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
sio1: [FILTER]
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
orm0: <ISA Option ROMs> at iomem 0xc0000-0xcafff,0xcb000-0xd2fff on isa0
ppc0: cannot reserve I/O port range
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
ad0: 239372MB <WDC WD2500YS-01SHB1 20.06C06> at ata0-master SATA150
ad1: 239372MB <WDC WD2500YS-01SHB1 20.06C06> at ata0-slave SATA150
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
WARNING: WITNESS option enabled, expect reduced performance.
Trying to mount root from ufs:/dev/ad1s1a
mxge0: <Myri10G-PCIE-8A> mem 0xd8000000-0xd8ffffff,0xd9000000-0xd90fffff irq 16 at device 0.0 on pci6
mxge0: [ITHREAD]
mxge0: Ethernet address: 00:60:dd:47:6b:f3
mxge0: link state changed to UP

[root@synchro-brno ~]# kldstat
Id Refs Address            Size     Name
 1    8 0xffffffff80100000 b1af40   kernel
 2    1 0xffffffffb09d3000 88aa     if_mxge.ko
 3    1 0xffffffffb09dc000 a472     zlib.ko
 4    1 0xffffffffb19eb000 ca52     mxge_ethp_z8e.ko
 5    1 0xffffffffb19f9000 c8fd     mxge_eth_z8e.ko

[root@synchro-brno ~]# ifconfig mxge0
mxge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=1bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4>
        ether 00:60:dd:47:6b:f3
        inet 192.168.1.2 netmask 0xffffff00 broadcast 192.168.1.255
        media: Ethernet 10Gbase-LR (autoselect <full-duplex>)
        status: active
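
With the mtu 9000 shown by ifconfig, each 8500-byte datagram fits in a single Ethernet frame, and the per-frame overhead bounds the UDP payload rate reachable on a 10 Gbit/s link. A rough sketch of that arithmetic follows, assuming IPv4 without options, no VLAN tag, and the standard Ethernet preamble, header, FCS and minimum inter-frame gap. The constants and function names are illustrative and are not taken from the measurements.

```python
# Per-frame overhead for one UDP datagram on Ethernet (all sizes in bytes).
# Assumptions: IPv4 header without options, untagged frames,
# standard preamble/SFD and minimum inter-frame gap.
UDP_HEADER = 8
IPV4_HEADER = 20
ETH_HEADER = 14
ETH_FCS = 4
PREAMBLE_SFD = 8
INTERFRAME_GAP = 12

LINE_RATE_BPS = 10e9  # nominal 10 Gigabit Ethernet data rate


def fits_in_mtu(payload, mtu):
    """True if the datagram needs no IP fragmentation at this MTU."""
    return payload + UDP_HEADER + IPV4_HEADER <= mtu


def max_payload_gbps(payload):
    """Upper bound on UDP payload throughput in Gbit/s at line rate."""
    wire_bytes = (payload + UDP_HEADER + IPV4_HEADER + ETH_HEADER
                  + ETH_FCS + PREAMBLE_SFD + INTERFRAME_GAP)
    return LINE_RATE_BPS * payload / wire_bytes / 1e9


if __name__ == "__main__":
    print(fits_in_mtu(8500, 9000))            # datagram size vs. mxge0 MTU
    print(f"{max_payload_gbps(8500):.2f} Gbit/s")
```

Under these assumptions the ceiling for 8500-byte datagrams is roughly 9.92 Gbit/s of UDP payload, so the 9.90 Gbps figure quoted above is close to the limit of the link.
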