Date: Tue, 4 Aug 2009 10:14:45 +0200
From: Invernizzi Fabrizio <fabrizio.invernizzi@telecomitalia.it>
To: Ray Kinsella <raykinsella78@gmail.com>
Cc: "freebsd-performance@freebsd.org" <freebsd-performance@freebsd.org>
Subject: RE: Test on 10GBE Intel based network card
Message-ID: <36A93B31228D3B49B691AD31652BCAE9A456967237@GRFMBX702BA020.griffon.local>
In-Reply-To: <584ec6bb0908030914m74b79dceq9af2581e1b02449a@mail.gmail.com>
References: <36A93B31228D3B49B691AD31652BCAE9A4560DF911@GRFMBX702BA020.griffon.local>
 <584ec6bb0908030819vee58480p43989b742e1b7fd2@mail.gmail.com>
 <584ec6bb0908030914m74b79dceq9af2581e1b02449a@mail.gmail.com>
Ray,

>To me it looks like interrupt coalescing is not switched on for some reason.
>Are you passing any parameters to the driver in boot.conf.

This is my loader.conf:

kern.ipc.nmbclusters=65635
kern.hz=1000
net.bpf_jitter.enable=1
# net.graph.threads=32
# if_em_load="YES"
# NETGRAPH TUNING
net.graph.maxdata=1024
kern.ipc.somaxconn=4096
net.inet.tcp.recvspace=78840
net.inet.tcp.sendspace=78840
kern.ipc.shmmax=67108864
kern.ipc.shmmni=200
kern.ipc.shmseg=128
kern.ipc.semmni=70
net.local.stream.sendspace=82320
net.local.stream.recvspace=82320
net.inet.tcp.local_slowstart_flightsize=10
net.inet.tcp.nolocaltimewait=1
net.inet.tcp.hostcache.expire=3900
kern.maxusers=512
kern.ipc.nmbclusters=32768
kern.ipc.maxsockets=81920
kern.ipc.maxsockbuf=1048576
net.inet.tcp.tcbhashsize=4096
net.inet.tcp.hostcache.hashsize=1024

>Could you retest with vmstat switched on "vmstat 3" and send us the output.
>I expect we are going to see a lot of interrupts.

Sending 535714 pps (64-byte packets):

INTRUDER-64# vmstat 3
 procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po   fr  sr da0 da1    in    sy    cs us sy  id
 0 0 0  95420K  7203M    19   0   0   0   17   0   0   0   642    66  1078  0  2  98
 0 0 0  95420K  7203M     0   0   0   0    0   0   2   0    18    65   527  0  0 100
 0 0 0  95420K  7203M     0   0   0   0    0   0   0   0    18    67   527  0  0 100
 0 0 0  95420K  7203M     0   0   0   0    0   0   0   0    17    64   525  0  0 100
 0 0 0  95420K  7203M     0   0   0   0    0   0   0   0 31526    64 31402  0 87  13
 0 0 0  95420K  7203M     0   0   0   0    0   0   0   0 36767    64 33320  0 99   1
 0 0 0  95420K  7203M   423   0   0   0  406   0   0   0 36174   384 28107  0 99   1
 0 0 0  95420K  7203M     0   0   0   0    0   0   0   0 36706    64 27043  0 99   1
 0 0 0  95420K  7203M     0   0   0   0    0   0   0   0 34006    64 13117  0 91   9
 2 0 0  95420K  7203M     0   0   0   0    0   0   0   0    17    64   550  0  1  99
 0 0 0  95420K  7203M     0   0   0   0    3   0   3   0    19    68   507  0  0 100

dev.ix.0.%desc: Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 1.7.4
dev.ix.0.%driver: ix
dev.ix.0.%location: slot=0 function=0
dev.ix.0.%pnpinfo: vendor=0x8086 device=0x10c6 subvendor=0x8086 subdevice=0xa15f class=0x020000
dev.ix.0.%parent: pci3
dev.ix.0.stats: -1
dev.ix.0.debug: -1
dev.ix.0.flow_control: 0
dev.ix.0.enable_lro: 1

Adaptive Interrupt Moderation (AIM) is enabled:

dev.ix.0.enable_aim: 1

I did not change the AIM settings, since they are quite complex to tune. I tried to reverse-engineer the AIM algorithm (see the attached picture), but I could not obtain tangible improvements by playing with these parameters. My understanding is that I should reduce low_latency, but that does not seem to work (I sketch my reading of the algorithm below, after the driver settings):

dev.ix.0.low_latency: 128
dev.ix.0.ave_latency: 400
dev.ix.0.bulk_latency: 1200

I am not sure about this one:

dev.ix.0.hdr_split: 0

nor about the meaning of:

dev.ix.0.rx_processing_limit: 100

These are the settings I am using in the ixgbe driver:

#define DEFAULT_TXD	1024
#define PERFORM_TXD	2048
#define MAX_TXD		4096
#define MIN_TXD		64

#define DEFAULT_RXD	1024
#define PERFORM_RXD	2048
#define MAX_RXD		4096
#define MIN_RXD		64

#define IXGBE_TX_CLEANUP_THRESHOLD	(adapter->num_tx_desc / 1)
#define IXGBE_TX_OP_THRESHOLD		(adapter->num_tx_desc / 4)

I saw a good performance improvement after setting IXGBE_TX_CLEANUP_THRESHOLD to the TX queue size. This is easy to understand, since it (greatly) reduces context switching in the send path by cutting down the number of times the txq function is called. (Of course, with this setting send latency increases, but that is not an issue here.)
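Here is that sketch: my current reading of the AIM classification step, written out as self-contained C. Everything in it is hypothetical except the three EITR values, which come from the sysctls above; the cutoffs and all names are mine, for illustration only. The idea is simply that light traffic gets the short interval (interrupt quickly, good latency) and heavy traffic gets the long one (aggressive coalescing):

#include <stdint.h>

/* Hypothetical byte-volume cutoffs per sampling interval.
 * These numbers are mine, for illustration only. */
#define SMALL_TRAFFIC_CUTOFF	10000
#define BULK_TRAFFIC_CUTOFF	200000

/* The three interval values exposed by the driver sysctls. */
#define LOW_LATENCY_EITR	128	/* short interval: interrupt quickly */
#define AVE_LATENCY_EITR	400
#define BULK_LATENCY_EITR	1200	/* long interval: maximum coalescing */

/* Pick an EITR interval from the bytes seen on a queue since the
 * last interrupt: the busier the queue, the longer the interval,
 * hence the fewer interrupts per second. */
static uint32_t
aim_pick_eitr(uint32_t bytes_this_interval)
{
	if (bytes_this_interval < SMALL_TRAFFIC_CUTOFF)
		return (LOW_LATENCY_EITR);
	if (bytes_this_interval > BULK_TRAFFIC_CUTOFF)
		return (BULK_LATENCY_EITR);
	return (AVE_LATENCY_EITR);
}

If that reading is correct, lowering low_latency alone only matters while a queue stays below the small-traffic cutoff, which could explain why I see no effect from it at 535714 pps.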
I can't work out the meaning of these parameters, or whether they could help:

/*
 * This parameter controls the maximum no of times the driver will loop
 * in the isr. Minimum Value = 1
 */
#define MAX_LOOP 10

/*
 * This parameter controls the duration of transmit watchdog timer.
 */
#define IXGBE_TX_TIMEOUT 5	/* set to 5 seconds */

Fabrizio
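P.S. For MAX_LOOP the only thing I can go by is the comment itself, so here is the behaviour I think it bounds, again as a hypothetical self-contained sketch (the helper names are mine, not the driver's): the interrupt handler keeps servicing the rings while they report pending work, but gives up after MAX_LOOP passes.

#include <stdbool.h>

/* Hypothetical stand-ins for the driver's RX/TX service routines;
 * in the real driver they would report whether work remains. */
static bool service_rx_ring(void) { return false; }
static bool service_tx_ring(void) { return false; }

#define MAX_LOOP 10	/* same value as in the driver header */

/* Bounded ISR service loop, as I read the MAX_LOOP comment:
 * re-poll the rings while they still have work, but never more
 * than MAX_LOOP passes, so the handler cannot spin forever. */
static void
isr_service_sketch(void)
{
	int pass;

	for (pass = 0; pass < MAX_LOOP; pass++) {
		bool more = service_rx_ring();
		if (service_tx_ring())
			more = true;
		if (!more)
			break;		/* rings drained early */
	}
	/* the real handler would re-enable interrupts here */
}

If that is what it does, raising MAX_LOOP would trade longer ISR passes for fewer interrupt invocations, which at least sounds related to the coalescing behaviour above.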