Date: Wed, 19 Aug 2009 22:27:00 +0600
From: Дмитрий Замураев <gigabyte.tmn@gmail.com>
To: <alexpalias-bsdnet@yahoo.com>
Cc: freebsd-net@freebsd.org
Subject: Re: em driver input errors
Message-ID: <000e01ca20e9$e19caa10$1e010a0a@in72.ru>
References: <24727.68667.qm@web56404.mail.re3.yahoo.com>
Hello Alex.

Which scheduler are you using, ULE or 4BSD?
Does your NIC share an IRQ with other hardware?
What is your HZ value? 1000?

>Thanks for the suggestion.
>From a "clean" box:
>dev.em.0.rx_int_delay: 0
>dev.em.0.tx_int_delay: 66
>dev.em.0.rx_abs_int_delay: 66
>dev.em.0.tx_abs_int_delay: 66
>I reset all the values (errors still appearing), then tried your suggestion
>(rx_int_delay=600, rx_abs_int_delay=1000). This has reduced the number of
>interrupts for em0 (from about 7200/sec to around 6500/sec). After some
>time, I started getting errors again.

Hmm, try the maximum value, 67108, and see what happens...

>But that has made me try this also:
>dev.em.0.tx_int_delay=600
>dev.em.0.tx_abs_int_delay=1000

I think it's a bad idea, but I don't know for sure, because:

>Meaning using your suggested values for tx too. Now em0 is seeing about
>1800 interrupts/second, which is way better, but after some time I saw
>errors again...
>From the output of "netstat -nI em0 -w 5":

Maybe that's a mistake; did you mean "netstat -w5 em0"?

I have a PPPoE concentrator based on an S3000AHV motherboard with a
Core2Quad 6600 and four Intel PCI-E x1 and PCI-E x4 NICs (four, to load
all the cores of the CPU).

My load:

bras1 [/usr/home/dm]# netstat -w5 em0
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
    943831     0  803741196     932221     0  766771487     0
^C
bras1 [/usr/home/dm]# netstat -w1 -Iem0
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
     24067     0   20593033      17152     0   17361755     0
^C
bras1 [/usr/home/dm]# netstat -w1 -Ilagg0
            input        (lagg0)           output
   packets  errs      bytes    packets  errs      bytes colls
     47085     0   38454150      46708     0   38128482     0
     44888     0   36087138      44714     0   35985529     0
     49607     0   40467232      49326     0   40227456     0
^C
bras1 [/usr/home/dm]# netstat -w5 -Ilagg0
            input        (lagg0)           output
   packets  errs      bytes    packets  errs      bytes colls
    230260     0  187650240     228911     0  186485136     0
    238023     0  194650670     236648     0  193471650     0
    218424     0  175576014     216860     0  174282762     0
^C

The lagg0 interface aggregates em0, em1, em2 and em3 using the LACP
protocol and communicates with a Cisco 2960G switch.
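To compare the 1-second and 5-second lagg0 samples above, each counter has to be divided by its sampling interval. A minimal sketch in Python, using only the input-side figures quoted above; the function names are hypothetical and chosen for illustration only:

```python
# Sketch: convert "netstat -w N" interval counters to per-second rates.
# The sample figures are the lagg0 input columns quoted in this message.

def per_second(count, interval_s):
    """Average per-second rate of a counter accumulated over interval_s seconds."""
    return count / interval_s

def summarize(samples, interval_s):
    """Return (packets/s, Mbit/s) for each (packets, bytes) sample."""
    rows = []
    for packets, nbytes in samples:
        pps = per_second(packets, interval_s)
        mbit = per_second(nbytes, interval_s) * 8 / 1e6
        rows.append((pps, mbit))
    return rows

# (input packets, input bytes) per reporting interval
lagg0_w1 = [(47085, 38454150), (44888, 36087138), (49607, 40467232)]
lagg0_w5 = [(230260, 187650240), (238023, 194650670), (218424, 175576014)]

for label, samples, interval in (("-w1", lagg0_w1, 1), ("-w5", lagg0_w5, 5)):
    for pps, mbit in summarize(samples, interval):
        print(f"{label}: {pps:9.1f} pkt/s  {mbit:7.1f} Mbit/s")
```

Normalised this way, the -w5 samples can be read on the same per-second scale as the -w1 samples.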
vmstat -i says:

interrupt                          total       rate
irq4: sio0                         95234          0
irq19: atapci1                   8430157          1
cpu0: timer                   1275549106        258
irq256: em0                   2329917460        472
irq257: em1                    645070135        130
irq258: em2                   3527395550        715
irq259: em3                   3923746474        795
cpu1: timer                   1275548822        258
cpu3: timer                   1275548798        258
cpu2: timer                   1275548865        258
Total                        15536850601       3149

And I don't have any problems. I think I selected good hardware.

>            input          (em0)           output
>   packets  errs      bytes    packets  errs      bytes colls
>     87267     0   50372599     106931     0   81598993     0
>     86496     0   50990332     105467     0   80064657     0
>     81726  3056   49876613      99080     0   73273640     0
>     90425     0   59172531     105299     0   77110096     0
>    120292     0   70369292     109597     0   78626248     0
>... a few minutes pass with zero errors ...
>     89646     0   56951878     111240     0   86493393     0
>     86031     0   53549721     108695     0   83592747     0
>     77760  3054   48505562      96912     0   73185576     0
>     87508     0   56116394     106094     0   79130608     0
>     89031     0   56490982     103039     0   77398567     0
>What's interesting is that I'm seeing errors in a 80k packets/5 sec (so
>around 16k packets/s) zone, but no errors at 120k packets/5sec (24kpps).

Yes, that's not normal.

>Interrupts total (as reported by systat): around 13500/second. I would
>estimate the old IRQ load at around 30000-35000/second, which doesn't seem
>too much to me, for a dual xeon machine.

I think it depends on the motherboard. What is the full hardware
specification you are using, with chip names?

>Speaking of which, I did compile the kernel with "options DEVICE_POLLING",
>but enabling polling only made the errors appear more often, and in greater
>numbers.

I don't use polling on FreeBSD 7.x; it's usable on older FreeBSD versions.

> - 1 x dual-port gigabit interface, PCI-X

I may have this card. It works unstably; I don't remember exactly what
happened, but with tcpdump I saw "truncated IP, missing XX bytes".

Good luck.
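A note on the interrupt delay sysctls discussed above: the default of 66 and the maximum of 67108 are consistent with the hardware counting delays in ticks of 1.024 microseconds, held in a 16-bit register. The sketch below is an assumption, not taken from this thread: it mirrors the rounding conversion as recalled from the em(4) driver header, and the Python function names are hypothetical. Check the driver source for your FreeBSD version before relying on it.

```python
# Sketch, under stated assumptions: the em delay registers count in units
# of 1.024 us and are 16 bits wide; the rounding below follows the driver's
# usec/tick conversion macros as recalled, not verified against any tree.

MAX_TICKS = 65535  # assumed 16-bit register limit

def usecs_to_ticks(usecs):
    """Microseconds (sysctl value) to register ticks, rounded to nearest."""
    return (1000 * usecs + 512) // 1024

def ticks_to_usecs(ticks):
    """Register ticks back to the microsecond value the sysctl would report."""
    return (1024 * ticks + 500) // 1000

# Values mentioned in this thread: the default 66, the suggested 600 and
# 1000, and the maximum 67108.
for usecs in (66, 600, 1000, 67108):
    ticks = usecs_to_ticks(usecs)
    print(f"{usecs:6d} us -> {ticks:5d} ticks -> {ticks_to_usecs(ticks):6d} us")
```

Under these assumptions, 65535 ticks maps back to 67108 microseconds, which would explain why that figure is the largest accepted value.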