Date: Wed, 16 Apr 2008 12:29:19 -0400
From: "Alexander Sack" <pisymbol@gmail.com>
To: freebsd-drivers@freebsd.org
Cc: freebsd-net@freebsd.org, freebsd-hackers@freebsd.org
Subject: bge dropping packets issue
Message-ID: <3c0b01820804160929i76cc04fdy975929e2a04c0368@mail.gmail.com>

Hello:

Sorry for cross-posting, but this seems to be both a driver and a network/kernel issue, so I figured all three lists were appropriate.

I'm investigating an issue we are seeing with 6.1-RELEASE where the bge driver drops packets sporadically at 100 Mbps. The machine is a 2-way Intel dual-core running 64-bit FreeBSD 6.1-RELEASE with SMP and 8GB RAM. I would post dmesg, but I'm currently running a test and it has a lot of instrumentation output in it.

Anyway, what I'm seeing with a SmartBit traffic generator connected to 4 bge cards (all BCOM_DEVICEID_BCM5704C) is sporadic packet drops as recorded by the firmware in its statistics structure (as pulled out by bge_tick()), i.e. this isn't malloc starvation when allocating mbuf clusters, etc. The firmware seems to just drop packets occasionally (depending on workload). It is mainly aggravated when heavy disk I/O occurs while generating a network report, which entails gzip'ing a very large dumpfile in /tmp and then sending it by anonymous FTP via another interface (em).

DEVICE_POLLING is being used:

  # sysctl -a | grep kern.polling
  kern.polling.idlepoll_sleeping: 1
  kern.polling.stalled: 3
  kern.polling.suspect: 1023
  kern.polling.phase: 0
  kern.polling.enable: 1
  kern.polling.handlers: 6
  kern.polling.residual_burst: 0
  kern.polling.pending_polls: 0
  kern.polling.lost_polls: 24436
  kern.polling.short_ticks: 592
  kern.polling.reg_frac: 20
  kern.polling.user_frac: 50
  kern.polling.idle_poll: 0
  kern.polling.each_burst: 32
  kern.polling.burst_max: 1000
  kern.polling.burst: 1000

After looking at the driver for a bit, I believe the issue may be RX chain starvation, which causes the firmware to drop packets when bge_rxeof() cannot keep up.
The driver uses a global locking scheme which may contribute to some of these robustness issues. (This is a generalization on my part without hard facts, so take it with a grain of salt. I just notice things like bge_tick() being called every cycle and competing with the ISR when it may not have to, or may not have to for its entire duration, updating stats for example.)

My main question concerns the RX chain slots, which are currently set to 256 by a global define, BGE_SSLOTS (if_bgedevreg.h). Technically I believe this card can do up to 512 slots, and the comment above the define says these values are tunable, yet they are not exposed via sysctl.

Does anyone know why it's not 512 by default? Is there any harm in setting it to 512 instead of 256? Why not make it tunable (with 512 as the max)?

I've increased BGE_SSLOTS to 512 so there are more RX chain slots available (as I currently understand it; I don't have specs) and raised kern.polling.each_burst to 150 (the max) in an effort to keep the bge driver in bge_rxeof() longer, and to experiment a bit!

This is my first exposure to this code, so be gentle! :D

Has anyone seen this problem before with bge? Am I barking up the wrong tree with my initial investigation? Does anyone know if it's even possible to achieve 100% packet capture with bge at its supported speeds (10/100/1000)?

Thanks!

-aps
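
For a rough sense of how much slack a larger RX ring buys, the arithmetic can be sketched in a few lines of C. This is a back-of-the-envelope model, not driver code: the function names are hypothetical, it assumes worst-case minimum-size 64-byte Ethernet frames (plus the standard 20 bytes of preamble and inter-frame gap on the wire), it assumes the ring starts empty and nothing drains it during the stall, and the HZ value passed to frames_per_tick() is an assumption since the message does not state it.

```c
#include <stdint.h>

/* On-the-wire overhead per Ethernet frame: 8-byte preamble/SFD
 * plus 12-byte inter-frame gap. */
#define WIRE_OVERHEAD_BYTES 20u

/* Frames per second at line rate for a given frame size
 * (frame_bytes includes the FCS, e.g. 64 for a minimum-size frame). */
uint64_t line_rate_pps(uint64_t link_bps, uint32_t frame_bytes)
{
    uint64_t bits_per_frame = (uint64_t)(frame_bytes + WIRE_OVERHEAD_BYTES) * 8u;
    return link_bps / bits_per_frame;
}

/* Frames arriving per clock tick at line rate, for a given HZ.
 * Useful for comparing against a per-poll burst limit. */
uint64_t frames_per_tick(uint64_t link_bps, uint32_t frame_bytes, uint32_t hz)
{
    return line_rate_pps(link_bps, frame_bytes) / hz;
}

/* Microseconds until an initially empty ring of `slots` entries fills
 * at line rate if the consumer is completely stalled. */
uint64_t ring_headroom_us(uint32_t slots, uint64_t link_bps, uint32_t frame_bytes)
{
    uint64_t bits_per_frame = (uint64_t)(frame_bytes + WIRE_OVERHEAD_BYTES) * 8u;
    return (uint64_t)slots * bits_per_frame * 1000000u / link_bps;
}
```

Under those assumptions, ring_headroom_us(256, 100000000, 64) gives about 1.7 ms and ring_headroom_us(512, 100000000, 64) about 3.4 ms, while at 1 Gbps a 256-slot ring fills in roughly 0.17 ms. Doubling the ring doubles the stall it can absorb, but it does not help if the receive path is, on average, drained more slowly than frames arrive.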