Date: Tue, 28 Mar 2006 16:38:11 +0200
From: Pieter de Goeje <pieter@degoeje.nl>
To: JoaoBR <joao@matik.com.br>
Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org
Subject: Re: new sk driver [was: nve timeout (and down) regression?]
Message-ID: <200603281638.11650.pieter@degoeje.nl>
In-Reply-To: <200603280740.18951.joao@matik.com.br>
References: <20060324223317.2069564f@it.buh.tecnik93.com> <200603280221.28996.pieter@degoeje.nl> <200603280740.18951.joao@matik.com.br>

On Tuesday 28 March 2006 12:40, you wrote:
<snip>
> probably you do not have the traffic to make the box crash or less then
> 1/2GB of RAM in use

The box has 1GB RAM. Traffic is approx. 2-3Mbit/s.

>
> in fact the problem does not happen on UP machines, only some times a
> device timeout which only ocasionally cause rx/tx to stop
>
> The problem is appearing on SMP machines
>
> when you have less then 2Gb of RAM the problem ocurres once a day or so and
> seems to depend on memory use and amount of traffic
>
> soon the traffic reaches more than 1Mbit/s the crash is predictable and you
> can wait to see

The box has actually crashed once, but I am not sure it was because of the NIC.

~> uptime
 4:19PM  up 3 days,  9:59, 1 user, load averages: 1.38, 1.20, 1.03

>
> on 4GB of Ram machines and more traffic the crash is imediatly and worse
> when the box crashed under load (4-6Mbit/s) and comes back then the high
> demand strokes it and it crashes in minutes or imediatly soon the network
> is up
>
> so probably mpsafenet may help by not processing concurrent packets but
> this is a workaround not a solution (for me)

Agreed.

>
> last time I checked mpsafenet=0 almost cut 1Mbit/s of traffic and the
> overall performance/response was bad, higher HZ did not resolved anything
> and disabling polling made it still worse (I have other NICs installed),
> the machines are working as GW

I can't really tell if the performance is impaired by mpsafenet=0, because the
box is mostly busy doing userland stuff.
Typical traffic looks like this:

~> netstat -w 1
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
      1186     0      97134       1302     0     276430     0
      1206     0      97484       1382     0     264315     0
      1193     0      97048       1366     0     278901     0
      1198     0      98251       1403     0     273428     0
      1205     0      99283       1393     0     270364     0
      1162     0      94746       1376     0     265909     0
      1162     0      93011       1420     0     258514     0
      1187     0      94366       1467     0     263162     0
      1178     0      93441       1441     0     248875     0
      1176     0      93116       1484     0     266285     0
      1146     0      91615       1424     0     256180     0
      1222     0      96597       1560     0     432862     0
      1222     0      93796       1591     0     444466     0

This is all UDP. The traffic generates around 2000 interrupts/sec on sk.

>
> until january the machines didn't crashed, only timeouts and rx/tx stops
> I used Pyun's driver and the timeouts went away, thank's again!
>
> so then I got confused by some if_sk talks on stable and thought the driver
> was comitted and the boxes started crashing until I got it last week and
> reused Pyun's driver again and my sk problems are gone again, the machines
> are stable for 4/5 days now

I'm going to test the new driver to see if I can disable mpsafenet.

To be specific on the NIC:

skc0@pci0:10:0: class=0x020000 card=0x811a1043 chip=0x432011ab rev=0x13 hdr=0x00
    vendor   = 'Marvell Semiconductor (Was: Galileo Technology Ltd)'
    device   = '88E8001/8003/8010 Gigabit Ethernet Controller with Integrated PHY (copper)'
    class    = network
    subclass = ethernet

Pieter de Goeje
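
For anyone who wants to compare their own load against the figures in this message, the per-second byte counts in the netstat sample can be converted to Mbit/s. The following is only a minimal sketch in Python: the byte values are copied from the sample above, while the list and function names are made up for illustration, and Mbit is assumed to mean 10^6 bits.

```python
# Per-second byte counters copied from the `netstat -w 1` sample in the message.
# BYTES_IN / BYTES_OUT and mean_mbit_per_s are illustrative names, not part of
# any FreeBSD tool.
BYTES_IN = [97134, 97484, 97048, 98251, 99283, 94746, 93011,
            94366, 93441, 93116, 91615, 96597, 93796]
BYTES_OUT = [276430, 264315, 278901, 273428, 270364, 265909, 258514,
             263162, 248875, 266285, 256180, 432862, 444466]


def mean_mbit_per_s(byte_samples):
    """Average one-second byte counts and convert to Mbit/s (10^6 bits)."""
    mean_bytes = sum(byte_samples) / len(byte_samples)
    return mean_bytes * 8 / 1_000_000


rx = mean_mbit_per_s(BYTES_IN)
tx = mean_mbit_per_s(BYTES_OUT)

print(f"input:  {rx:.2f} Mbit/s")
print(f"output: {tx:.2f} Mbit/s")
print(f"total:  {rx + tx:.2f} Mbit/s")
```

The output direction dominates, and the last two samples show a short burst in outgoing bytes, so a longer capture would give a steadier average than these 13 seconds.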