Date: Sun, 02 Sep 2012 10:51:16 +0200
From: Ragnar Lonn <ragnar@gatorhole.com>
To: freebsd-hardware@freebsd.org
Subject: Re: Load testing knocks out network
Message-ID: <50431E04.5050207@gatorhole.com>
In-Reply-To: <CAHMRaQfLGY%2BYeDkG7K1GJQ-pmAi6rgT6-gthKQ3j7rSyzr7qVA@mail.gmail.com>
References: <CAHMRaQfLGY%2BYeDkG7K1GJQ-pmAi6rgT6-gthKQ3j7rSyzr7qVA@mail.gmail.com>

Hi Andy,

I work for an online load testing service (loadimpact.com), and what we
see is that the most common cause of a server crash during a load test
is that the server runs out of some vital system resource. Usually that
is system memory, but network connections (sockets/file descriptors) are
also a likely cause.

You should have gotten some kind of error message in the system log, but
if the problem is easily repeatable I would set up monitoring of at least
memory and file descriptors, and see if you are near the limits when the
machine freezes.

Regards,

/Ragnar

On 09/01/2012 10:14 PM, Andy Young wrote:
> Last night one of our servers went offline while I was load testing it.
> When I got to the datacenter to check on it, the server seemed perfectly
> fine. Everything was running on it, and there were no panics or any other
> sign of a hard crash. The only problem was that the network was
> unreachable. I couldn't connect to the box even from a laptop directly
> attached to the ethernet port. I couldn't connect to anything from the
> box either. It was as if the network controller had seized up. I
> restarted netif and it didn't make a difference. Rebooting the machine,
> however, solved the issue and everything went back to working great. I
> restarted the load testing and reproduced the problem twice more this
> morning, so at least it's repeatable.
>
> It feels like a network controller / driver issue to me for a couple of
> reasons. First, the problem affects the entire system. We're running
> FreeBSD 9 with about a half dozen jails. Most of the jails are running
> Apache, but the one I was load testing was running Jetty. If it were my
> application code crashing, I would expect the problem to at least be
> isolated to the jail that hosts it. Instead, the entire machine and all
> jails in it lose access to the network.
>
> Apart from not being able to access the network, I don't see any other
> signs of problems. This is the first major problem I've had to debug in
> FreeBSD, so I'm not a debugging expert by any means. There are no error
> messages in /var/log/messages or dmesg apart from syslogd not being able
> to reach the network. If anyone has ideas on where I can look for more
> evidence of what is going wrong, I would really appreciate it.
>
> We're running FreeBSD 9.0-RELEASE-p3. The network controller is an
> Intel(R) PRO/1000 Network Connection version - 2.2.5 configured with 6
> IPs using aliases, five of which are used for jails.
>
> Thank you for the help!!
>
> Andy
> _______________________________________________
> freebsd-hardware@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hardware
> To unsubscribe, send any mail to "freebsd-hardware-unsubscribe@freebsd.org"
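
The monitoring suggested in the reply (watch memory and file descriptors and see whether they approach their limits before the freeze) could be sketched roughly as below. This is only an illustration, not something posted in the thread: the choice of Python, the function names, and the polling interval are all hypothetical, and the sysctl names in WATCHED are assumed to be the relevant FreeBSD counters for this system.

```python
import subprocess
import time

# (current, limit) sysctl pairs to watch. These FreeBSD OID names are an
# assumption about the target system; they are not taken from the thread.
WATCHED = {
    "open files": ("kern.openfiles", "kern.maxfiles"),
    "sockets": ("kern.ipc.numopensockets", "kern.ipc.maxsockets"),
    # For memory the ratio is free/total, so a LOW value is the warning sign.
    "free memory pages": ("vm.stats.vm.v_free_count", "vm.stats.vm.v_page_count"),
}


def parse_sysctl(text):
    """Parse 'name: value' lines as printed by sysctl into a dict of ints.

    Lines whose value is not an integer are skipped.
    """
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            values[name.strip()] = int(value)
    return values


def usage_report(values, watched=WATCHED):
    """Return {label: (current, limit, current/limit)} for each pair present."""
    report = {}
    for label, (cur_name, max_name) in watched.items():
        if cur_name in values and values.get(max_name, 0) > 0:
            cur, limit = values[cur_name], values[max_name]
            report[label] = (cur, limit, cur / limit)
    return report


def read_sysctls(names):
    """Run sysctl(8) for the given names and return the parsed output."""
    out = subprocess.run(
        ["sysctl", *names], capture_output=True, text=True, check=True
    )
    return parse_sysctl(out.stdout)


def monitor(interval=5):
    """Print one timestamped line per watched resource every `interval` seconds."""
    names = sorted({n for pair in WATCHED.values() for n in pair})
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for label, (cur, limit, frac) in usage_report(read_sysctls(names)).items():
            print(f"{stamp} {label}: {cur}/{limit} ({frac:.0%})", flush=True)
        time.sleep(interval)
```

Calling monitor() from a small script during the load test, with its output redirected to a file on local disk, would leave a record that survives the loss of network access. The last few lines before the freeze then show whether any of the watched values was close to its limit.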