Date:      Sun, 02 Sep 2012 10:51:16 +0200
From:      Ragnar Lonn <ragnar@gatorhole.com>
To:        freebsd-hardware@freebsd.org
Subject:   Re: Load testing knocks out network
Message-ID:  <50431E04.5050207@gatorhole.com>
In-Reply-To: <CAHMRaQfLGY%2BYeDkG7K1GJQ-pmAi6rgT6-gthKQ3j7rSyzr7qVA@mail.gmail.com>
References:  <CAHMRaQfLGY%2BYeDkG7K1GJQ-pmAi6rgT6-gthKQ3j7rSyzr7qVA@mail.gmail.com>

Hi Andy,

I work for an online load testing service (loadimpact.com), and what we 
see is that the most common reason a server crashes during a load test 
is that it runs out of some vital system resource. Usually that is 
system memory, but network connections (sockets/file descriptors) are 
also a likely cause.

You should have gotten some kind of error message in the system log, 
but if the problem is easily repeatable I would set up monitoring of at 
least memory and file descriptors, and see whether you are near the 
limits when the machine freezes.
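
As a rough illustration of the monitoring I mean, here is a minimal 
Python sketch that compares a few usage counters against their limits. 
It assumes the FreeBSD sysctl names kern.openfiles, kern.maxfiles, 
kern.ipc.numopensockets and kern.ipc.maxsockets; the helper names and 
the 90% threshold are my own invention, and it does not cover memory or 
mbufs (netstat -m and vmstat are the usual tools for those).

```python
import subprocess

# Pairs of (current-usage sysctl, limit sysctl). Which of these matter
# for this particular freeze is an assumption, not a diagnosis.
WATCHED = [
    ("kern.openfiles", "kern.maxfiles"),
    ("kern.ipc.numopensockets", "kern.ipc.maxsockets"),
]


def parse_sysctl(text):
    """Parse 'name: value' lines from sysctl output into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value.strip())
    return values


def read_sysctls(names):
    """Run sysctl(8) for the given names and return the parsed values."""
    out = subprocess.run(
        ["sysctl", *names], capture_output=True, text=True, check=True
    )
    return parse_sysctl(out.stdout)


def near_limit(values, pairs=WATCHED, threshold=0.9):
    """Return (name, used, limit) for each resource at or above threshold."""
    hits = []
    for used_name, limit_name in pairs:
        used, limit = values.get(used_name), values.get(limit_name)
        if used is not None and limit and used / limit >= threshold:
            hits.append((used_name, used, limit))
    return hits
```

Calling near_limit(read_sysctls([n for pair in WATCHED for n in pair])) 
every few seconds during the load test, and appending the result to a 
local file, should leave a record of what was close to its limit just 
before the network stops responding.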

Regards,

   /Ragnar


On 09/01/2012 10:14 PM, Andy Young wrote:
> Last night one of our servers went offline while I was load testing it. When I
> got to the datacenter to check on it, the server seemed perfectly fine.
> Everything was running on it, there were no panics or any other sign of a
> hard crash. The only problem is the network was unreachable. I couldn't
> connect to the box even from a laptop directly attached to the ethernet
> port. I couldn't connect to anything from the box either. It was as if the
> network controller had seized up. I restarted netif and it didn't make a
> difference. Rebooting the machine however, solved the issue and everything
> went back to working great. I restarted the load testing and reproduced the
> problem twice more this morning so at least it's repeatable. It feels like a
> network controller / driver issue to me for a couple reasons. First, the
> problem affects the entire system. We're running FreeBSD 9 with about a
> half dozen jails. Most of the jails are running Apache but the one I was
> load testing was running Jetty. However, if it was my application code
> crashing I would expect the problem to at least be isolated to the jail
> that hosts it. Instead, the entire machine and all jails in it lose access
> to the network.
>
> Apart from not being able to access the network, I don't see any other
> signs of problems. This is the first major problem I've had to debug in
> FreeBSD so I'm not a debugging expert by any means. There are no error
> messages in /var/log/messages or dmesg apart from syslogd not being able to
> reach the network. If anyone has ideas on where I can look for more
> evidence of what is going wrong, I would really appreciate it.
>
> We're running FreeBSD 9.0-RELEASE-p3. The network controller is an Intel(R)
> PRO/1000 Network Connection version - 2.2.5 configured with 6 ips using
> aliases, five of which are used for jails.
>
> Thank you for the help!!
>
> Andy
> _______________________________________________
> freebsd-hardware@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hardware
> To unsubscribe, send any mail to "freebsd-hardware-unsubscribe@freebsd.org"



