Date: Tue, 20 Sep 2016 16:57:12 -0400
From: Anton Yuzhaninov <citrin+bsd@citrin.ru>
To: "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Subject: Re: Server gets a high load, but no CPU use, and then later stops respond on the network
Message-ID: <68f553b9-8546-7707-df86-88851b3283f8@citrin.ru>
In-Reply-To: <20160913232351.GA36091@putsch.kolbu.ws>
References: <20160913232351.GA36091@putsch.kolbu.ws>

On 2016-09-13 19:23, Ståle Bordal Kristoffersen wrote:
> about once a day, but not in any pattern, it starts getting a load of 5-10
> and usually stops responding over the network before I notice it.

Does it stop responding completely (including ping), or do only some
services, such as ssh, stop responding?

> From googling a bit, I have tried to disable msix on the igb network
> interface, and increased the nmbclusters with no apparent change in behaviour.
> (kern.ipc.nmbclusters="1000000" and hw.igb.enable_msix=0 in loader.conf)

On modern FreeBSD versions kern.ipc.nmbclusters is autotuned to a very
big value, so increasing it manually is rarely needed. Disabling MSI-X
on igb is also unlikely to be needed.

> All I see is that the igb0 taskq pid is almost always in the RUN state when
> the machine is having trouble.

There is no igb0 taskq in the top output below. To see what the top
output looks like when the machine stops responding, it is useful to
run top from cron and log the output. Example script for top logging:
https://bitbucket.org/snippets/citrin/BpeXb

In the top output, look at WCPU and STATE for kernel threads and for
the network daemons that stop responding.

Also, do you have a network load graph (bytes and packets per second)
for this host (I saw munin in the process list)? Maybe the load is too
high at the moments when the host is not responding.

Do you use firewalls or netgraph? What is the primary function of this
server?
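
The top-from-cron idea above could be sketched roughly as follows. This is not the linked script, only a minimal illustration in Python; the names TOP_CMD, snapshot and append_snapshot are made up for this example. The flags assume FreeBSD top(1): -b for batch mode, -S to include system (kernel) processes, -H to list threads, -d for the number of displays and -s for the delay between them. Two displays are requested on the assumption that the first one may not carry meaningful CPU percentages.

```python
#!/usr/bin/env python3
"""Append a timestamped snapshot of top(1) output to a log file (meant for cron)."""
import subprocess
import sys
import time

# Assumed FreeBSD top(1) flags: batch mode, system processes, per-thread view,
# two displays one second apart. Adjust for the local top version.
TOP_CMD = ["top", "-b", "-S", "-H", "-d", "2", "-s", "1"]


def snapshot(cmd, timeout=30):
    """Run cmd and return its output prefixed with a timestamp header."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
        body = result.stdout + result.stderr
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Record the failure in the log instead of crashing the cron job.
        body = f"failed to run {cmd[0]}: {exc}\n"
    return f"===== {stamp} =====\n{body}\n"


def append_snapshot(log_path, cmd=TOP_CMD):
    """Append one snapshot to log_path, creating the file if needed."""
    with open(log_path, "a") as log:
        log.write(snapshot(cmd))


# Only act when a log path is given: toplog.py /var/log/toplog.txt
if __name__ == "__main__" and len(sys.argv) == 2:
    append_snapshot(sys.argv[1])
```

A crontab entry such as `* * * * * /usr/local/bin/python3 /root/toplog.py /var/log/toplog.txt` (both paths are hypothetical) would take one snapshot per minute. After an incident, the last entries before the gap in the log show WCPU and STATE for the kernel threads and daemons at that time.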