Date: Wed, 21 Sep 2016 09:35:03 +0200
From: Michael Schuster <michaelsprivate@gmail.com>
To: Ståle Bordal Kristoffersen <chiller@putsch.kolbu.ws>
Cc: FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject: Re: Server gets a high load, but no CPU use, and then later stops respond on the network
Message-ID: <CADqw_gLiL=RDmdfOpr5Y-eWqzjDJmvhAvfTR8mc9bWQa8Kungg@mail.gmail.com>
In-Reply-To: <20160913232351.GA36091@putsch.kolbu.ws>
References: <20160913232351.GA36091@putsch.kolbu.ws>
Hi,

While I'm not very familiar with FreeBSD internals, I'd like to point
out two things that I think may be relevant:

1) note that '[idle]' seems to be the only thread/process doing
significant work - at a guess, I'd say that's the kernel doing work
that cannot be ascribed to anything else ... housekeeping? (someone
who knows FreeBSD better will have to answer that)

On Wed, Sep 14, 2016 at 1:23 AM, Ståle Bordal Kristoffersen
<chiller@putsch.kolbu.ws> wrote:

>   PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>    11 root        24 155 ki31     0K   384K CPU23  23 1206.3 2396.63% [idle]
>     5 root         1 -16    -     0K    16K ipmire 14 100:17   0.00% [ipmi0: kcs]
>     0 root       407  -8    -     0K  6512K -      22  56:20   0.00% [kernel]
>     7 root         2 -16    -     0K    32K umarcl  3   6:21   0.00% [pagedaemon]
>    18 root         1  16    -     0K    16K syncer 14   3:37   0.00% [syncer]
>    12 root        38 -76    -     0K   608K WAIT  255   3:04   0.00% [intr]
>     2 root         6 -16    -     0K    96K -       0   2:41   0.00% [cam]
>    14 root         1 -16    -     0K    16K -      16   1:40   0.00% [rand_harvestq]
>     3 root         9  -8    -     0K   176K tx->tx 20   1:13   0.00% [zfskern]
>    17 root         1 -16    -     0K    16K vlruwt 13   1:10   0.00% [vnlru]
>   762 root         1  20    0 50040K 15212K select 18   0:10   0.00% /usr/local/bin/perl -wT /usr/local/sbin/munin-node
>   620 root         1  20    0 14520K  2044K select 20   0:06   0.00% /usr/sbin/syslogd -s
>    15 root        40 -68    -     0K   640K -       0   0:05   0.00% [usb]
>   686 root         1  20    0 26128K 18044K select 15   0:05   0.00% /usr/sbin/ntpd -c /etc/ntp.conf -p /var/run/ntpd.pid -f /var/db/ntpd.drift
>   823 root         1  20    0 24156K  5420K select 13   0:02   0.00% sendmail: accepting connections (sendmail)
>     6 root         1 -16    -     0K    16K idle   16   0:02   0.00% [enc_daemon0]
>    16 root         1 -16    -     0K    16K psleep 19   0:00   0.00% [bufdaemon]
>   830 root         1  20    0 16624K   712K nanslp 16   0:00   0.00% /usr/sbin/cron -s
>   826 smmsp        1  20    0 24156K  1056K pause  23   0:00   0.00% sendmail: Queue runner@00:30:00 for /var/spool/clientmqueue (sendmail)
> 52778 chiller      1  20    0 31060K  5356K pause  23   0:00   0.00% -zsh (zsh)
>   800 root         1  20    0 61316K  5164K select 17   0:00   0.00% /usr/sbin/sshd
>     1 root         1  20    0  9492K   460K wait   22   0:00   0.00% [init]
> 54395 chiller      1  23    0 31060K  5388K pause  23   0:00   0.00% -zsh (zsh)
> 52777 chiller      1  20    0 86584K  7576K select 23   0:00   0.00% sshd: chiller@pts/0 (sshd)
> 54394 chiller      1  20    0 86584K  7616K select 19   0:00   0.00% sshd: chiller@pts/1 (sshd)
>   473 root         1  20    0 13628K  4504K select 22   0:00   0.00% /sbin/devd
> 54441 root         1  20    0 24392K  4064K pause  15   0:00   0.00% -su (zsh)
> 52774 root         1  20    0 86584K  7532K select 19   0:00   0.00% sshd: chiller [priv] (sshd)
> 54050 root         1  20    0 24392K  4064K ttyin  20   0:00   0.00% -su (zsh)
>    13 root         3  -8    -     0K    48K -       4   0:00   0.00% [geom]
> 54389 root         1  20    0 86584K  7568K select 13   0:00   0.00% sshd: chiller [priv] (sshd)
>
> [...]

2) look at 'sr' (using a fixed-width font probably helps). In Solaris
(which is where I come from ... a long time ago ;-)) this is "scan
rate", i.e. the number of pages per second the paging mechanism is
looking at. (Again on Solaris) this would mean that your system is
under some kind of fairly constant memory pressure. Where that comes
from I cannot even guess, and given the "avm" and "fre" columns this
does look very strange ... but that's where I'd continue my
investigation.
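Tangentially, on point 1: the [idle] numbers in the top(1) output above
are at least internally consistent - 24 idle threads (one per CPU) with
an aggregate WCPU of 2396.63% works out to a nearly fully idle machine.
A trivial back-of-the-envelope check (plain awk arithmetic, nothing
FreeBSD-specific):

```shell
# Sanity check on the top(1) output above: [idle] shows THR 24
# (one thread per CPU) and an aggregate WCPU of 2396.63%, so on
# average each CPU is almost completely idle:
awk 'BEGIN { printf "%.1f%% idle per CPU\n", 2396.63 / 24 }'
# prints: 99.9% idle per CPU
```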
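One rough way to put a number on that steady scanning is to average the
'sr' column over a saved vmstat capture. A sketch (the awk field index
assumes vmstat's column order as shown in the capture - r b w avm fre
flt re pi po fr sr ... - so 'sr' is the 11th field; the sample lines
here are taken from the output below):

```shell
# A sketch: average the 'sr' (scan rate) column from saved `vmstat 1`
# output. NR > 2 skips the two header lines; 'sr' is field 11.
awk 'NR > 2 { sum += $11; n++ } END { if (n) printf "avg sr: %.1f pages/s\n", sum / n }' <<'EOF'
 procs    memory    page                    disks     faults      cpu
 r b w   avm   fre flt re pi po fr  sr da0 da1 in  sy cs us sy id
 0 0 0  858M 1449M   0  0  0  0  0 265   0   0  1 120 80  0  0 100
 0 0 0  858M 1449M   0  0  0  0  0 215   0   0  1 120 88  0  0 100
 0 0 0  858M 1449M   0  0  0  0  0 196   0   0  1 121 80  0  0 100
EOF
# prints: avg sr: 225.3 pages/s
```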
> pusen# vmstat 1
>  procs    memory    page                      disks     faults        cpu
>  r b w   avm   fre  flt re pi po  fr   sr da0 da1   in   sy   cs us sy id
>  0 0 0  858M 1449M  335  0  0  1 355 4954   0   0 1917 4403 5302  0  0 99
>  0 0 0  858M 1449M    0  0  0  0   0  265   0   0    1  120   80  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  265   0   0    0  124   81  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  265   0   0    1  120   68  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  265   0   0    1  127   92  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  265   0   0    0  120   91  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  265   0   0    1  121   82  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  265   0   0    0  120   75  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  265   0   0    2  121   96  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  265   0   0    1  126   83  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  217   0   0    1  121   68  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  215   0   0    1  120   88  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  215   0   0    0  121   92  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  215   0   0    0  120   83  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  215   0   0    1  127   90  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  196   0   0    5  120   94  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  196   0   0    1  121   80  0  0 100
>  0 0 0  858M 1449M    0  0  0  0   0  196   0   0    0  123   79  0  0 100
>  1 0 0  858M 1449M    0  0  0  0   0  196   0   0    2  121   76  0  0 100
>  1 0 0  858M 1449M    0  0  0  0   0  196   0   0    4  118  106  0  0 100
>  1 0 0  858M 1449M    0  0  0  0   0  196   0   0    0  112   87  0  0 100

HTH
Michael
--
Michael Schuster
http://recursiveramblings.wordpress.com/
recursion, n: see 'recursion'