From: "Polyack, Steve"
To: "freebsd-stable@freebsd.org"
Date: Wed, 13 Aug 2014 09:51:49 -0500
Subject: vmdaemon CPU usage and poor performance in 10.0-RELEASE
Message-ID: <4D557EC7CC2A544AA7C1A3B9CBA2B36726098846B4@exchange03.epbs.com>
List-Id: Production branch of FreeBSD source code

All,

We have a handful of database servers running FreeBSD 10.0-RELEASE and PostgreSQL 9.3.4. The servers have 128GB or 256GB of RAM. After a short period of running under some "light" load (a load average of 6 on a 16-core box with HT), system performance becomes terrible. Logging in and simply attempting to run a command leaves the prompt hanging for several seconds before anything is returned (even for something as simple as `test`). If we run `top -SH` while performance is bad, we can see that the vmdaemon kernel process is using 100% CPU. Eventually the vmdaemon CPU usage drops and system performance seems to return to normal, but this doesn't last long.

Has anyone seen this behavior? I found a bug that seems to describe the same problem (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=190300), but it's been open for 2 months without any responses. For what it's worth, the problem does not seem as bad when we reduce PostgreSQL shared buffers from 128GB to 8GB, but it's still there.

I can't imagine what vmdaemon is spending all of its time doing - `top -SH` output and memory usage looks like:

last pid: 97787;  load averages:  6.03,  6.17,  6.03    up 19+15:46:19  08:50:07
345 processes: 41 running, 220 sleeping, 78 waiting, 6 lock
CPU: 16.0% user,  0.0% nice, 10.0% system,  0.0% interrupt, 74.0% idle
Mem: 6314M Active, 234G Inact, 4078M Wired, 2720M Cache, 1704M Buf, 2719M Free
Swap: 16G Total, 329M Used, 16G Free, 2% Inuse

  PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root      155 ki31     0K   512K CPU26  26 448.8H  95.07% idle{idle: cpu26}
   11 root      155 ki31     0K   512K CPU29  29 449.1H  92.48% idle{idle: cpu29}
    6 root      -16    -     0K    16K CPU27  27  81.5H  89.70% vmdaemon
   11 root      155 ki31     0K   512K RUN     1 459.2H  89.60% idle{idle: cpu1}
   11 root      155 ki31     0K   512K RUN    14 452.9H  89.26% idle{idle: cpu14}
   11 root      155 ki31     0K   512K CPU5    5 458.7H  89.16% idle{idle: cpu5}
   11 root      155 ki31     0K   512K CPU31  31 449.1H  88.96% idle{idle: cpu31}
   11 root      155 ki31     0K   512K RUN     6 458.2H  88.87% idle{idle: cpu6}
   11 root      155 ki31     0K   512K CPU11  11 453.0H  88.38% idle{idle: cpu11}
   11 root      155 ki31     0K   512K RUN     3 457.9H  88.28% idle{idle: cpu3}
   11 root      155 ki31     0K   512K CPU12  12 452.8H  87.89% idle{idle: cpu12}
   11 root      155 ki31     0K   512K CPU2    2 458.6H  87.26% idle{idle: cpu2}
   11 root      155 ki31     0K   512K CPU7    7 458.6H  87.16% idle{idle: cpu7}
   11 root      155 ki31     0K   512K CPU10  10 458.5H  86.87% idle{idle: cpu10}
   11 root      155 ki31     0K   512K CPU15  15 453.8H  86.47% idle{idle: cpu15}
   11 root      155 ki31     0K   512K CPU13  13 453.6H  85.79% idle{idle: cpu13}
   11 root      155 ki31     0K   512K CPU4    4 458.4H  85.69% idle{idle: cpu4}
   11 root      155 ki31     0K   512K RUN     9 453.4H  85.35% idle{idle: cpu9}
   11 root      155 ki31     0K   512K CPU0    0 460.5H  85.25% idle{idle: cpu0}
   11 root      155 ki31     0K   512K CPU17  17 449.3H  84.08% idle{idle: cpu17}
   11 root      155 ki31     0K   512K RUN    25 449.0H  83.40% idle{idle: cpu25}
   11 root      155 ki31     0K   512K CPU18  18 448.9H  82.76% idle{idle: cpu18}
   11 root      155 ki31     0K   512K CPU16  16 448.8H  82.18% idle{idle: cpu16}
   11 root      155 ki31     0K   512K CPU23  23 448.8H  81.88% idle{idle: cpu23}
   11 root      155 ki31     0K   512K CPU8    8 454.3H  81.30% idle{idle: cpu8}
   11 root      155 ki31     0K   512K RUN    21 449.1H  81.05% idle{idle: cpu21}
   11 root      155 ki31     0K   512K CPU20  20 448.6H  80.96% idle{idle: cpu20}
   11 root      155 ki31     0K   512K RUN    22 448.7H  80.66% idle{idle: cpu22}
   11 root      155 ki31     0K   512K RUN    30 448.4H  79.59% idle{idle: cpu30}
   11 root      155 ki31     0K   512K CPU24  24 448.7H  79.39% idle{idle: cpu24}
   11 root      155 ki31     0K   512K RUN    19 449.2H  78.47% idle{idle: cpu19}
   11 root      155 ki31     0K   512K CPU28  28 448.7H  54.20% idle{idle: cpu28}

$ uname -a
FreeBSD db00 10.0-RELEASE-p7 FreeBSD 10.0-RELEASE-p7 #2: Thu Jul 24 14:35:07 EDT 2014 root@build10:/usr/obj/usr/src/sys/GENERIC amd64

Any help would be appreciated. We would be more than happy to answer any other questions.

Thanks,
Steve Polyack