From owner-freebsd-fs@FreeBSD.ORG Thu Apr  2 02:28:17 2015
Date: Wed, 1 Apr 2015 22:28:15 -0400 (EDT)
From: Rick Macklem
To: Adam Guimont
Cc: freebsd-fs@freebsd.org
Subject: Re: NFSD high CPU usage
Message-ID: <1199661815.10758124.1427941695874.JavaMail.root@uoguelph.ca>
In-Reply-To: <551C4F1D.1000206@tezzaron.com>

Adam Guimont wrote:
> I have an issue where NFSD will max out the CPU (1200% in this case)
> when a client workstation runs out of memory while trying to write via
> NFS. What also happens is the TCP Recv-Q fills up and causes connection
> timeouts for any other client trying to use the NFS server.
>
> I can reproduce the issue by running stress on a low-end client
> workstation. Change into the NFS mounted directory and then use stress
> to write via NFS and exhaust the memory, example:
>
> stress --cpu 2 --io 4 --vm 20 --hdd 4
>
> The client workstation will eventually run out of memory trying to
> write into the NFS directory, fill the TCP Recv-Q on the NFS server,
> and then NFSD will max out the CPU.
>
> The actual client workstations (~50) are not running stress when this
> happens, it's a mixture of EDA tools (simulation and verification).
>
> For what it's worth, this is how I've been monitoring the TCP buffer
> queues where "xx.xxx.xx.xxx" is the IP address of the NFS server:
>
> cmdwatch -n1 'netstat -an | grep -e "Proto" -e "tcp4" | grep -e "Proto"
> -e "xx.xxx.xx.xxx.2049"'
>
> I have tried several tuning recommendations but it has not solved the
> problem.
>
> Has anyone else experienced this and is anyone else able to reproduce
> it?
>
> ---
> NFS server specs:
>
> OS = FreeBSD 10.0-RELEASE
> CPU = E5-1650 v3
> Memory = 96GB
> Disks = 24x ST6000NM0034 in 4x raidz2
> HBA = LSI SAS 9300-8i
> NIC = Intel 10Gb X540-T2
> ---
> /boot/loader.conf
>
> autoboot_delay="3"
> geom_mirror_load="YES"
> mpslsi3_load="YES"
> cc_htcp_load="YES"
> ---
> /etc/rc.conf
>
> hostname="***"
> ifconfig_ix0="inet *** netmask 255.255.248.0 -tso -vlanhwtso"
> defaultrouter="***"
> sshd_enable="YES"
> ntpd_enable="YES"
> zfs_enable="YES"
> sendmail_enable="NO"
> nfs_server_enable="YES"
> nfs_server_flags="-h *** -t -n 128"
> nfs_client_enable="YES"
> rpcbind_enable="YES"
> rpc_lockd_enable="YES"
> rpc_statd_enable="YES"
> samba_enable="YES"
> atop_enable="YES"
> atop_interval="5"
> zabbix_agentd_enable="YES"
> ---
> /etc/sysctl.conf
>
> vfs.nfsd.server_min_nfsvers=3
> vfs.nfsd.cachetcp=0
> kern.ipc.maxsockbuf=16777216
> net.inet.tcp.sendbuf_max=16777216
> net.inet.tcp.recvbuf_max=16777216
> net.inet.tcp.sendspace=1048576
> net.inet.tcp.recvspace=1048576
> net.inet.tcp.sendbuf_inc=32768
> net.inet.tcp.recvbuf_inc=65536
> net.inet.tcp.keepidle=10000
> net.inet.tcp.keepintvl=2500
> net.inet.tcp.always_keepalive=1
> net.inet.tcp.cc.algorithm=htcp
> net.inet.tcp.cc.htcp.adaptive_backoff=1
> net.inet.tcp.cc.htcp.rtt_scaling=1
> net.inet.tcp.sack.enable=0
> kern.ipc.soacceptqueue=1024
> net.inet.tcp.mssdflt=1460
> net.inet.tcp.minmss=1300
> net.inet.tcp.tso=0
> ---
> Client workstations:
>
> OS = CentOS 6.6 x64
> Mount options from `cat /proc/mounts` =
> rw,nosuid,noatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=***,mountvers=3,mountport=916,mountproto=udp,local_lock=none,addr=***

I can think of two explanations for this:

1 - The server nfsd threads get confused when the TCP receive queue fills
    and start looping around.
OR
2 - The client is sending massive numbers of RPCs (or junk that isn't a
    complete RPC).

To get a better idea of what is going on, I'd suggest that you capture
packets (for a relatively short period) when the server is 100% CPU busy.

# tcpdump -s 0 -w out.pcap host <client>

- run on the server should do it. Then look at out.pcap in wireshark and
see what the packets look like. (Wireshark understands NFS, whereas
tcpdump doesn't.) A sketch of the capture, plus a command-line way to
summarize it, follows at the end of this message.

If #1, I'd guess very little traffic (maybe TCP-layer stuff); if #2, I'd
guess you'll see a lot of RPC requests, or garbage that isn't a valid
request. (The latter case would suggest a CentOS problem.)

If you capture the packets but can't look at them in wireshark, you could
email me the packet capture as an attachment and I can look at it after
Apr. 10, when I get home.

rick

> ---
>
> Regards,
>
> Adam Guimont
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
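
---
To make the capture step above concrete, here is a minimal sketch of grabbing
a short burst of NFS traffic on the server and then summarizing it with tshark
when a wireshark GUI isn't available. It is illustrative only, not from the
original thread: the client address (192.0.2.10), interface name (ix0), and
packet count are placeholders, and tshark is assumed to be installed (it ships
with the wireshark package).

  # Run on the NFS server while nfsd is pegged; stop after ~200k packets.
  # CLIENT and IFACE are example placeholders - substitute the real values.
  CLIENT=192.0.2.10     # IP of the client suspected of flooding the server
  IFACE=ix0             # the server's NFS-facing interface
  tcpdump -i $IFACE -s 0 -c 200000 -w out.pcap host $CLIENT and port 2049

  # Summarize the capture from the command line:
  tshark -r out.pcap -Y rpc | wc -l      # how many packets carry ONC-RPC at all
  tshark -r out.pcap -q -z rpc,programs  # per-program call counts and RTTs
  # (older tshark releases use -R instead of -Y for the display filter)

Per the reasoning above, a capture that is mostly bare TCP segments with few or
no RPC records points at explanation #1, while a flood of RPC calls (or
unparsable data) from the client points at explanation #2.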