Date: Wed, 01 Apr 2015 15:43:14 -0500
From: Rick Romero <rick@havokmon.com>
To: freebsd-fs@freebsd.org
Subject: Re: NFSD high CPU usage
Message-ID: <20150401154314.Horde.e_w-9XEJOaa4SwYyNLlttA3@www.vfemail.net>
In-Reply-To: <551C4F1D.1000206@tezzaron.com>
Quoting Adam Guimont <aguimont@tezzaron.com>:

> I have an issue where NFSD will max out the CPU (1200% in this case)
> when a client workstation runs out of memory while trying to write via
> NFS. What also happens is that the TCP Recv-Q fills up and causes connection
> timeouts for any other client trying to use the NFS server.
>
> I can reproduce the issue by running stress on a low-end client
> workstation. Change into the NFS-mounted directory and then use stress
> to write via NFS and exhaust the memory, for example:
>
> stress --cpu 2 --io 4 --vm 20 --hdd 4
>
> The client workstation will eventually run out of memory trying to write
> into the NFS directory, fill the TCP Recv-Q on the NFS server, and then
> NFSD will max out the CPU.
>
> The actual client workstations (~50) are not running stress when this
> happens; they run a mixture of EDA tools (simulation and verification).
>
> For what it's worth, this is how I've been monitoring the TCP buffer
> queues, where "xx.xxx.xx.xxx" is the IP address of the NFS server:
>
> cmdwatch -n1 'netstat -an | grep -e "Proto" -e "tcp4" | grep -e "Proto"
> -e "xx.xxx.xx.xxx.2049"'
>
> I have tried several tuning recommendations, but they have not solved the
> problem.
>
> Has anyone else experienced this, and is anyone else able to reproduce it?
>
> ---
> NFS server specs:
>
> OS = FreeBSD 10.0-RELEASE
> CPU = E5-1650 v3
> Memory = 96GB
> Disks = 24x ST6000NM0034 in 4x raidz2
> HBA = LSI SAS 9300-8i
> NIC = Intel 10Gb X540-T2
> ---
> /boot/loader.conf
>
> autoboot_delay="3"
> geom_mirror_load="YES"
> mpslsi3_load="YES"
> cc_htcp_load="YES"
> ---
> /etc/rc.conf
>
> hostname="***"
> ifconfig_ix0="inet *** netmask 255.255.248.0 -tso -vlanhwtso"
> defaultrouter="***"
> sshd_enable="YES"
> ntpd_enable="YES"
> zfs_enable="YES"
> sendmail_enable="NO"
> nfs_server_enable="YES"
> nfs_server_flags="-h *** -t -n 128"
> nfs_client_enable="YES"
> rpcbind_enable="YES"
> rpc_lockd_enable="YES"
> rpc_statd_enable="YES"
> samba_enable="YES"
> atop_enable="YES"
> atop_interval="5"
> zabbix_agentd_enable="YES"
> ---
> /etc/sysctl.conf
>
> vfs.nfsd.server_min_nfsvers=3
> vfs.nfsd.cachetcp=0
> kern.ipc.maxsockbuf=16777216
> net.inet.tcp.sendbuf_max=16777216
> net.inet.tcp.recvbuf_max=16777216
> net.inet.tcp.sendspace=1048576
> net.inet.tcp.recvspace=1048576
> net.inet.tcp.sendbuf_inc=32768
> net.inet.tcp.recvbuf_inc=65536
> net.inet.tcp.keepidle=10000
> net.inet.tcp.keepintvl=2500
> net.inet.tcp.always_keepalive=1
> net.inet.tcp.cc.algorithm=htcp
> net.inet.tcp.cc.htcp.adaptive_backoff=1
> net.inet.tcp.cc.htcp.rtt_scaling=1
> net.inet.tcp.sack.enable=0
> kern.ipc.soacceptqueue=1024
> net.inet.tcp.mssdflt=1460
> net.inet.tcp.minmss=1300
> net.inet.tcp.tso=0

Does your ZFS pool have log devices? How does gstat -d look?
If the drives are busy, try adding vfs.nfsd.async: 0

Rick
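(For reference, a rough sketch of how I'd check both of those things; "tank"
below is just a placeholder pool name, substitute your own:)

  # A "logs" section in the vdev listing means the pool has a dedicated SLOG.
  zpool status tank

  # Per-provider load including deletes, refreshed every second; sustained
  # %busy near 100 on the data disks means the spindles are the bottleneck.
  gstat -d -I 1s

  # Current value of the nfsd async knob mentioned above.
  sysctl vfs.nfsd.async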