From owner-freebsd-fs@FreeBSD.ORG Mon Dec 8 13:47:22 2014
Date: Mon, 08 Dec 2014 13:47:12 +0000
From: "Loïc Blot"
Subject: Re: High Kernel Load with nfsv4
To: "Rick Macklem"
References:
 <581583623.5730217.1417788866930.JavaMail.root@uoguelph.ca>
Cc: freebsd-fs@freebsd.org
List-Id: Filesystems

Hi Rick,

I waited 3 hours (no lag at jail launch) and then ran: sysrc memcached_flags="-v -m 512"
The command was very, very slow...

Here is a dd over NFS:

601062912 bytes transferred in 21.060679 secs (28539579 bytes/sec)

This is quite slow...

You can find some nfsstat output below (the command hasn't finished yet)

nfsstat -c -w 1

  GtAttr  Lookup  Rdlink    Read   Write  Rename  Access   Rddir
       0       0       0       0       0       0       0       0
       4       0       0       0       0       0      16       0
       2       0       0       0       0       0      17       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       4       0       0       0       0       4       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       4       0       0       0       0       0       3       0
       0       0       0       0       0       0       3       0
      37      10       0       8       0       0      14       1
      18      16       0       4       1       2       4       0
      78      91       0      82       6      12      30       0
      19      18       0       2       2       4       2       0
       0       0       0       0       2       0       0       0
       0       0       0       0       0       0       0       0
  GtAttr  Lookup  Rdlink    Read   Write  Rename  Access   Rddir
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       1       0       0       0       0       1       0
       4       6       0       0       6       0       3       0
       2       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       1       0       0       0       0       0       0       0
       0       0       0       0       1       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       6     108       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
  GtAttr  Lookup  Rdlink    Read   Write  Rename  Access   Rddir
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
      98      54       0      86      11       0      25       0
      36      24       0      39      25       0      10       1
      67       8       0      63      63       0      41       0
      34       0       0      35      34       0       0       0
      75       0       0      75      77       0       0       0
      34       0       0      35      35       0       0       0
      75       0       0      74      76       0       0       0
      33       0       0      34      33       0       0       0
       0       0       0       0       5       0       0       0
       0       0       0       0       0       0       6       0
      11       0       0       0       0       0      11       0
       0       0       0       0       0       0       0       0
       0      17       0       0       0       0       1       0
  GtAttr  Lookup  Rdlink    Read   Write  Rename  Access   Rddir
       4       5       0       0       0       0      12       0
       2       0       0       0       0       0      26       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       4       0       0       0       0       4       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       4       0       0       0       0       0       2       0
       2       0       0       0       0       0      24       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
  GtAttr  Lookup  Rdlink    Read   Write  Rename  Access   Rddir
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       4       0       0       0       0       0       7       0
       2       1       0       0       0       0       1       0
       0       0       0       0       2       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       6       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       4       6       0       0       0       0       3       0
       0       0       0       0       0       0       0       0
       2       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
  GtAttr  Lookup  Rdlink    Read   Write  Rename  Access   Rddir
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       4      71       0       0       0       0       0       0
       0       1       0       0       0       0       0       0
       2      36       0       0       0       0       1       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       1       0       0       0       0       0       1       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
      79       6       0      79      79       0       2       0
      25       0       0      25      26       0       6       0
      43      18       0      39      46       0      23       0
      36       0       0      36      36       0      31       0
      68       1       0      66      68       0       0       0
  GtAttr  Lookup  Rdlink    Read   Write  Rename  Access   Rddir
      36       0       0      36      36       0       0       0
      48       0       0      48      49       0       0       0
      20       0       0      20      20       0       0       0
       0       0       0       0       0       0       0       0
       3      14       0       1       0       0      11       0
       0       0       0       0       0       0       0       0
       0       0       0       0       0       0       0       0
       0       4       0       0       0       0       4       0
       0       0       0       0       0       0       0       0
       4      22       0       0       0       0      16       0
       2       0       0       0       0       0      23       0

Regards,

Loïc Blot,
UNIX Systems, Network and Security Engineer
http://www.unix-experience.fr

On 8 December 2014 at 09:36, "Loïc Blot" wrote:
> Hi Rick,
> I stopped the jails this weekend and started them this morning, I'll give you some stats this week.
>
> Here is my nfsstat -m output (with your rsize/wsize tweaks)
>
> nfsv4,tcp,resvport,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=32768,wsize=32768,readdirsize=32768,readahead=1,wcommitsize=773136,timeout=120,retrans=2147483647
>
> On the server side my disks are on a RAID controller which shows a 512b volume, and write performance
> is quite decent (dd if=/dev/zero of=/jails/test.dd bs=4096 count=100000000 => 450MBps)
>
> Regards,
>
> Loïc Blot,
> UNIX Systems, Network and Security Engineer
> http://www.unix-experience.fr
>
> On 5 December 2014 at 15:14, "Rick Macklem" wrote:
>
>> Loic Blot wrote:
>>
>>> Hi,
>>> I'm trying to create a virtualisation environment based on jails.
>>> Those jails are stored under a big ZFS pool on a FreeBSD 9.3 which
>>> exports a NFSv4 volume. This NFSv4 volume was mounted on a big
>>> hypervisor (2 Xeon E5v3 + 128GB memory and 8 ports, but only 1 was
>>> used at this time).
>>>
>>> The problem is simple: my hypervisor runs 6 jails (using 1% CPU,
>>> approximately 10GB RAM and less than 1MB bandwidth) and works
>>> fine at start, but the system slows down and after 2-3 days becomes
>>> unusable. When I look at the top command I see 80-100% on system and
>>> commands are very, very slow.
>>> Many processes are tagged with nfs_cl*.
>>
>> To be honest, I would expect the slowness to be because of slow response
>> from the NFSv4 server, but if you do:
>> # ps axHl
>> on a client when it is slow and post that, it would give us some more
>> information on where the client side processes are sitting.
>> If you also do something like:
>> # nfsstat -c -w 1
>> and let it run for a while, that should show you how many RPCs are
>> being done and which ones.
>>
>> # nfsstat -m
>> will show you what your mount is actually using.
>> The only mount option I can suggest trying is "rsize=32768,wsize=32768",
>> since some network environments have difficulties with 64K.
>>
>> There are a few things you can try on the NFSv4 server side, if it appears
>> that the clients are generating a large RPC load.
>> - disabling the DRC cache for TCP by setting vfs.nfsd.cachetcp=0
>> - If the server is seeing a large write RPC load, then "sync=disabled"
>> might help, although it does run a risk of data loss when the server
>> crashes.
>> Then there are a couple of other ZFS related things (I'm not a ZFS guy,
>> but these have shown up on the mailing lists).
>> - make sure your volumes are 4K aligned and ashift=12 (in case a drive
>> that uses 4K sectors is pretending to be 512byte sectored)
>> - never run over 70-80% full if write performance is an issue
>> - use a zil on an SSD with good write performance
>>
>> The only NFSv4 thing I can tell you is that it is known that ZFS's
>> algorithm for determining sequential vs random I/O fails for NFSv4
>> during writing and this can be a performance hit.
>> The only workaround
>> is to use NFSv3 mounts, since file handle affinity apparently fixes
>> the problem and this is only done for NFSv3.
>>
>> rick
>>
>>> I saw that there are TSO issues with igb, so I tried to disable
>>> it with sysctl, but that did not solve the problem.
>>>
>>> Does anyone have ideas? I can give you more information if you
>>> need it.
>>>
>>> Thanks in advance.
>>> Regards,
>>>
>>> Loïc Blot,
>>> UNIX Systems, Network and Security Engineer
>>> http://www.unix-experience.fr
>>> _______________________________________________
>>> freebsd-fs@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
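
A long `nfsstat -c -w 1` capture like the one in this thread is hard to read by eye, so here is a minimal Python sketch for summarizing it. It assumes the eight-column layout shown in this message (GtAttr through Rddir) and that the header is repeated periodically. The function names `parse_nfsstat`, `summarize` and `throughput_mib_s` are made up for illustration and are not part of any FreeBSD tool.

```python
from typing import Dict, List, Tuple

# Column names as printed by `nfsstat -c -w 1` in this thread (assumed layout).
COLUMNS = ["GtAttr", "Lookup", "Rdlink", "Read",
           "Write", "Rename", "Access", "Rddir"]


def parse_nfsstat(text: str) -> List[Dict[str, int]]:
    """Turn captured nfsstat output into one dict per one-second sample."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the repeated header: keep only rows that
        # consist of exactly eight non-negative integers.
        if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields):
            continue
        samples.append(dict(zip(COLUMNS, map(int, fields))))
    return samples


def summarize(samples: List[Dict[str, int]]) -> Tuple[Dict[str, int], Dict[str, int], int]:
    """Return per-column totals, per-column peaks, and the number of idle samples."""
    totals = {c: sum(s[c] for s in samples) for c in COLUMNS}
    peaks = {c: max((s[c] for s in samples), default=0) for c in COLUMNS}
    idle = sum(1 for s in samples if not any(s.values()))
    return totals, peaks, idle


def throughput_mib_s(nbytes: int, seconds: float) -> float:
    """Convert a dd-style 'bytes transferred in N secs' figure to MiB/s."""
    return nbytes / seconds / (1024 * 1024)


if __name__ == "__main__":
    # A few rows copied from the capture above, header included.
    demo = """
     GtAttr Lookup Rdlink Read Write Rename Access Rddir
          0      0      0    0     0      0      0     0
         78     91      0   82     6     12     30     0
          6    108      0    0     0      0      0     0
    """
    totals, peaks, idle = summarize(parse_nfsstat(demo))
    print("totals:", totals)
    print("peaks: ", peaks)
    print("idle samples:", idle)
    # The dd figure reported in this message.
    print("dd over NFS: %.1f MiB/s" % throughput_mib_s(601062912, 21.060679))
```

Feeding the whole capture to `parse_nfsstat` gives a quick way to see how many seconds were completely idle versus how large the RPC bursts were, which is the kind of information asked for earlier in the thread.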