From: Raghavendra Gowdappa
To: Rick Macklem
Cc: Jeff Darcy, Raghavendra G, freebsd-fs, Hubbard Jordan, Xavier Hernandez, Gluster Devel
Date: Tue, 19 Jan 2016 01:01:19 -0500 (EST)
Subject: Re: [Gluster-devel] FreeBSD port of GlusterFS racks up a lot of CPU usage
Message-ID: <7769801.11211464.1453183279015.JavaMail.zimbra@redhat.com>
In-Reply-To: <1045057902.165261325.1453156629344.JavaMail.zimbra@uoguelph.ca>

----- Original Message -----
> From: "Rick Macklem"
> To: "Raghavendra Gowdappa"
> Cc: "Jeff Darcy", "Raghavendra G", "freebsd-fs", "Hubbard Jordan", "Xavier Hernandez", "Gluster Devel"
> Sent: Tuesday, January 19, 2016 4:07:09 AM
> Subject: Re: [Gluster-devel] FreeBSD port of GlusterFS racks up a lot of CPU usage
>
> Raghavendra Gowdappa wrote:
> >
> > ----- Original Message -----
> > > From: "Rick Macklem"
> > > To: "Jeff Darcy"
> > > Cc: "Raghavendra G", "freebsd-fs", "Hubbard Jordan", "Xavier Hernandez", "Gluster Devel"
> > > Sent: Saturday, January 9, 2016 7:29:59 AM
> > > Subject: Re: [Gluster-devel] FreeBSD port of GlusterFS racks up a lot of CPU usage
> > >
> > > Jeff Darcy wrote:
> > > > > > I don't know anything about gluster's poll implementation so I may
> > > > > > be totally wrong, but would it be possible to use an eventfd (or a
> > > > > > pipe if eventfd is not supported) to signal the need to add more
> > > > > > file descriptors to the poll call?
> > > > > >
> > > > > > The poll call should listen on this new fd. When we need to change
> > > > > > the fd list, we should simply write to the eventfd or pipe from
> > > > > > another thread. This will cause the poll call to return and we will
> > > > > > be able to change the fd list without having a short timeout nor
> > > > > > having to decide on any trade-off.
> > > > >
> > > > > That's a nice idea. Based on my understanding of why timeouts are being
> > > > > used, this approach can work.
> > > >
> > > > The own-thread code which preceded the current poll implementation did
> > > > something similar, using a pipe fd to be woken up for new *outgoing*
> > > > messages. That code still exists, and might provide some insight into
> > > > how to do this for the current poll code.
> > > I took a look at event-poll.c and found something interesting...
> > > - A pipe called "breaker" is already set up by event_pool_new_poll() and
> > >   closed by event_pool_destroy_poll(), however it never gets used for
> > >   anything.
> >
> > I did a check on history, but couldn't find any information on why it was
> > removed. Can you send this patch to http://review.gluster.org ? We can
> > review and merge the patch over there. If you are not aware, development
> > work flow can be found at:
> >
> > http://www.gluster.org/community/documentation/index.php/Developers
> >
> Actually, the patch turned out to be a flop. Sometimes a fuse mount would end
> up with an empty file system with the patch. (I don't know why it was broken,
> but maybe the original author ran into issues as well?)

+static void
+event_pool_changed (struct event_pool *event_pool)
+{
+
+        /* Write a byte into the breaker pipe to wake up poll(). */
+        if (event_pool->breaker[1] >= 0)
+                write(event_pool->breaker[1], "X", 1);
+}

breaker is set to non-blocking on both read and write ends. So, probably write
might be failing sometimes with EAGAIN/EBUSY and thereby preventing the socket
from being registered. Probably that might be the reason?

if (event_pool->breaker[1] >= 0) {
        do {
                ret = write(event_pool->breaker[1], "X", 1);
        } while (ret != 1);
}

Also similar logic might be required while flushing out junk from read end too.

>
> Anyhow, I am now using the 3.7.6 event-poll.c code except that I have increased
> the timeout from 1msec->10msec. (Going from 1->5->10 didn't seem to cause a
> problem, but I got slower test runs when I increased to 20msec, so I've
> settled on 10msec. This does reduce the CPU usage when the GlusterFS file
> systems aren't active.)
> I will submit this one line change to your workflow if it continues to test
> ok.
>
> Thanks for everyone's input, rick
>
> > >
> > > So, I added a few lines of code that writes a byte to it whenever the list of
> > > file descriptors is changed and read when poll() returns, if its revents is
> > > set.
> > > I also changed the timeout to -1 (infinity) and it seems to work for a trivial
> > > test.
> > > --> Btw, I also noticed the "changed" variable gets set to 1 on a change, but
> > >     never reset to 0. I didn't change this, since it looks "racy". (ie. I
> > >     think you could easily get a race between a thread that clears it and one
> > >     that adds a new fd.)
> > >
> > > A slightly safer version of the patch would set a long (100msec ??) timeout
> > > instead of -1.
> > >
> > > Anyhow, I've attached the patch in case anyone would like to try it and will
> > > create a bug report for this after I've had more time to test it.
> > > (I only use a couple of laptops, so my testing will be minimal.)
> > >
> > > Thanks for all the help, rick
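
For reference, the breaker-pipe wakeup discussed in this thread can be sketched
as a small standalone C fragment. This is only an illustration under stated
assumptions, not the GlusterFS event-poll.c code: the helper names
wake_pipe_init(), wake_pipe_notify() and wake_pipe_drain() are hypothetical.
It also differs from the retry loop quoted above in one respect. A
non-blocking write that fails with EAGAIN is treated as success here, on the
assumption that a full pipe already holds unread bytes and will wake poll()
anyway, so only EINTR is retried. The drain helper reads until the pipe is
empty, which is one possible way of flushing the read end.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: create a pipe and make both ends non-blocking. */
static int
wake_pipe_init (int fds[2])
{
        int i, flags;

        if (pipe (fds) != 0)
                return -1;
        for (i = 0; i < 2; i++) {
                flags = fcntl (fds[i], F_GETFL);
                if (flags < 0 ||
                    fcntl (fds[i], F_SETFL, flags | O_NONBLOCK) < 0) {
                        close (fds[0]);
                        close (fds[1]);
                        return -1;
                }
        }
        return 0;
}

/* Hypothetical helper: wake the poller. Returns 0 if a wakeup is pending. */
static int
wake_pipe_notify (int wfd)
{
        ssize_t ret;

        /* Retry only when interrupted by a signal. */
        do {
                ret = write (wfd, "X", 1);
        } while (ret < 0 && errno == EINTR);

        /* Assumption: EAGAIN means the pipe is full, so poll() wakes anyway. */
        if (ret == 1 ||
            (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)))
                return 0;
        return -1;
}

/* Hypothetical helper: discard pending wakeup bytes.
 * Returns the number of bytes discarded, or -1 on an unexpected error. */
static int
wake_pipe_drain (int rfd)
{
        char buf[64];
        ssize_t ret;
        int total = 0;

        for (;;) {
                ret = read (rfd, buf, sizeof (buf));
                if (ret > 0) {
                        total += (int) ret;
                        continue;
                }
                if (ret == 0)
                        return total;   /* write end was closed */
                if (errno == EINTR)
                        continue;
                if (errno == EAGAIN || errno == EWOULDBLOCK)
                        return total;   /* pipe is empty */
                return -1;
        }
}
```

In a poll loop, the read end would be added to the pollfd array with POLLIN.
A thread that changes the fd list would call wake_pipe_notify() on the write
end, and the polling thread would call wake_pipe_drain() when revents has
POLLIN set for the read end, before rebuilding its fd list.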