Date:      Tue, 19 Jan 2016 10:05:07 GMT
From:      Martin Simmons <martin@lispworks.com>
To:        Raghavendra Gowdappa <rgowdapp@redhat.com>
Cc:        rmacklem@uoguelph.ca, jdarcy@redhat.com, raghavendra@gluster.com, gluster-devel@gluster.org, freebsd-fs@freebsd.org, xhernandez@datalab.es, jkh@ixsystems.com
Subject:   Re: [Gluster-devel] FreeBSD port of GlusterFS racks up a lot of CPU usage
Message-ID:  <201601191005.u0JA57Nl004520@higson.cam.lispworks.com>
In-Reply-To: <7769801.11211464.1453183279015.JavaMail.zimbra@redhat.com> (message from Raghavendra Gowdappa on Tue, 19 Jan 2016 01:01:19 -0500 (EST))
References:  <571237035.145690509.1451437960464.JavaMail.zimbra@uoguelph.ca> <CADRNtgStOg8UZfxNt-SzvvPf7d1J7CC_gi49ww3BbixU0Ey-rg@mail.gmail.com> <568F6D07.6070500@datalab.es> <CADRNtgRM17Eg3Z=LWifVNo=ai72dMiEVRKS3RwNfQ-dK7Pspew@mail.gmail.com> <1924941590.6473225.1452248249994.JavaMail.zimbra@redhat.com> <981529129.154244852.1452304799182.JavaMail.zimbra@uoguelph.ca> <1256214214.7158114.1452310490692.JavaMail.zimbra@redhat.com> <1045057902.165261325.1453156629344.JavaMail.zimbra@uoguelph.ca> <7769801.11211464.1453183279015.JavaMail.zimbra@redhat.com>

>>>>> On Tue, 19 Jan 2016 01:01:19 -0500 (EST), Raghavendra Gowdappa said:
> 
> ----- Original Message -----
> > From: "Rick Macklem" <rmacklem@uoguelph.ca>
> > To: "Raghavendra Gowdappa" <rgowdapp@redhat.com>
> > Cc: "Jeff Darcy" <jdarcy@redhat.com>, "Raghavendra G" <raghavendra@gluster.com>, "freebsd-fs"
> > <freebsd-fs@freebsd.org>, "Hubbard Jordan" <jkh@ixsystems.com>, "Xavier Hernandez" <xhernandez@datalab.es>, "Gluster
> > Devel" <gluster-devel@gluster.org>
> > Sent: Tuesday, January 19, 2016 4:07:09 AM
> > Subject: Re: [Gluster-devel] FreeBSD port of GlusterFS racks up a lot of CPU usage
> > 
> > Raghavendra Gowdappa wrote:
> > > 
> > > 
> > > ----- Original Message -----
> > > > From: "Rick Macklem" <rmacklem@uoguelph.ca>
> > > > To: "Jeff Darcy" <jdarcy@redhat.com>
> > > > Cc: "Raghavendra G" <raghavendra@gluster.com>, "freebsd-fs"
> > > > <freebsd-fs@freebsd.org>, "Hubbard Jordan"
> > > > <jkh@ixsystems.com>, "Xavier Hernandez" <xhernandez@datalab.es>, "Gluster
> > > > Devel" <gluster-devel@gluster.org>
> > > > Sent: Saturday, January 9, 2016 7:29:59 AM
> > > > Subject: Re: [Gluster-devel] FreeBSD port of GlusterFS racks up a lot of
> > > > CPU usage
> > > > 
> > > > Jeff Darcy wrote:
> > > > > > > I don't know anything about gluster's poll implementation so I may
> > > > > > > be totally wrong, but would it be possible to use an eventfd (or a
> > > > > > > pipe if eventfd is not supported) to signal the need to add more
> > > > > > > file descriptors to the poll call ?
> > > > > > >
> > > > > > > The poll call should listen on this new fd. When we need to change
> > > > > > > the fd list, we should simply write to the eventfd or pipe from
> > > > > > > another thread.  This will cause the poll call to return and we
> > > > > > > will
> > > > > > > be able to change the fd list without having a short timeout nor
> > > > > > > having to decide on any trade-off.
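For illustration, here is a minimal sketch of that wake-up idea using an
ordinary pipe with poll() (all names below are hypothetical, not the actual
GlusterFS event code):

    #include <fcntl.h>
    #include <poll.h>
    #include <unistd.h>

    static int wakeup_fds[2];          /* [0] read end, [1] write end */

    static int
    wakeup_init (void)
    {
            if (pipe (wakeup_fds) != 0)
                    return -1;
            /* Non-blocking on both ends so neither can stall a thread. */
            fcntl (wakeup_fds[0], F_SETFL, O_NONBLOCK);
            fcntl (wakeup_fds[1], F_SETFL, O_NONBLOCK);
            return 0;
    }

    /* Called from any thread after it changes the registered fd list. */
    static void
    wakeup_poller (void)
    {
            /* EAGAIN here is benign: a full pipe already guarantees
               that the poller will wake up. */
            (void) write (wakeup_fds[1], "X", 1);
    }

    static void
    poll_loop (struct pollfd *ufds, nfds_t nfds)
    {
            char junk[64];

            ufds[0].fd     = wakeup_fds[0];    /* slot 0 = wake-up pipe */
            ufds[0].events = POLLIN;

            for (;;) {
                    if (poll (ufds, nfds, -1) < 0)     /* block forever */
                            continue;
                    if (ufds[0].revents & POLLIN) {
                            /* Drain the pipe, then rebuild the fd list. */
                            while (read (wakeup_fds[0], junk, sizeof junk) > 0)
                                    ;
                            /* ... re-read the registered fd list here ... */
                    }
                    /* ... dispatch revents on the remaining slots ... */
            }
    }

With a wake-up fd in the poll set, the timeout can stay at -1 and the
polling-rate trade-off disappears.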
> > > > > > 
> > > > > >
> > > > > > That's a nice idea. Based on my understanding of why timeouts
> > > > > > are being used, this approach can work.
> > > > > 
> > > > > The own-thread code which preceded the current poll implementation did
> > > > > something similar, using a pipe fd to be woken up for new *outgoing*
> > > > > messages.  That code still exists, and might provide some insight into
> > > > > how to do this for the current poll code.
> > > > I took a look at event-poll.c and found something interesting...
> > > > - A pipe called "breaker" is already set up by event_pool_new_poll() and
> > > >   closed by event_pool_destroy_poll(), however it never gets used for
> > > >   anything.
> > > 
> > > I did a check on history, but couldn't find any information on why it was
> > > removed. Can you send this patch to http://review.gluster.org ? We can
> > > review and merge the patch over there. If you are not aware, the
> > > development workflow can be found at:
> > > 
> > > http://www.gluster.org/community/documentation/index.php/Developers
> > > 
> > Actually, the patch turned out to be a flop. Sometimes a fuse mount would
> > end up with an empty file system with the patch. (I don't know why it was
> > broken, but maybe the original author ran into issues as well?)
> 
> +static void
> +event_pool_changed (struct event_pool *event_pool)
> +{
> +
> +        /* Write a byte into the breaker pipe to wake up poll(). */
> +        if (event_pool->breaker[1] >= 0)
> +                write(event_pool->breaker[1], "X", 1);
> +}
> 
> breaker is set to non-blocking on both read and write ends. So the write
> might sometimes fail with EAGAIN/EBUSY, and the missed wake-up would leave
> the newly registered socket out of the poll set. Could that be the reason?

In what situations does write return EAGAIN/EBUSY for a pipe?  If it is only
when the pipe's internal buffer is full, then there is still unread data and
the reading end will wake up anyway, so there should be no need to write more.


>         if (event_pool->breaker[1] >= 0) {
>                 do {
>                         ret = write(event_pool->breaker[1], "X", 1);
>                 } while (ret != 1);
>         }

This will busy-wait, which doesn't look good.
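A variant that neither busy-waits nor loses wake-ups would retry only on
EINTR and treat EAGAIN as success, e.g. (just a sketch against the names
used in the patch):

    if (event_pool->breaker[1] >= 0) {
            ssize_t ret;

            do {
                    ret = write (event_pool->breaker[1], "X", 1);
            } while (ret < 0 && errno == EINTR);
            /* ret < 0 with errno == EAGAIN means the pipe is full,
               so a wake-up is already pending; nothing more to do. */
    }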


> Similar logic might also be required when flushing the junk out of the
> read end.

Failure to read everything should be benign because it will just cause poll to
return again immediately next time.
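Draining until read() fails with EAGAIN in the poll loop keeps the pipe
from filling up; roughly (again only a sketch using the breaker names):

    char junk[64];

    /* breaker[0] is non-blocking, so this stops as soon as the
       pipe is empty (read returns -1 with errno == EAGAIN). */
    while (read (event_pool->breaker[0], junk, sizeof junk) > 0)
            ;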


> > 
> > Anyhow, I am now using the 3.7.6 event-poll.c code except that I have
> > increased the timeout from 1msec->10msec. (Going from 1->5->10 didn't
> > seem to cause a problem, but I got slower test runs when I increased to
> > 20msec, so I've settled on 10msec. This does reduce the CPU usage when
> > the GlusterFS file systems aren't active.)
> > I will submit this one line change to your workflow if it continues to
> > test ok.
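For reference, the change presumably amounts to just the timeout argument of
the poll() call in event-poll.c, along these lines (illustrative only; the
real call site may differ):

    -        ret = poll (ufds, size, 1);     /* 1 msec timeout */
    +        ret = poll (ufds, size, 10);    /* 10 msec timeout */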
> > 
> > Thanks for everyone's input, rick
> > 
> > > > 
> > > > So, I added a few lines of code that writes a byte to it whenever
> > > > the list of file descriptors is changed, and reads it when poll()
> > > > returns with its revents set.
> > > > I also changed the timeout to -1 (infinity) and it seems to work for
> > > > a trivial test.
> > > > --> Btw, I also noticed the "changed" variable gets set to 1 on a
> > > >     change, but never reset to 0. I didn't change this, since it
> > > >     looks "racey". (ie. I think you could easily get a race between
> > > >     a thread that clears it and one that adds a new fd.)
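If the event_pool already has a lock protecting registrations, one
race-free pattern would be to test-and-clear "changed" under that same lock
(a hypothetical sketch; rebuild_pollfd_array is a made-up helper):

    pthread_mutex_lock (&event_pool->mutex);
    if (event_pool->changed) {
            event_pool->changed = 0;
            /* Rebuild while still holding the lock, so a concurrent
               event_register cannot slip in between test and clear. */
            rebuild_pollfd_array (event_pool);
    }
    pthread_mutex_unlock (&event_pool->mutex);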
> > > > 
> > > > A slightly safer version of the patch would set a long (100msec ??)
> > > > timeout instead of -1.
> > > > 
> > > > Anyhow, I've attached the patch in case anyone would like to try it,
> > > > and will create a bug report for this after I've had more time to
> > > > test it.
> > > > (I only use a couple of laptops, so my testing will be minimal.)
> > > > 
> > > > Thanks for all the help, rick
> > > > 


