From: Rick Macklem
To: Hubbard Jordan
Cc: Niels de Vos, freebsd-fs, gluster-devel@gluster.org
Date: Wed, 30 Dec 2015 18:26:17 -0500 (EST)
Subject: Re: [Gluster-devel] FreeBSD port of GlusterFS racks up a lot of CPU usage
Message-ID: <1083933309.146084334.1451517977647.JavaMail.zimbra@uoguelph.ca>

Jordan Hubbard wrote:
>
> > On Dec 30, 2015, at 2:31 AM, Niels de Vos wrote:
> >
> >> I'm guessing that Linux uses the event-epoll stuff instead of event-poll,
> >> so it wouldn't exhibit this. Is that correct?
> >
> > Well, both. most (if not all) Linux builds will use event-poll. But,
> > that calls epoll_wait() with a timeout of 1 millisecond as well.
> >
> >> Thanks for any information on this, rick
> >> ps: I am tempted to just crank the timeout of 1msec up to 10 or 20msec.
> >
> > Yes, that is probably what I would do too. And have both poll functions
> > use the same timeout, have it defined in libglusterfs/src/event.h. We
> > could make it a configurable option too, but I do not think it is very
> > useful to have.
>
> I guess this begs the question - what’s the actual purpose of polling for an
> event with a 1 millisecond timeout? If it was some sort of heartbeat check,
> one might imagine that would be better served by a timer with nothing close
> to 1 millisecond as an interval (that would be one seriously aggressive
> heartbeat) and if filesystem events are incoming that glusterfs needs to
> respond to, why timeout at all?
>
If I understand the code (I probably don't), the timeout allows the loop to
call a function that may add new fd's to be polled. (If I'm right, without
the timeout the new ones might not get serviced.)
I'll post once I've tried a longer timeout and, if it seems ok, I will put it
in the Red Hat bugs database (as mentioned in the last post). In its current
form, it's fine for testing.

> I also have a broader question to go with the specific one: We (at
> iXsystems) were attempting to engage with some of the Red Hat folks back
> when the FreeBSD port was first done, in the hope of getting it more
> “officially supported” for FreeBSD and perhaps even donating some more
> serious stress-testing and integration work for it, but when those Red Hat
> folks moved on we lost continuity and the effort stalled. Who at Red Hat
> would / could we work with in getting this back on track? We’d like to
> integrate glusterfs with FreeNAS 10, and in fact have already done so but
> it’s still early days and we’re not even really sure what we have yet.
>
Just fyi, so far, working with FreeBSD11/head and the port of 3.7.6 (the port
tarball is in FreeBSD PR#194409), the only GlusterFS problem I've encountered
is the above one.
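
As a rough illustration of the polling issue discussed above (this is not
GlusterFS's actual code): a loop that calls poll() with a 1 msec timeout can
wake up to about 1000 times a second even when nothing is happening, while
10 or 20 msec cuts that to about 100 or 50. The sketch assumes a single
shared timeout constant, as suggested for libglusterfs/src/event.h. The names
EVENT_POLL_TIMEOUT_MS, idle_wakeups_per_sec, wait_readable and service_fd are
hypothetical, and the value 10 is only a guess.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical shared constant; 10 msec is an assumed value, not one
 * taken from GlusterFS. */
#define EVENT_POLL_TIMEOUT_MS 10

/* Upper bound on wakeups per second of an idle loop that polls with the
 * given timeout (in milliseconds). */
int idle_wakeups_per_sec(int timeout_ms)
{
    assert(timeout_ms > 0);
    return 1000 / timeout_ms;
}

/* Wait once for fd to become readable.
 * Returns 1 if readable, 0 if the timeout expired, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;
    return (n > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* "Service" a readable fd by consuming one byte from it.
 * Returns the number of bytes read, or -1 on error. */
int service_fd(int fd)
{
    char c;
    return (int)read(fd, &c, 1);
}
```

A caller would loop on wait_readable(fd, EVENT_POLL_TIMEOUT_MS), call
service_fd() when it returns 1, and use the timeout return (0) as the point
where newly registered fd's get added to the poll set.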
I'm not sure why this isn't in /usr/ports, but that would be nice as it might
get more people trying it. (I'm a src committer, but not a ports one.)

However, I have several patches for the FreeBSD fuse interface, and for a
mount_glusterfs mount to work ok you need a couple of them.
1 - When an open decides to do DIRECT_IO after the file has done buffer cache
    I/O, the buffer cache needs to be invalidated so you don't get stale
    cached data.
2 - For a WRONLY write, you need to force DIRECT_IO (or do a read/write open).
    If you don't do this, the buffer cache code will get stuck when trying to
    read a block in before writing a partial block. (I think this is what
    FreeBSD PR#194293 is caused by.)
Because I won't be able to do svn until April, these patches won't make it
into head for a while, but they will both be in PR#194293 within hours.

The others add features like extended attributes, advisory byte range locking
and the changes needed to export the fuse/glusterfs mount via the FreeBSD
kernel nfsd. If anyone wants/needs these patches, email and I can send you them.

A bit off your topic, but until you have the fixes for FreeBSD fuse, you
probably can't do a lot of serious testing.
(I don't know, but I'd guess that FreeNAS has about the same fuse module code
as FreeBSD's head, since it hasn't been changed much in head recently.)

Thanks everyone for your help with this, rick

> Thanks,
>
> - Jordan
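
To make item 2 above a bit more concrete, here is a small sketch of the
reasoning, not the actual FreeBSD fuse patch. A buffered write that covers
only part of a block has to read that block in first, and a file opened
write-only cannot be read, so the sketch chooses direct I/O for WRONLY opens.
The names io_mode, choose_io_mode and write_needs_read are hypothetical, and
the block check ignores end-of-file handling for simplicity.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

enum io_mode { IO_BUFFERED, IO_DIRECT };

/* Pick an I/O mode from the open(2) access flags: a write-only open is
 * forced to direct I/O because its blocks cannot be read in. */
enum io_mode choose_io_mode(int open_flags)
{
    return ((open_flags & O_ACCMODE) == O_WRONLY) ? IO_DIRECT : IO_BUFFERED;
}

/* True if writing len bytes at offset off starts or ends inside a block,
 * so a buffer cache would need to read that block before modifying it.
 * Simplification: does not consider writes beyond end-of-file. */
bool write_needs_read(off_t off, size_t len, size_t blksize)
{
    assert(blksize > 0);
    if (len == 0)
        return false;
    return (off % (off_t)blksize) != 0 ||
           ((off + (off_t)len) % (off_t)blksize) != 0;
}
```

With a 4096 byte block, a 50 byte write at offset 100 needs the read, while
an 8192 byte write at offset 0 does not.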