From owner-freebsd-fs@freebsd.org Thu Mar 10 02:00:28 2016
Date: Wed, 9 Mar 2016 20:59:56 -0500 (EST)
From: Rick Macklem
To: Paul Mather
Cc: Ronald Klop, freebsd-fs@freebsd.org, freebsd-arm@freebsd.org
Message-ID: <508973676.11871738.1457575196588.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <60E8006A-F0A8-4284-839E-882FAD7E6A55@gromit.dlib.vt.edu>
References: <3DAB3639-8FB8-43D3-9517-94D46EDEC19E@gromit.dlib.vt.edu>
 <1482595660.8940439.1457405756110.JavaMail.zimbra@uoguelph.ca>
 <08710728-3130-49BE-8BD7-AFE85A31C633@gromit.dlib.vt.edu>
 <1290552239.10146172.1457484570450.JavaMail.zimbra@uoguelph.ca>
 <60E8006A-F0A8-4284-839E-882FAD7E6A55@gromit.dlib.vt.edu>
Subject: Re: Unstable NFS on recent CURRENT

Paul Mather wrote:
> On Mar 8, 2016, at 7:49 PM, Rick Macklem wrote:
>
> > Paul Mather wrote:
> >> On Mar 7, 2016, at 9:55 PM, Rick Macklem wrote:
> >>
> >>> Paul Mather (forwarded by Ronald Klop) wrote:
> >>>> On Sun, 06 Mar 2016 02:57:03 +0100, Paul Mather wrote:
> >>>>
> >>>>> On my BeagleBone Black running 11-CURRENT (r296162) lately I have been
> >>>>> having trouble with NFS.
> >>>>> I have been doing a buildworld and buildkernel
> >>>>> with /usr/src and /usr/obj mounted via NFS.  Recently, this process has
> >>>>> resulted in the buildworld failing at some point, with a variety of
> >>>>> errors (Segmentation fault; Permission denied; etc.).  Even a "ls -alR"
> >>>>> of /usr/src doesn't manage to complete.  It errors out thus:
> >>>>>
> >>>>> =====
> >>>>> [[...]]
> >>>>> total 0
> >>>>> ls: ./.svn/pristine/fe: Permission denied
> >>>>>
> >>>>> ./.svn/pristine/ff:
> >>>>> total 0
> >>>>> ls: ./.svn/pristine/ff: Permission denied
> >>>>> ls: fts_read: Permission denied
> >>>>> =====
> >>>>>
> >>>>> On the console, I get the following:
> >>>>>
> >>>>> newnfs: server 'chumby.chumby.lan' error: fileid changed. fsid
> >>>>> 94790777:a4385de: expected fileid 0x4, got 0x2. (BROKEN NFS SERVER OR
> >>>>> MIDDLEWARE)
> >>>>>
> > Oh, I had forgotten this. Here's the comment related to this error.
> > (about line#445 in sys/fs/nfsclient/nfs_clport.c):
> > 446  * BROKEN NFS SERVER OR MIDDLEWARE
> > 447  *
> > 448  * Certain NFS servers (certain old proprietary filers ca.
> > 449  * 2006) or broken middleboxes (e.g. WAN accelerator products)
> > 450  * will respond to GETATTR requests with results for a
> > 451  * different fileid.
> > 452  *
> > 453  * The WAN accelerator we've observed not only serves stale
> > 454  * cache results for a given file, it also occasionally serves
> > 455  * results for wholly different files.  This causes surprising
> > 456  * problems; for example the cached size attribute of a file
> > 457  * may truncate down and then back up, resulting in zero
> > 458  * regions in file contents read by applications.  We observed
> > 459  * this reliably with Clang and .c files during parallel build.
> > 460  * A pcap revealed packet fragmentation and GETATTR RPC
> > 461  * responses with wholly wrong fileids.
> >
> > If you can connect the client->server with a simple switch (or just an
> > RJ45 cable), it might be worth testing that way. (I don't recall the name
> > of the middleware product, but I think it was shipped by one of the major
> > switch vendors. I also don't know if the product supports NFSv4?)
> >
> > rick
>
>
> Currently, the client is connected to the server via a dumb gigabit switch,
> so it is already fairly direct.
>
> As for the above error, it appeared on the console only once.  (Sorry if I
> made it sound like it appears every time.)
>
> I just tried another buildworld attempt via NFS and it failed again.  This
> time, I get this on the BeagleBone Black console:
>
> nfs_getpages: error 13
> vm_fault: pager read error, pid 5401 (install)
>
13 is EACCES and could be caused by what I mention below.
(Any mount of a file system on the server can trigger it unless "-S" is
specified as a flag for mountd.)

> The other thing I have noticed is that if I induce heavy load on the NFS
> server---e.g., by starting a Poudriere bulk build---then that provokes the
> client to crash much more readily.  For example, I started a NFS buildworld
> on the BeagleBone Black, and it seemed to be chugging along nicely.  The
> moment I kicked off a Poudriere build update of my packages on the NFS
> server, it crashed the buildworld on the NFS client.
>
Try adding "-S" to mountd_flags on the server. Any time file systems are
mounted (and Poudriere likes to do that, I am told), mount sends a SIGHUP to
mountd to reload /etc/exports. While /etc/exports is being reloaded, there
will be access errors for mounts (that are temporarily not exported) unless
you specify "-S" (which makes mountd suspend the nfsd threads during the
reload of /etc/exports).

rick

> I have had problems with swap on FreeBSD/arm before.  Swapping to a file does
> not appear to work for me.  As a result, I switched to swapping to a
> partition on the SD card.  Maybe this is unreliable, too?
>
> Cheers,
>
> Paul.
>
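
For anyone following along, the "-S" suggestion above might look roughly like
this in /etc/rc.conf on the NFS server. This is only a sketch: it assumes
mountd is started through rc.conf, and the "-r" shown is a stand-in for
whatever flags the server already uses (keep those and append "-S").

```shell
# /etc/rc.conf on the NFS server (fragment).
# Assumption: "-r" represents the mountd flags already configured;
# the only change suggested in this thread is the added "-S".
mountd_flags="-r -S"
```

mountd only reads its flags at startup, so it would need to be restarted
(for example with "service mountd restart") before the change takes effect.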