From owner-freebsd-fs@freebsd.org Sun Feb 28 01:25:52 2016
Date: Sat, 27 Feb 2016 20:25:40 -0500 (EST)
From: Rick Macklem
To: Bruce Evans
Cc: fs@freebsd.org
Message-ID: <813250886.11776455.1456622740632.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <20160227153409.W1735@besplex.bde.org>
References: <20160226164613.N2180@besplex.bde.org> <1403082388.11082060.1456545103011.JavaMail.zimbra@uoguelph.ca> <20160227153409.W1735@besplex.bde.org>
Subject: Re: silly write caching in nfs3
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_11776453_1126094421.1456622740624"
List-Id: Filesystems

------=_Part_11776453_1126094421.1456622740624
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Bruce Evans wrote:
> On Fri, 26 Feb 2016, Rick Macklem wrote:
>
> > Bruce Evans wrote:
> >> nfs3 is slower than in old versions of FreeBSD. I debugged one of the
> >> reasons today.
> >>
> >> Writes have apparently always done silly caching. Typical behaviour
> >> is for iozone writing a 512MB file where the file fits in the buffer
> >> cache/VMIO. The write is cached perfectly. But then when nfs_open()
> >> reopens the file, it calls vinvalbuf() to discard all of the cached
> >> data. Thus nfs write caching usually discards useful older data to
> >> ...
> >> I think not committing in close is supposed to be an optimization, but
> >> it is actually a pessimization for my kernel build tests (with object
> >> files on nfs, which I normally avoid). Builds certainly have to reopen
> >> files after writing them, to link them and perhaps to install them.
> >> This causes the discarding. My kernel build tests also do a lot of
> >> utimes() calls which cause the discarding before commit-on-close can
> >> avoid the above cause for it by clearing NMODIFIED. Enabling
> >> commit-on-close gives a small optimisation with oldnfs by avoiding all
> >> of the discarding except for utimes(). It reduces read RPCs by about
> >> 25% without increasing write RPCs or real time. It decreases real time
> >> by a few percent.
> >>
> > Well, the new NFS client code was cloned from the old one (about FreeBSD7).
> > I did this so that the new client wouldn't exhibit different caching
> > behaviour than the old one (avoiding any POLA violations).
> > If you look in stable/10/sys/nfsclient/nfs_vnops.c and
> > stable/10/sys/fs/nfsclient/nfs_clvnops.c
> > at the nfs_open() and nfs_close() functions, the algorithm appears to be
> > identical for NFSv3. (The new one has a bunch of NFSv4 gunk, but if you
> > scratch out that stuff and ignore function name differences (nfs_flush() vs
> > ncl_flush()), I think you'll find them the same.
> > I couldn't spot any differences at a glance.)
> > --> see r214513 in head/sys/fs/nfsclient/nfs_clvnops.c for example
>
> I blamed newnfs before :-), but when I looked at newnfs more closely I
> found that it was almost the same lexically in the most interesting
> places (but unfortunately has lexical differences from s/nfs/ncl/,
> and doesn't have enough of these differences for debugging --
> debugging is broken by having 2 static functions named nfs_foo() for
> many values of foo). But newnfs seems to have always been missing this
> critical code:
>
> X   1541 rgrimes int
> X  83651 peter   nfs_writerpc(struct vnode *vp, struct uio *uiop, struct ucred *cred,
> X 158739 mohans      int *iomode, int *must_commit)
> X   1541 rgrimes {
> X   9336 dfr         if (v3) {
> X   9336 dfr             wccflag = NFSV3_WCCCHK;
> X ...
> X 158739 mohans      }
> X 158739 mohans      if (wccflag) {
> X 158739 mohans          mtx_lock(&(VTONFS(vp))->n_mtx);
> X 158739 mohans          VTONFS(vp)->n_mtime = VTONFS(vp)->n_vattr.va_mtime;
>                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> X 158739 mohans          mtx_unlock(&(VTONFS(vp))->n_mtx);
> X 158739 mohans      }
>
Well, this code does exist in the new client. The function is
nfsrpc_writerpc(), found in sys/fs/nfsclient/nfs_clrpcops.c.
It calls nfscl_wcc_data(), which does the same test as nfsm_wcc_data_xx()
did in the old client. Then NFSWRITERPC_SETTIME() sets the time if wccflag
was set.

However...NFSWRITERPC_SETTIME() is broken and sets n_mtime from the one in
the nfs vnode (same as the old code). Unfortunately the cached value in the
nfs vnode hasn't been updated at that point. (The RPC layer functions don't
do cache stuff.)

The attached patch fixes this. Please test with the attached patch.

Good catch. Thanks for reporting it.

> This was in 4.4BSD-Lite1 under a slightly (?) different condition.
>
> BTW, how do you use svn to see the history of removed files? nfs_vnops.c
> has been removed in -current.
> I can find it in other branches, but it is
> hard to find there even if you know where it is. This is no better than
> in cvs, where to find its full history I have to know where it is in 3
> different repositories that I have online and more that I should have.
>
> >> The other reason for discarding is because the timestamps changed -- you
> >> just wrote them, so the timestamps should have changed. Different bugs
> >> in comparing the timestamps gave different misbehaviours.
> >>
> >> In old versions of FreeBSD and/or nfs, the timestamps had seconds
> >> granularity, so many changes were missed. This explains mysterious
> >> behaviours by iozone 10-20 years ago: the write caching is seen to
> >> work perfectly for most small total sizes, since all the writes take
> >> less than 1 second so the timestamps usually don't change (but sometimes
> >> the writes lie across a seconds boundary so the timestamps do change).
> >>
> >> oldnfs was fixed many years ago to use timestamps with nanoseconds
> >> resolution, but it doesn't suffer from the discarding in nfs_open()
> >> in the !NMODIFIED case which is reached by either fsync() before close
> >> or commit on close. I think this is because it updates n_mtime to
> >> the server's new timestamp in nfs_writerpc(). This seems to be wrong,
> >> since the file might have been written to by other clients and then
> >> the change would not be noticed until much later if ever (setting the
> >> timestamp prevents seeing it change when it is checked later, but you
> >> might be able to see another metadata change).
> >>
> >> newnfs has quite different code for nfs_writerpc(). Most of it was
> >> moved to another function in another file. I understand this even
> >> less, but it doesn't seem to fetch the server's new timestamp or
> >> update n_mtime in the v3 case.
> >>
> > I'm pretty sure it does capture the new attributes (including mtime) in
> > the reply. The function is called something like nfscl_loadattrcache().
>
> Debugging shows that it loads the new attributes but doesn't clobber
> n_mtime with them. For a write test that takes 20 seconds, n_mtime sticks
> at its original value and the server time advances with each write by 20
> seconds total (the server time only advances every second if the server
> timestamp precision is only 1 second).
>
Yep. The attached patch fixes this.

> > In general, close-to-open consistency isn't needed for most mounts.
> > (The only case where it matters is when multiple clients are concurrently
> > updating files.)
> > - There are a couple of options that might help performance when doing
> >   software builds on an NFS mount:
> >   nocto (I remember you don't like the name)
>
> I actually do like it except for its negative logic. To turn it back on,
> you would need to use nonocto, but IIRC the negative logic for that is
> still broken (missing), so there is no way to turn it back on.
>
> >   - Actually, I can't remember why the code would still do the cache
> >     invalidation in nfs_open() when this is set. I wonder if the code
> >     in nfs_open() should maybe avoid invalidating the buffer cache
> >     when this is set? (I need to think about this.)
>
> I think it is technically correct for something to do the invalidation
> if NMODIFIED is still set in nfs_open(). nocto shouldn't and doesn't
> affect that. nocto is checked only in nfs_lookup() and only affects
> nfs_open() indirectly: its effect is that when nocto is not set,
> nfs_lookup() clears n_attrstamp which causes nfs_lookup() to do more,
> but hopefully still not cache invalidation. Cache invalidation is
> also done after a timeout and nocto doesn't affect that either.
>
> I still leave nocto off except for testing. I want to optimise the
> cto case, and my reference benchmarks are with cto.
>
> >   noncontigwr - This one allows the writes to happen for byte aligned
> >     chunks when they are non-contiguous without pushing the individual
> >     writes to the server.
> >     (Again, this shouldn't cause problems unless
> >     multiple clients are writing to the file concurrently.)
> > Both of these are worth trying for mounts where software builds are being
> > done.
>
> I tried this to see if it would fix the unordered writes. I didn't
> expect it to do much because I usually only have a single active
> client and a single active writer per file. It didn't make much
> difference.
>
> With nfsiods misordering writes, this option might give another source
> of silly writes. After it merges writes to give perfect contiguity,
> you send them to multiple nfsiods which might give perfect discontiguity
> (worse than random) :-).
>
> >> There are many other reasons why nfs is slower than in old versions.
> >> One is that writes are more often done out of order. This tends to
> >> give a slowness factor of about 2 unless the server can fix up the
> >> order. I use an old server which can do the fixup for old clients but
> >> not for newer clients starting in about FreeBSD-9 (or 7?).
> > I actually thought this was mainly caused by the krpc that was introduced
> > in FreeBSD7 (for both old and new NFS), separating the RPC from NFS.
> > There are 2 layers in the krpc (sys/rpc/clnt_rc.c and sys/rpc/clnt_vc.c)
> > that each use acquisition of a mutex to allow an RPC message to be sent.
> > (Whichever thread happens to acquire the mutex first, sends first.)
>
> I don't like the new krpc since it is larger and harder to debug
> (especially for me since I don't understand the old krpc either :-),
> but it is in FreeBSD-7 and in my main reference kernel r181717, and
> these don't have so many unordered blocks, at least for writing.
>
> > I had a couple of patches that tried to keep the RPC messages more ordered.
> > (They would not have guaranteed exact ordering.)
> > They seemed to help for
> > the limited testing I could do, but since I wasn't seeing a lot of
> > "out of order" reads/writes on my single core hardware, I couldn't verify
>
> I usually use single core hardware too, but saw some problems with 2 cores,
> and now with 8 cores the problems seem to be fundamental.
>
> > how well these patches worked. mav@ was working on this at the time, but
> > didn't get these patches tested either, from what I recall.
> > --> Unfortunately, I seem to have lost these patches or I would have
> >     attached them so you could try them. Ouch.
> > (I've cc'd mav@. Maybe he'll have them lying about. I think one was
> > related to the nfsiod and the other for either sys/rpc/clnt_rc.c or
> > sys/rpc/clnt_vc.c.)
> >
> > The patches were all client side. Maybe I'll try and recreate them.
>
> It seems to require lots of communication between separate nfsiods to
> even preserve an order that has carefully been set up for them. If
> you have this then it is unclear why it can't be done more simply using
> a single nfsiod thread (per NIC or ifq). Only 1 thread should talk to
> the NIC/ifq, since you lose control if you put other threads in between.
> If the NIC/ifq uses multiple threads then maintaining the order is its
> problem.
>
Yes. The patches I had didn't guarantee complete ordering, in part because
one was for the nfsiods and the other for the RPC request in the krpc code.

I will try and redo them one of these days.

I will mention that the NFS RFCs have no requirement nor recommendation for
ordering. The RPCs are assumed to be atomic entities. Having said that, I
do believe that FreeBSD servers will perform better when the reads/writes
are in sequentially increasing byte order, so this is worth working on.

> >> I suspect
> >> that this is just because Giant locking in old clients gave accidental
> >> serialization. Multiple nfsiod's and/or nfsd's are clearly needed
> >> for performance if you have multiple NICs serving multiple mounts.
> > Shared vnode locks are also a factor, at least for reads.
> > (Before shared vnode locks, the vnode lock essentially serialized all
> > reads.)
> >
> > As you note, a single threaded benchmark test is quite different than a lot
> > of clients with a lot of threads doing I/O on a lot of files concurrently.
>
> It is also an important case for me. I mainly want creating of object files
> to be fast, and the cache invalidation and unordered blocks seem to be
> relatively even larger in this case. A typical compiler operation (if
> tmp or obj files are on nfs, which they shouldn't be but sometimes are) is:
>
> cc -pipe avoids creating intermediate file for preprocessor
> cc -c creates intermediate .S file for some compilers (not clang)
>     .S file is written to cache
>     reopening .S file to actually use it invalidates cache
>     (workaround:
>     (1) enable commit on close to clear NMODIFIED, and
>     (2) use 1-second timestamp resolution on server to break detection
>         of the change, provided the file can be created and reopened
>         without crossing a seconds boundary), or
>     (3) use oldnfs
> cc -c creates intermediate .o file for all compilers
>     similar considerations
> link step uses .o files and invalidates their cache in most cases
>     (workaround: as above, except the whole compile usually takes
>     more than 1 second, so the timestamp resolution hack doesn't work)
> install step
>     similar considerations -- the linked file was the intermediate file
>     for this step and reopening invalidates its cache.
>
> > The bandwidth * delay product of your network interconnect is also a
> > factor.
> > The larger this is, the more bits you need to be in transit to "fill the
> > data pipe". You can increase the # of bits in transit by either using
> > larger rsize/wsize or more read-ahead/write-behind.
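
The bandwidth * delay point quoted above can be made concrete with a rough
sketch. This is not code from the thread or from the NFS client: the
function names are hypothetical, and the model ignores RPC/TCP header
overhead and server service time, so treat the result as a lower bound only.
The 75 usec round trip and 8K write size are figures mentioned elsewhere in
this thread; the 1 Gbit/s link speed is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bandwidth * delay product in bytes: roughly how much data must be
 * "in flight" to keep the link busy for one round trip.
 * (Hypothetical helper, not part of any FreeBSD API.)
 */
uint64_t
bdp_bytes(uint64_t link_bits_per_sec, uint64_t rtt_usec)
{
	return (link_bits_per_sec / 8 * rtt_usec / 1000000);
}

/*
 * Number of wsize-byte WRITE RPCs that must be outstanding to cover
 * the bandwidth * delay product, rounded up.
 */
uint64_t
writes_in_flight(uint64_t link_bits_per_sec, uint64_t rtt_usec,
    uint64_t wsize)
{
	uint64_t bdp;

	assert(wsize > 0);
	bdp = bdp_bytes(link_bits_per_sec, rtt_usec);
	return ((bdp + wsize - 1) / wsize);
}
```

Under these assumptions, writes_in_flight(1000000000, 75, 8192) gives 2,
i.e. a low-latency LAN needs only a couple of 8K writes outstanding, while a
10 msec round trip on the same link (writes_in_flight(1000000000, 10000,
8192)) would need on the order of 150.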
>
> I already have latency tuned to about 5-10 times smaller than on FreeBSD
> cluster machines, with the result that most operations are even more than
> 5-10 times faster, due to smaller operations and most operations not having
> special support for keeping pipes full (if that is possible at all).
>
> > It would be nice to figure out why your case is performing better on the
> > old nfs client (and/or server).
> >
> > If you have a fairly recent FreeBSD10 system, you could try doing mounts
> > with new vs old client (and no other changes) and see what differences
> > occur. (that would isolate new vs old from recent "old" and "really old")
>
> Er, that is what I already did to isolate this problem. I have oldnfs and
> newnfs in about 50 test kernels and finally isolated this problem in an
> up to date FreeBSD-10.
>
> Bruce
>
Thanks for finding the "n_mtime isn't getting updated by writes" bug.
The attached fix should fix it, rick

------=_Part_11776453_1126094421.1456622740624
Content-Type: text/x-patch; name=nfswriterpc.patch
Content-Disposition: attachment; filename=nfswriterpc.patch
Content-Transfer-Encoding: base64

LS0tIGZzL25mcy9uZnNwb3J0Lmguc2F2eHgJMjAxNi0wMi0yNyAxNToxOTo0MS4xNzU1NzQwMDAg
LTA1MDAKKysrIGZzL25mcy9uZnNwb3J0LmgJMjAxNi0wMi0yNyAxNjo0OTo0Ny43OTAyNzYwMDAg
LTA1MDAKQEAgLTc4OCwxMiArNzg4LDE0IEBAIE1BTExPQ19ERUNMQVJFKE1fTkVXTkZTRFNFU1NJ
T04pOwogLyoKICAqIFNldCB0aGUgbl90aW1lIGluIHRoZSBjbGllbnQgd3JpdGUgcnBjLCBhcyBy
ZXF1aXJlZC4KICAqLwotI2RlZmluZQlORlNXUklURVJQQ19TRVRUSU1FKHcsIG4sIHY0KQkJCQkJ
XAorI2RlZmluZQlORlNXUklURVJQQ19TRVRUSU1FKHcsIG4sIGEsIHY0KQkJCQlcCiAJZG8gewkJ
CQkJCQkJXAogCQlpZiAodykgewkJCQkJCVwKLQkJCShuKS0+bl9tdGltZSA9IChuKS0+bl92YXR0
ci5uYV92YXR0ci52YV9tdGltZTsgXAorCQkJbXR4X2xvY2soJigobiktPm5fbXR4KSk7CQkJXAor
CQkJKG4pLT5uX210aW1lID0gKGEpLT5uYV9tdGltZTsJCQlcCiAJCQlpZiAodjQpCQkJCQkJXAot
CQkJICAgIChuKS0+bl9jaGFuZ2UgPSAobiktPm5fdmF0dHIubmFfdmF0dHIudmFfZmlsZXJldjsg
XAorCQkJCShuKS0+bl9jaGFuZ2UgPSAoYSktPm5hX2ZpbGVyZXY7CVwKKwkJCW10eF91bmxvY2so
JigobiktPm5fbXR4KSk7CQkJXAogCQl9CQkJCQkJCVwKIAl9IHdoaWxlICgwKQogCi0tLSBmcy9u
ZnNjbGllbnQvbmZzX2NscnBjb3BzLmMuc2F2eHgJMjAxNi0wMi0yNyAxNToxNzo1Ny42MjkwMTIw
MDAgLTA1MDAKKysrIGZzL25mc2NsaWVudC9uZnNfY2xycGNvcHMuYwkyMDE2LTAyLTI3IDE2OjUx
OjEzLjY1MTUzOTAwMCAtMDUwMApAQCAtMTczMyw3ICsxNzMzLDcgQEAgbmZzcnBjX3dyaXRlcnBj
KHZub2RlX3QgdnAsIHN0cnVjdCB1aW8gKgogCQl9CiAJCWlmIChlcnJvcikKIAkJCWdvdG8gbmZz
bW91dDsKLQkJTkZTV1JJVEVSUENfU0VUVElNRSh3Y2NmbGFnLCBucCwgKG5kLT5uZF9mbGFnICYg
TkRfTkZTVjQpKTsKKwkJTkZTV1JJVEVSUENfU0VUVElNRSh3Y2NmbGFnLCBucCwgbmFwLCAobmQt
Pm5kX2ZsYWcgJiBORF9ORlNWNCkpOwogCQltYnVmX2ZyZWVtKG5kLT5uZF9tcmVwKTsKIAkJbmQt
Pm5kX21yZXAgPSBOVUxMOwogCQl0c2l6IC09IGxlbjsK
------=_Part_11776453_1126094421.1456622740624--

From owner-freebsd-fs@freebsd.org Sun Feb 28 01:45:05 2016
Date: Sat, 27 Feb 2016 20:45:01 -0500 (EST)
From: Rick Macklem
To: Bruce Evans
Cc: fs@freebsd.org
Message-ID:
<1367050076.11783367.1456623900997.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <20160227131353.V1337@besplex.bde.org>
References: <20160226164613.N2180@besplex.bde.org> <20160227131353.V1337@besplex.bde.org>
Subject: Re: silly write caching in nfs3
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
List-Id: Filesystems

Bruce Evans wrote:
> On Fri, 26 Feb 2016, Bruce Evans wrote:
>
> > nfs3 is slower than in old versions of FreeBSD. I debugged one of the
> > reasons today.
> > ...
> > oldnfs was fixed many years ago to use timestamps with nanoseconds
> > resolution, but it doesn't suffer from the discarding in nfs_open()
> > in the !NMODIFIED case which is reached by either fsync() before close
> > or commit on close. I think this is because it updates n_mtime to
> > the server's new timestamp in nfs_writerpc(). This seems to be wrong,
> > since the file might have been written to by other clients and then
> > the change would not be noticed until much later if ever (setting the
> > timestamp prevents seeing it change when it is checked later, but you
> > might be able to see another metadata change).
> >
> > newnfs has quite different code for nfs_writerpc(). Most of it was
> > moved to another function in another file. I understand this even
> > less, but it doesn't seem to fetch the server's new timestamp or
> > update n_mtime in the v3 case.
>
> This quick fix seems to give the same behaviour as in oldnfs. It also
> fixes some bugs in comments in nfs_fsync() (where I tried to pass a
> non-null cred, but none is available.
> The ARGSUSED bug is in many
> other functions):
>
> X Index: nfs_clvnops.c
> X ===================================================================
> X --- nfs_clvnops.c (revision 296089)
> X +++ nfs_clvnops.c (working copy)
> X @@ -1425,6 +1425,23 @@
> X      }
> X      if (DOINGASYNC(vp))
> X          *iomode = NFSWRITE_FILESYNC;
> X +    if (error == 0 && NFS_ISV3(vp)) {
> X +        /*
> X +         * Break seeing concurrent changes by other clients,
> X +         * since without this the next nfs_open() would
> X +         * invalidate our write buffers. This is worse than
> X +         * useless unless the write is committed on close or
> X +         * fsynced, since otherwise NMODIFIED remains set so
> X +         * the next nfs_open() will still invalidate the write
> X +         * buffers. Unfortunately, this cannot be placed in
> X +         * ncl_flush() where NMODIFIED is cleared since
> X +         * credentials are unavailable there for at least
> X +         * calls by nfs_fsync().
> X +         */
> X +        mtx_lock(&(VTONFS(vp))->n_mtx);
> X +        VTONFS(vp)->n_mtime = nfsva.na_mtime;
> X +        mtx_unlock(&(VTONFS(vp))->n_mtx);
> X +    }
> X      if (error && NFS_ISV4(vp))
> X          error = nfscl_maperr(uiop->uio_td, error, (uid_t)0, (gid_t)0);
> X      return (error);
> X @@ -2613,9 +2630,8 @@
> X  }
> X
The fix I attached to the other email should fix this without breaking the
weak cache consistency case (where another client has changed the mtime on
the server).

> X  /*
> X - * fsync vnode op. Just call ncl_flush() with commit == 1.
> X + * fsync vnode op.
> X   */
> X -/* ARGSUSED */
> X  static int
> X  nfs_fsync(struct vop_fsync_args *ap)
> X  {
> X @@ -2622,8 +2638,12 @@
> X
> X      if (ap->a_vp->v_type != VREG) {
> X          /*
> X +         * XXX: this comment is misformatted (after fixing its
> X +         * internal errors) and misplaced.
> X +         *
> X           * For NFS, metadata is changed synchronously on the server,
> X -         * so there is nothing to flush. Also, ncl_flush() clears
> X +         * so the only thing to flush is data for regular files.
> X +         * Also, ncl_flush() clears
> X           * the NMODIFIED flag and that shouldn't be done here for
> X           * directories.
> X           */
>
> > There are many other reasons why nfs is slower than in old versions.
> > One is that writes are more often done out of order. This tends to
> > give a slowness factor of about 2 unless the server can fix up the
> > order. I use an old server which can do the fixup for old clients but
> > not for newer clients starting in about FreeBSD-9 (or 7?). I suspect
> > that this is just because Giant locking in old clients gave accidental
> > serialization. Multiple nfsiod's and/or nfsd's are clearly needed
> > for performance if you have multiple NICs serving multiple mounts.
I believe that you want at least one nfsiod for each concurrent
process/thread reading/writing a file. If your readahead is set to a
larger value than the default of 1, I think you would want that many nfsiod
threads for each process/thread doing concurrent I/O on files.
This is just conjecture. I have not done any benchmarking.

> > Other cases are less clear. For the iozone benchmark, there is only
> > 1 stream and multiple nfsiod's pessimize it into multiple streams that
> > give buffers which arrive out of order on the server if the multiple
> > nfsiod's are actually active.
Since NFS specs have no ordering requirement or recommendation, I would
actually like to find a way to make ZFS (and maybe UFS too) perform well
when the I/O ops are out of order.
I have been told that ZFS uses some sequential writing heuristic. (I'm
tempted to look for a way for the NFS server to force it to always assume
sequential and just ignore the ordering of the VOP_WRITE()s.)

> > I use the following configuration to
> > ameliorate this, but the slowness factor is still often about 2 for
> > iozone:
> > - limit nfsd's to 4
> > - limit nfsiod's to 4
> > - limit nfs i/o sizes to 8K.
> >   The server fs block size is 16K, and
> >   using a smaller block size usually helps by giving some delayed
> >   writes which can be clustered better. (The non-nfs parts of the
> >   server could be smarter and do this intentionally. The out-of-order
> >   buffers look like random writes to the server.) 16K i/o sizes
> >   otherwise work OK, but 32K i/o sizes are much slower for unknown
> >   reasons.
>
> Size 16K seems to work better now.
>
> I also use:
>
> - turn off most interrupt moderation. This reduces (ping) latency from
>   ~125 usec to ~75 usec for em on PCIe (after already turning off interrupt
>   moderation on the server to reduce it from 150-200 usec). 75 usec
>   is still a lot, though it is about 3 times lower than the default
>   misconfiguration. Downgrading up to older lem on PCI/33 reduces it to
>   52. Downgrading to DEVICE_POLLING reduces it to about 40. The
>   downgrades are upgrades :-(. Not using a switch reduces it by about
>   another 20.
>
>   Low latency is important for small i/o's. I was surprised that it also
>   helps a lot for large i/o's. Apparently it changes the timing enough
>   to reduce the out-of-order buffers significantly.
>
> The default misconfiguration with 20 nfsiod's is worse than I expected
> (on an 8 core system). For (old) "iozone auto" which starts with a file
> size of 1MB, the write speed is about 2MB/sec with 20 nfsiod's and
> 22 MB/sec with 1 nfsiod. 2-4 nfsiod's work best. They give 30-40MB/sec
> for most file sizes. Apparently, with 20 nfsiod's the write of 1MB is
> split up into almost twenty pieces of 50K each (6 or 7 8K buffers each),
> and the final order is perhaps even worse than random. I think it is
> basically sequential with about seeks for all file
> sizes between 1MB and many MB.
>
Unfortunately the number of concurrent processes doing I/O will vary all
the time, and I don't think trying to dynamically change the number of
nfsiods to track this would be practical.
(Of course, since you know you are only running one thread you can tune
for that.)

I would like to do two things:
1 - Find a way to keep the requests more ordered in the client. (I wish I
    hadn't lost the patches. They were at least a starting point for this.)
2 - Find a way to make the FreeBSD server file systems less sensitive to
    ordering of I/O requests.
    --> Out of order for the nfsd doesn't imply random access.

Have fun with it, rick

> I also use:
>
> - no PREEMPTION and no IPI_PREEMPTION on SMP systems. This limits context
>   switching.
> - no SCHED_ULE. HZ = 100. This also limits context switching.
>
> With more or fairer context switching, all nfsiods are more likely to run,
> causing more damage.
>
> More detailed result for iozone 1 65536 with nfsiodmax=64 and oldnfs and
> mostly best known other tuning:
>
> - first run write speed 2MB/S (probably still using 20)
>   (all rates use disk marketing MB)
> - second run 9MB/S
> - after repeated runs, 250MB/S
> - the speed kept mostly dropping, and reached 21K/S
> - server stats for next run at 29K/S: 139 blocks tested and order of
>   24 fixed (the server has an early version of what is in -current,
>   with more debugging)
>
> with nfsiodmax=20:
> - most runs 2-2.2MB/S; one at 750K/S
> - server stats for a run at 2.2MB/S: 135 blocks tested and 86 fixed
>
> with nfsiodmax=4:
> - 5.8-6.5MB/S
> - server stats for a run at 6.0MB/S: 135 blocks tested and 0 fixed
>
> with nfsiodmax=2:
> - 4.8-5.2MB/S
> - server stats for a run at 5.1MB/S: 138 blocks tested and 0 fixed
>
> with nfsiodmax=1:
> - 3.4MB/S
> - server stats: 138 blocks tested and 0 fixed
>
> For iozone 512 65536:
>
> with nfsiodmax=1:
> - 34.7MB/S
> - server stats: 65543 blocks tested and 0 fixed
>
> with nfsiodmax=2:
> - 45.9MB/S (this is close to the drive's speed and faster than direct on the
>   server.
>   It is faster because everything the clustering accidentally works
>   better)
> - server stats: 65550 blocks tested and 578 fixed
>
> with nfsiodmax=4:
> - 45.6MB/S
> - server stats: 65550 blocks tested and 2067 fixed
>
> with nfsiodmax=20:
> - 21.4MB/S
> - server stats: 65576 blocks tested and 12057 fixed
>   (it is easy to see how 7 nfsiods could give 1/7 = 14% of blocks
>   out of order. The server is fixing up almost 20%, but that is
>   not enough)
>
> with nfsiodmax=64 (caused server to not respond):
> - test aborted at 500+MB
> - server stats: about 10000 blocks fixed
>
> with nfsiodmax=64 again:
> - 9.6MB/S
> - server stats: 65598 blocks tested and 14034 fixed
>
> The nfsiod's get scheduled almost equally.
>
> Bruce
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@freebsd.org Sun Feb 28 04:39:02 2016
Date: Sun, 28 Feb 2016 15:38:58 +1100 (EST)
From: Bruce Evans
To: Rick Macklem
cc: fs@freebsd.org
Subject: Re: silly write caching in nfs3
In-Reply-To: <813250886.11776455.1456622740632.JavaMail.zimbra@uoguelph.ca>
Message-ID: <20160228143858.W2389@besplex.bde.org>
References: <20160226164613.N2180@besplex.bde.org>
 <1403082388.11082060.1456545103011.JavaMail.zimbra@uoguelph.ca>
 <20160227153409.W1735@besplex.bde.org>
 <813250886.11776455.1456622740632.JavaMail.zimbra@uoguelph.ca>

On Sat, 27 Feb 2016, Rick Macklem wrote:

> Bruce Evans wrote:
>> ...
>> I blamed newnfs before :-), but when I looked at newnfs more closely I
>> found that it was almost the same lexically in the most interesting
>> places (but unfortunately has lexical differences from s/nfs/ncl/,
>> and doesn't have enough of these differences for debugging --
>> debugging is broken by having 2 static functions named nfs_foo() for
>> many values of foo).
>> But newnfs seems to have always been missing this
>> critical code:
>>
>> X   1541 rgrimes int
>> X  83651 peter   nfs_writerpc(struct vnode *vp, struct uio *uiop, struct ucred *cred,
>> X 158739 mohans      int *iomode, int *must_commit)
>> X   1541 rgrimes {
>> X   9336 dfr         if (v3) {
>> X   9336 dfr             wccflag = NFSV3_WCCCHK;
>> X ...
>> X 158739 mohans      }
>> X 158739 mohans      if (wccflag) {
>> X 158739 mohans          mtx_lock(&(VTONFS(vp))->n_mtx);
>> X 158739 mohans          VTONFS(vp)->n_mtime = VTONFS(vp)->n_vattr.va_mtime;
>>                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> X 158739 mohans          mtx_unlock(&(VTONFS(vp))->n_mtx);
>> X 158739 mohans      }
>>
> Well, this code does exist in the new client. The function is nfsrpc_writerpc()
> found in sys/fs/nfsclient/nfs_clrpcops.c. It calls nfscl_wcc_data() which does
> the same test as nfsm_wcc_data_xx() did in the old client.
> Then NFSWRITERPC_SETTIME() sets the time if wccflag was set.
>
> However...NFSWRITERPC_SETTIME() is broken and sets n_mtime from the one in
> the nfs vnode (same as the old code). Unfortunately the cached value in the nfs
> vnode hasn't been updated at that point. (The RPC layer functions don't do cache
> stuff.)
>
> The attached patch fixes this. Please test with the attached patch.

This passes my tests. newnfs now works slightly better than oldnfs in
FreeBSD-10 (higher throughput in re-read test and fewer read RPCs in
compile tests).

The negative regression in read RPCs might be a bug somewhere. In a larger
compile test FreeBSD-11 has the opposite problem of many more read RPCs.
I haven't started isolating this. Previous tests isolated the following
regressions in the number of RPCs:
- r208602:208603 gives more correct attribute cache clearing but more RPCs
- r209947:209948 gives the same
- r247115:247116 gives a 5-10% increase in sys time for makeworld by doing
  lots of locking in a slow way (thread_lock() in sigdefer/allowstop()).
  Removing the thread locking and doing racy accesses to td_flags reduces
  this overhead to below 5%. Removing the calls to sigdefer/allowstop()
  reduces it to a hard-to-measure amount (this uses even more flags in
  VFS_PROLOGUE/EPILOGUE() and doesn't remove the overhead for other file
  systems, but branch prediction apparently works well unless there are
  function calls there).

For building a single kernel, in FreeBSD-[8-10] the RPC counts are
down by about 10% relative to my reference version (which has an old
implementation of negative name caching and cto improvements and a
hack to work around broken dotdot namecaching), but for makeworld they
are up by about 5%. The increase is probably another namecache or
dirents bug, or just from different cache timeouts.

> ...
>> I tried this to see if it would fix the unordered writes. I didn't
>> expect it to do much because I usually only have a single active
>> client and a single active writer per file. It didn't make much
>> difference.

Today I tested on a 2-core system. The loss from extra nfsiods was
much smaller than on a faster 8-core system with each core 2-3 times
as fast and a slower (max throughput 28MB/S down from 70) but slightly
lower latency network (ping latency 63 usec down from 75) (no loss up to
a file size of about 32MB).

> ...
>>> The patches were all client side. Maybe I'll try and recreate them.
>>
>> It seems to require lots of communication between separate nfsiods to
>> even preserve an order that has carefully been set up for them. If
>> you have this then it is unclear why it can't be done more simply using
>> a single nfsiod thread (per NIC or ifq). Only 1 thread should talk to
>> the NIC/ifq, since you lose control if you put other threads in between.
>> If the NIC/ifq uses multiple threads then maintaining the order is its
>> problem.
>
> Yes. The patches I had didn't guarantee complete ordering in part because the
> one was for the nfsiods and the other for the RPC request in the krpc code.
> I will try and redo them one of these days.
>
> I will mention that the NFS RFCs have no requirement nor recommendation for
> ordering. The RPCs are assumed to be atomic entities. Having said that, I
> do believe that FreeBSD servers will perform better when the reads/writes are
> in sequentially increasing byte order, so this is worth working on.

FreeBSD servers do lots of clustering, but not very well.

Write clustering is easier than read clustering since it is reasonable to
wait a bit to combine writes.

A single client with a single writer is an easy case. For a single client
with multiple writers, I think the nfsiods should try to keep the writes
sequential for each writer but not across writers. This gives an order
similar to local writers. FreeBSD servers (except maybe zfs ones) don't
handle multiple writers well, but an nfs client can't be expected to
understand the server's deficiencies better than the server does so that
they can be avoided.

Bruce

From owner-freebsd-fs@freebsd.org Sun Feb 28 22:34:34 2016
Date: Sun, 28 Feb 2016 17:34:25 -0500 (EST)
From: Rick Macklem
To: Bruce Evans
Cc: fs@freebsd.org
Message-ID: <631790565.12474657.1456698865377.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <20160228143858.W2389@besplex.bde.org>
References: <20160226164613.N2180@besplex.bde.org>
 <1403082388.11082060.1456545103011.JavaMail.zimbra@uoguelph.ca>
 <20160227153409.W1735@besplex.bde.org>
 <813250886.11776455.1456622740632.JavaMail.zimbra@uoguelph.ca>
 <20160228143858.W2389@besplex.bde.org>
Subject: Re: silly write caching in nfs3

Bruce Evans wrote:
> On Sat, 27 Feb 2016, Rick Macklem wrote:
>
> > Bruce Evans wrote:
> >> ...
> >> I blamed newnfs before :-), but when I looked at newnfs more closely I
> >> found that it was almost the same lexically in the most interesting
> >> places (but unfortunately has lexical differences from s/nfs/ncl/,
> >> and doesn't have enough of these differences for debugging --
> >> debugging is broken by having 2 static functions named nfs_foo() for
> >> many values of foo). But newnfs seems to have always been missing this
> >> critical code:
> >>
> >> X   1541 rgrimes int
> >> X  83651 peter   nfs_writerpc(struct vnode *vp, struct uio *uiop, struct ucred *cred,
> >> X 158739 mohans      int *iomode, int *must_commit)
> >> X   1541 rgrimes {
> >> X   9336 dfr         if (v3) {
> >> X   9336 dfr             wccflag = NFSV3_WCCCHK;
> >> X ...
> >> X 158739 mohans      }
> >> X 158739 mohans      if (wccflag) {
> >> X 158739 mohans          mtx_lock(&(VTONFS(vp))->n_mtx);
> >> X 158739 mohans          VTONFS(vp)->n_mtime = VTONFS(vp)->n_vattr.va_mtime;
> >>                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >> X 158739 mohans          mtx_unlock(&(VTONFS(vp))->n_mtx);
> >> X 158739 mohans      }
> >>
> > Well, this code does exist in the new client. The function is nfsrpc_writerpc()
> > found in sys/fs/nfsclient/nfs_clrpcops.c. It calls nfscl_wcc_data() which does
> > the same test as nfsm_wcc_data_xx() did in the old client.
> > Then NFSWRITERPC_SETTIME() sets the time if wccflag was set.
> >
> > However...NFSWRITERPC_SETTIME() is broken and sets n_mtime from the one in
> > the nfs vnode (same as the old code). Unfortunately the cached value in the nfs
> > vnode hasn't been updated at that point. (The RPC layer functions don't do cache
> > stuff.)
> >
> > The attached patch fixes this. Please test with the attached patch.
>
> This passes my tests. newnfs now works slightly better than oldnfs in
> FreeBSD-10 (higher throughput in re-read test and fewer read RPCs in
> compile tests).
>
Thanks for testing it. I can't do commits until mid-April, but will commit it
then.
> The negative regression in read RPCs might be a bug somewhere. In a larger
> compile test FreeBSD-11 has the opposite problem of many more read RPCs.
Hmm. There is very little difference between the FreeBSD-10 and FreeBSD-11
NFS code. (I have MFC'd most all of the changes.) Did you happen to look and
see if the buffer cache was about the same size for both -10 and -11?
> I haven't started isolating this. Previous tests isolated the following
> regressions in the number of RPCs:
> - r208602:208603 gives more correct attribute cache clearing but more RPCs
> - r209947:209948 gives the same
> - r247115:247116 gives a 5-10% increase in sys time for makeworld by doing
>   lots of locking in a slow way (thread_lock() in sigdefer/allowstop()).
>   Removing the thread locking and doing racy accesses to td_flags reduces
>   this overhead to below 5%. Removing the calls to sigdefer/allowstop()
>   reduces it to a hard-to-measure amount (this uses even more flags in
>   VFS_PROLOGUE/EPILOGUE() and doesn't remove the overhead for other file
>   systems, but branch prediction apparently works well unless there are
>   function calls there).
>
> For building a single kernel, in FreeBSD-[8-10] the RPC counts are
> down by about 10% relative to my reference version (which has an old
> implementation of negative name caching and cto improvements and a
> hack to work around broken dotdot namecaching), but for makeworld they
> are up by about 5%. The increase is probably another namecache or
> dirents bug, or just from different cache timeouts.
>
> > ...
> >> I tried this to see if it would fix the unordered writes. I didn't
> >> expect it to do much because I usually only have a single active
> >> client and a single active writer per file. It didn't make much
> >> difference.
>
> Today I tested on a 2-core system. The loss from extra nfsiods was
> much smaller than on a faster 8-core system with each core 2-3 times
> as fast and a slower (max throughput 28MB/S down from 70) but slightly
> lower latency network (ping latency 63 usec down from 75) (no loss up to
> a file size of about 32MB).
>
> > ...
> >>> The patches were all client side. Maybe I'll try and recreate them.
> >>
> >> It seems to require lots of communication between separate nfsiods to
> >> even preserve an order that has carefully been set up for them. If
> >> you have this then it is unclear why it can't be done more simply using
> >> a single nfsiod thread (per NIC or ifq). Only 1 thread should talk to
> >> the NIC/ifq, since you lose control if you put other threads in between.
> >> If the NIC/ifq uses multiple threads then maintaining the order is its
> >> problem.
> >
> > Yes.
> > The patches I had didn't guarantee complete ordering in part because the
> > one was for the nfsiods and the other for the RPC request in the krpc code.
> > I will try and redo them one of these days.
> >
> > I will mention that the NFS RFCs have no requirement nor recommendation for
> > ordering. The RPCs are assumed to be atomic entities. Having said that, I
> > do believe that FreeBSD servers will perform better when the reads/writes are
> > in sequentially increasing byte order, so this is worth working on.
I will try and come up with another patch to try and improve ordering.
>
> FreeBSD servers do lots of clustering, but not very well.
>
> Write clustering is easier than read clustering since it is reasonable to
> wait a bit to combine writes.
>
> A single client with a single writer is an easy case. For a single client
> with multiple writers, I think the nfsiods should try to keep the writes
> sequential for each writer but not across writers. This gives an order
> similar to local writers. FreeBSD servers (except maybe zfs ones) don't
> handle multiple writers well, but an nfs client can't be expected to
> understand the server's deficiencies better than the server does so that
> they can be avoided.
>
The krpc just sees RPC requests (it doesn't really know they are even NFS,
although they are for the most part now). As such, the patch I did (and will
probably try and reproduce) just tried to keep the order (i.e. retain the
order that the krpc call got the RPC requests in). When I did a little
testing, the nfsiods didn't appear to be causing much reordering, but I
should take another look at this.
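
To make the "retain the order that the krpc call got the RPC requests in"
idea concrete, here is a minimal sketch. It is an illustration only, not the
lost patch and not kernel code: Python is used for brevity, the names
count_out_of_order and OrderedSender are hypothetical, and the "out of order"
test (a write that does not start where the previous one ended) is an
assumption, not necessarily what the server's "blocks tested / fixed"
counters measure. Each request takes a ticket when it is handed over, and a
finished request is only released once every earlier ticket has gone out.

```python
import heapq


def count_out_of_order(offsets, block_size):
    """Count writes that do not start where the previous write ended."""
    out_of_order = 0
    expected = None
    for off in offsets:
        if expected is not None and off != expected:
            out_of_order += 1
        expected = off + block_size
    return out_of_order


class OrderedSender:
    """Release requests in submission order (hypothetical helper)."""

    def __init__(self):
        self._next_ticket = 0    # ticket given to the next submitted request
        self._next_to_send = 0   # ticket that must go on the wire next
        self._ready = []         # min-heap of (ticket, request) waiting their turn

    def submit(self):
        """Hand out a ticket recording the order in which requests arrived."""
        ticket = self._next_ticket
        self._next_ticket += 1
        return ticket

    def complete(self, ticket, request):
        """A worker finished `request`; return the requests now sendable."""
        heapq.heappush(self._ready, (ticket, request))
        sendable = []
        while self._ready and self._ready[0][0] == self._next_to_send:
            sendable.append(heapq.heappop(self._ready)[1])
            self._next_to_send += 1
        return sendable


BLOCK = 8192
offsets = [i * BLOCK for i in range(8)]      # a sequential 64K write
finish_order = [2, 0, 1, 5, 3, 4, 7, 6]      # order in which workers finish

# Without any sequencing, the wire order is simply the finish order.
scrambled = [offsets[i] for i in finish_order]

# With tickets, requests are held back until their predecessors are sent.
sender = OrderedSender()
tickets = [sender.submit() for _ in offsets]
wire = []
for i in finish_order:
    wire.extend(sender.complete(tickets[i], offsets[i]))

print(count_out_of_order(scrambled, BLOCK), count_out_of_order(wire, BLOCK))
```

The cost of this approach is that one slow worker stalls everything queued
behind it, which is presumably part of why complete ordering was not
guaranteed before.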
Thanks for all the info, rick

> Bruce
>

From owner-freebsd-fs@freebsd.org Mon Feb 29 18:22:24 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 200513] Race at shutdown and corrupt fusefs systems
Date: Mon, 29 Feb 2016 18:22:24 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.1-RELEASE
X-Bugzilla-Status: New
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200513

Kurt Jaeger changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dpejesh@yahoo.com

--- Comment #1 from Kurt Jaeger ---
Maybe this script should become part of sysutils/fusefs-kmod ?

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Mon Feb 29 19:04:48 2016
Message-ID: <56D4964D.3010604@quip.cz>
Date: Mon, 29 Feb 2016 20:04:45 +0100
From: Miroslav Lachman <000.fbsd@quip.cz>
To: freebsd-fs@freebsd.org
Subject: abnormally high CPU load after zfs destroy

I am using a ZFS pool (4x 3TB) as small backup storage. Backups are made
by rsync and there are a few snapshots. When I use "zfs destroy -r",
there is high disk activity (which seems normal) but also a high CPU load
of 80+.

The system was doing nothing else at the time, just deleting an old ZFS
snapshot, so why is the load so high?

last pid: 90302;  load averages: 81.63, 43.60, 19.28    up 43+03:33:04  19:56:16
36 processes:  1 running, 34 sleeping, 1 zombie
CPU:  0.0% user,  0.0% nice, 96.0% system,  0.0% interrupt,  4.0% idle
Mem: 5836K Active, 20M Inact, 4046M Wired, 755M Free
ARC: 1572M Total, 82M MFU, 1018M MRU, 24M Anon, 26M Header, 422M Other
Swap: 5120M Total, 17M Used, 5103M Free

  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
  592 root         1  20    0 26160K 18080K select  1   3:24   0.00% /usr/sbin/ntpd -g -c /etc/ntp.conf -p /var/run/ntpd.pid -f /var/db/ntpd.drift
  624 root         1  20    0 61224K  4204K select  1   2:20   0.00% /usr/sbin/sshd
  672 root         1  20    0 24136K  4592K select  0   1:10   0.00% sendmail: rejecting connections on daemon Daemon0: load average: 77 (sendmail)
  655 root         1  20    0 25124K  4148K select  0   1:01   0.00% /usr/sbin/bsnmpd -p /var/run/snmpd.pid
  443 root         1  20    0 14512K  1760K select  1   0:43   0.00% /usr/sbin/syslogd -ss
  679 root         1  22    0 16612K   672K nanslp  1   0:32   0.00% /usr/sbin/cron -s
95875 xyz          1  20    0 65892K  4708K select  1   0:05   0.00% sshd: xyz @pts/0 (sshd)
  649 root         1  20    0 30704K  1432K nanslp  1   0:04   0.00% /usr/local/sbin/smartd -c /usr/local/etc/smartd.conf -p /var/run/smartd.pid
  352 root         1  20    0 13624K  1204K select  0   0:02   0.00% /sbin/devd
95873 root         1  20    0 65892K  4580K select  1   0:02   0.00% sshd: xyz [priv] (sshd)
95912 root         1  20    0 25772K   652K pause   1   0:02   0.00% screen
  675 smmsp        1  20    0 24136K  1152K pause   0   0:01   0.00% sendmail: Queue runner@00:30:00 for /var/spool/clientmqueue (sendmail)
89875 root         1  20    0 21940K  3188K CPU1    1   0:00   0.00% top
95914 root         1  20    0 23592K  3152K pause   0   0:00   0.00% -/bin/tcsh
95913 root         1  20    0 25772K  3192K select  1   0:00   0.00% screen
95895 root         1  52    0 23592K     0K pause   0   0:00   0.00% -su ()
95876 xyz          1  52    0 23592K     0K pause   1   0:00   0.00% -tcsh ()
95894 xyz          1  23    0 47740K     0K wait    0   0:00   0.00% /usr/bin/su - root ()
90271 mrtg         1  52    0 17088K  2508K wait    1   0:00   0.00% /bin/sh ./local_iostat_disk.sh
89976 root         1  21    0 16612K  1704K piperd  1   0:00   0.00% cron: running job (cron)
90270 mrtg         1  52    0 17088K  2504K piperd  1   0:00   0.00% /bin/sh ./local_iostat_cpu.sh
  726 root         1  52    0 14508K  1700K ttyin   0   0:00   0.00% /usr/libexec/getty Pc ttyv0
  733 root         1  52    0 14508K  1700K ttyin   1   0:00   0.00% /usr/libexec/getty Pc ttyv7
  728 root         1  52    0 14508K  1700K ttyin   1   0:00   0.00% /usr/libexec/getty Pc ttyv2
  727 root         1  52    0 14508K  1700K ttyin   1   0:00   0.00% /usr/libexec/getty Pc ttyv1
  729 root         1  52    0 14508K  1700K ttyin   0   0:00   0.00% /usr/libexec/getty Pc ttyv3
  731 root         1  52    0 14508K  1700K ttyin   0   0:00   0.00% /usr/libexec/getty Pc ttyv5
  730 root         1  52    0 14508K  1700K ttyin   1   0:00   0.00% /usr/libexec/getty Pc ttyv4
  732 root         1  52    0 14508K  1700K ttyin   0   0:00   0.00% /usr/libexec/getty Pc ttyv6
90286 mrtg         1  52    0 18740K  2236K nanslp  1   0:00   0.00% iostat -w 250 -c 2 -x ada0 ada1 ada2 ada3
90283 mrtg         1  52    0 18740K  2228K nanslp  1   0:00   0.00% iostat -d -C -n 0 -w 240 -c 2
90287 mrtg         1  52    0 12356K  1952K piperd  1   0:00   0.00% tail -n 4
  137 root         1  52    0 12352K     0K pause   1   0:00   0.00% adjkerntz -i ()
90288 mrtg         1  52    0 17088K  2508K piperd  0   0:00   0.00% /bin/sh ./local_iostat_disk.sh
 3313 root         1  52    0 16728K  1884K select  1   0:00   0.00% /usr/sbin/moused -p /dev/ums0 -t auto -I /var/run/moused.ums0.pid

# uname -srmi
FreeBSD 10.2-RELEASE-p10 amd64 GENERIC

# grep CPU: /var/run/dmesg.boot
CPU: Intel(R) Pentium(R) Dual CPU E2160 @ 1.80GHz (1795.53-MHz K8-class CPU)

Miroslav Lachman

From owner-freebsd-fs@freebsd.org Mon Feb 29 21:21:41 2016
Subject: Re: abnormally high CPU load after zfs destroy
To: freebsd-fs@freebsd.org
References: <56D4964D.3010604@quip.cz>
From: Steven Hartland
Message-ID: <56D4B66D.4070007@multiplay.co.uk>
Date: Mon, 29 Feb 2016 21:21:49 +0000
In-Reply-To: <56D4964D.3010604@quip.cz>

It's likely churning through the actual delete; show system processes in
top and you'll see it.

On 29/02/2016 19:04, Miroslav Lachman wrote:
> I am using a ZFS pool (4x 3TB) as small backup storage. Backups are made
> by rsync and there are a few snapshots. When I use "zfs destroy -r",
> there is high disk activity (which seems normal) but also a high CPU load
> of 80+.
>
> The system was doing nothing else at the time, just deleting an old ZFS
> snapshot, so why is the load so high?
> > last pid: 90302; load averages: 81.63, 43.60, 19.28 up > 43+03:33:04 19:56:16 > 36 processes: 1 running, 34 sleeping, 1 zombie > CPU: 0.0% user, 0.0% nice, 96.0% system, 0.0% interrupt, 4.0% idle > Mem: 5836K Active, 20M Inact, 4046M Wired, 755M Free > ARC: 1572M Total, 82M MFU, 1018M MRU, 24M Anon, 26M Header, 422M Other > Swap: 5120M Total, 17M Used, 5103M Free > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU > COMMAND > 592 root 1 20 0 26160K 18080K select 1 3:24 0.00% > /usr/sbin/ntpd -g -c /etc/ntp.conf -p /var/run/ntpd.pid -f > /var/db/ntpd.drift > 624 root 1 20 0 61224K 4204K select 1 2:20 0.00% > /usr/sbin/sshd > 672 root 1 20 0 24136K 4592K select 0 1:10 0.00% > sendmail: rejecting connections on daemon Daemon0: load average: 77 > (sendmail) > 655 root 1 20 0 25124K 4148K select 0 1:01 0.00% > /usr/sbin/bsnmpd -p /var/run/snmpd.pid > 443 root 1 20 0 14512K 1760K select 1 0:43 0.00% > /usr/sbin/syslogd -ss > 679 root 1 22 0 16612K 672K nanslp 1 0:32 0.00% > /usr/sbin/cron -s > 95875 xyz 1 20 0 65892K 4708K select 1 0:05 0.00% > sshd: xyz @pts/0 (sshd) > 649 root 1 20 0 30704K 1432K nanslp 1 0:04 0.00% > /usr/local/sbin/smartd -c /usr/local/etc/smartd.conf -p > /var/run/smartd.pid > 352 root 1 20 0 13624K 1204K select 0 0:02 0.00% > /sbin/devd > 95873 root 1 20 0 65892K 4580K select 1 0:02 0.00% > sshd: xyz [priv] (sshd) > 95912 root 1 20 0 25772K 652K pause 1 0:02 0.00% > screen > 675 smmsp 1 20 0 24136K 1152K pause 0 0:01 0.00% > sendmail: Queue runner@00:30:00 for /var/spool/clientmqueue (sendmail) > 89875 root 1 20 0 21940K 3188K CPU1 1 0:00 0.00% top > 95914 root 1 20 0 23592K 3152K pause 0 0:00 0.00% > -/bin/tcsh > 95913 root 1 20 0 25772K 3192K select 1 0:00 0.00% > screen > 95895 root 1 52 0 23592K 0K pause 0 0:00 0.00% > -su () > 95876 xyz 1 52 0 23592K 0K pause 1 0:00 0.00% > -tcsh () > 95894 xyz 1 23 0 47740K 0K wait 0 0:00 0.00% > /usr/bin/su - root () > 90271 mrtg 1 52 0 17088K 2508K wait 1 0:00 0.00% > /bin/sh ./local_iostat_disk.sh 
> 89976 root       1  21    0 16612K  1704K piperd 1   0:00  0.00% cron: running job (cron)
> 90270 mrtg       1  52    0 17088K  2504K piperd 1   0:00  0.00% /bin/sh ./local_iostat_cpu.sh
>   726 root       1  52    0 14508K  1700K ttyin  0   0:00  0.00% /usr/libexec/getty Pc ttyv0
>   733 root       1  52    0 14508K  1700K ttyin  1   0:00  0.00% /usr/libexec/getty Pc ttyv7
>   728 root       1  52    0 14508K  1700K ttyin  1   0:00  0.00% /usr/libexec/getty Pc ttyv2
>   727 root       1  52    0 14508K  1700K ttyin  1   0:00  0.00% /usr/libexec/getty Pc ttyv1
>   729 root       1  52    0 14508K  1700K ttyin  0   0:00  0.00% /usr/libexec/getty Pc ttyv3
>   731 root       1  52    0 14508K  1700K ttyin  0   0:00  0.00% /usr/libexec/getty Pc ttyv5
>   730 root       1  52    0 14508K  1700K ttyin  1   0:00  0.00% /usr/libexec/getty Pc ttyv4
>   732 root       1  52    0 14508K  1700K ttyin  0   0:00  0.00% /usr/libexec/getty Pc ttyv6
> 90286 mrtg       1  52    0 18740K  2236K nanslp 1   0:00  0.00% iostat -w 250 -c 2 -x ada0 ada1 ada2 ada3
> 90283 mrtg       1  52    0 18740K  2228K nanslp 1   0:00  0.00% iostat -d -C -n 0 -w 240 -c 2
> 90287 mrtg       1  52    0 12356K  1952K piperd 1   0:00  0.00% tail -n 4
>   137 root       1  52    0 12352K     0K pause  1   0:00  0.00% adjkerntz -i ()
> 90288 mrtg       1  52    0 17088K  2508K piperd 0   0:00  0.00% /bin/sh ./local_iostat_disk.sh
>  3313 root       1  52    0 16728K  1884K select 1   0:00  0.00% /usr/sbin/moused -p /dev/ums0 -t auto -I /var/run/moused.ums0.pid
>
> # uname -srmi
> FreeBSD 10.2-RELEASE-p10 amd64 GENERIC
>
> # grep CPU: /var/run/dmesg.boot
> CPU: Intel(R) Pentium(R) Dual CPU E2160 @ 1.80GHz (1795.53-MHz K8-class CPU)
>
> Miroslav Lachman
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@freebsd.org Mon Feb 29 21:45:06 2016
From: Ken Merry
To: fs@freebsd.org
Cc: scsi@freebsd.org
Subject: FUSE extended attribute patches available
Date: Mon, 29 Feb 2016 16:45:03 -0500

I have patches for
FreeBSD's FUSE filesystem kernel module to support extended attributes:

https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt

The patch implements the get/set/delete/list extended attribute methods.
The listing code also converts extended attribute lists from the
Linux/FUSE format to the FreeBSD format.

For example:

# touch foo
# ls -la foo
-rwxrwxrwx 1 root wheel 0 Feb 29 21:40 foo
# lsextattr user foo
foo
# setextattr user testattr1 "12345678" foo
# lsextattr user foo
foo testattr1
# getextattr user testattr1 foo
foo 12345678
# setextattr user testattr2 "87654321" foo
# lsextattr user foo
foo testattr2 testattr1
# rmextattr user testattr1 foo
# lsextattr user foo
foo testattr2
# getextattr user testattr1 foo
getextattr: foo: failed: Attribute not found
# getextattr user testattr2 foo
foo 87654321

Just to be clear on what this does, it only provides extended attribute
support to FreeBSD applications if the underlying FUSE filesystem
implements FUSE extended attribute support. Many FUSE filesystems
don't support the extended attribute VFS operations.

I have tested this out on IBM's LTFS implementation, but I have not yet
found another FUSE filesystem that supports extended attributes. If
anyone knows of one, please let me know so I can try it out. (I looked
through a number of the filesystems in sysutils/fusefs* in the ports
tree.)

Any feedback is welcome. I'm planning to check this into FreeBSD/head
in the next week or so.

Obviously, I've also ported IBM's LTFS implementation to FreeBSD. It
works in the standard FUSE mode, and you can also link it into an
application as a library if you don't want to incur the overhead of
running through FUSE. I haven't gotten around to packaging it up to go
out for testing / review.

If anyone has IBM LTO-5 or newer tape drives, or IBM TS1140 or newer
tape drives, and wants to try it out, let me know.
I'll send you the code when I've got it at least somewhat ready. This is
IBM-specific, and won't work on HP tape drives.

Ken
--
Ken Merry
ken@FreeBSD.ORG

From owner-freebsd-fs@freebsd.org Tue Mar 1 00:02:39 2016
Date: Mon, 29 Feb 2016 19:02:28 -0500 (EST)
From: Rick Macklem
To: Ken Merry
Cc: fs@freebsd.org, scsi@freebsd.org
Message-ID: <1740288370.14765302.1456790548437.JavaMail.zimbra@uoguelph.ca>
Subject: Re: FUSE extended attribute patches available

Ken Merry wrote:
> I have patches for FreeBSD's FUSE filesystem kernel module to support
> extended attributes:
>
> https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
>
> The patch implements the get/set/delete/list extended attribute methods. The
> listing code also converts extended attribute lists from the Linux/FUSE
> format to the FreeBSD format.
I also have patches, although my list didn't work. (I didn't know that there was
a difference between what Linux/FUSE returns vs what FreeBSD wanted. So now I know
why my "list" didn't work.)
Btw, when I discussed what to do w.r.t. extended attribute namespace, he seemed to
think just considering all the fuse ones as User was ok.

I also have patched FreeBSD's Fuse for the other stuff needed to export the fuse
mount via an NFS server. (VFS_FHTOVP() etc) These aren't quite ready for "prime
time" but if anyone wants to try them, just email me and I'll send you a copy.

Btw, I have a couple of patches related to direct I/O and buffered I/O. You can
find 2 of these at PR#206238. I also patched fuse_vnop_inactive() to flush/write
dirty pages. (I think this is needed when an application mmaps a file and then
modifies it after closing the file descriptor. I haven't actually tested to see
if this fix is needed, so I haven't put it anywhere yet.)
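
For readers following the list-format point above, here is a minimal sketch
of the kind of conversion involved. It is not taken from either patch. It
assumes a Linux/FUSE listxattr-style buffer (NUL-terminated names that carry
a "user." or "system." prefix) and the FreeBSD extattr_list_file(2) layout
(each name preceded by a one-byte length, no terminator, no namespace
prefix). The function name and the MAX_NAME_LEN constant are made up for
illustration.

```python
# Hypothetical helper for illustration only; not code from the FUSE patch.
MAX_NAME_LEN = 255  # assumed limit: the largest value a one-byte length holds


def linux_list_to_freebsd(buf: bytes, prefix: bytes = b"user.") -> bytes:
    """Convert a Linux/FUSE-style xattr name list to the FreeBSD layout.

    Input:  b"user.a\\0user.b\\0..."  (NUL-terminated, namespace-prefixed)
    Output: b"\\x01a\\x01b..."        (length byte + bare name per entry)
    Names outside the requested namespace prefix are skipped.
    """
    out = bytearray()
    for full_name in buf.split(b"\0"):
        if not full_name.startswith(prefix):
            continue  # other namespace, or the empty tail after the last NUL
        name = full_name[len(prefix):]
        if not name:
            continue  # a bare prefix with no attribute name
        if len(name) > MAX_NAME_LEN:
            raise ValueError("attribute name too long for a one-byte length")
        out.append(len(name))  # one-byte length prefix
        out += name            # name without terminator or namespace
    return bytes(out)
```

With the default prefix, a buffer holding user.testattr2 and user.testattr1
comes out as two entries, each a length byte of 9 followed by the bare name;
names in other namespaces are dropped from that listing.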
> For example:
>
> # touch foo
> # ls -la foo
> -rwxrwxrwx 1 root wheel 0 Feb 29 21:40 foo
> # lsextattr user foo
> foo
> # setextattr user testattr1 "12345678" foo
> # lsextattr user foo
> foo testattr1
> # getextattr user testattr1 foo
> foo 12345678
> # setextattr user testattr2 "87654321" foo
> # lsextattr user foo
> foo testattr2 testattr1
> # rmextattr user testattr1 foo
> # lsextattr user foo
> foo testattr2
> # getextattr user testattr1 foo
> getextattr: foo: failed: Attribute not found
> # getextattr user testattr2 foo
> foo 87654321
>
>
> Just to be clear on what this does, it only provides extended attribute
> support to FreeBSD applications if the underlying FUSE filesystem implements
> FUSE extended attribute support. Many FUSE filesystems don't support the
> extended attribute VFS operations.
>
> I have tested this out on IBM's LTFS implementation, but I have not yet found
> another FUSE filesystem that supports extended attributes. If anyone knows
> of one, please let me know so I can try it out. (I looked through a number
> of the filesystems in sysutils/fusefs* in the ports tree.)
>
The FreeBSD GlusterFS port includes a fuse interface that supports extended
attributes. That is how I tested what I have. (I think the port is now in
svn, but if not you can find the GlusterFS port at PR#194409.)
The glusterfs.org doc doesn't mention that you can create/test a volume of
only one brick, but that works for trivial testing. If you decide to use it
for testing and have trouble getting it going, just email. I know diddly about
GlusterFS, but I have fired it up a few times.

> Any feedback is welcome. I'm planning to check this into FreeBSD/head in the
> next week or so.
>
I'll try to get around to taking a look to see if there is anything different
than what I did (other than the above "list" case I didn't get right;-).

rick

> Obviously, I've also ported IBM's LTFS implementation to FreeBSD. It works
> in the standard FUSE mode, and you can also link it into an application as a
> library if you don't want to incur the overhead of running through FUSE. I
> haven't gotten around to packaging it up to go out for testing / review.
>
> If anyone has IBM LTO-5 or newer tape drives, or IBM TS1140 or newer tape
> drives, and wants to try it out, let me know. I'll send you the code when
> I've got it at least somewhat ready. This is IBM-specific, and won't work
> on HP tape drives.
>
> Ken
> --
> Ken Merry
> ken@FreeBSD.ORG

From owner-freebsd-fs@freebsd.org Tue Mar 1 00:06:58 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 200513] Race at shutdown and corrupt fusefs systems
Date: Tue, 01 Mar 2016 00:06:58 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.1-RELEASE
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: rmacklem@FreeBSD.org
X-Bugzilla-Status: New

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200513

Rick Macklem changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rmacklem@FreeBSD.org

--- Comment #2 from Rick Macklem ---
Well, for FreeBSD10 (and later) the fuse module is in the kernel and not a
port. As such, I think the rc scripts should be fixed to handle this.
(I am not an rc.d guy, so I do not know if this script is the best way to
fix things.)

--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Tue Mar 1 06:07:13 2016
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 200513] Race at shutdown and corrupt fusefs systems
Date: Tue, 01 Mar 2016 06:07:13 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.1-RELEASE
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: rkoberman@gmail.com
X-Bugzilla-Status: New

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200513

--- Comment #3 from rkoberman@gmail.com ---
Minor detail... under 10 (and head) fuse is built as a kernel module. It is not
in GENERIC. The kmod is only used on 9. There are also significant differences
in user-side code in 9. I am not confident that this is needed there. I only
ran into the corruption issue after I started using the base-system patches
while they were under test and development. (I had a LOT of other issues with
the old fuse, so this might have been hidden by other issues.)

Because of this, I'm not at all sure that it belongs in the kmod port, though it
might. I think that it does belong in the base system /etc/rc.d in 10 and head.

From owner-freebsd-fs@freebsd.org Tue Mar 1 15:11:19 2016
Date: Tue, 1 Mar 2016 10:11:10 -0500
From: "Kenneth D. Merry"
To: Rick Macklem
Cc: fs@freebsd.org, scsi@freebsd.org
Subject: Re: FUSE extended attribute patches available
Message-ID: <20160301151109.GA79912@mithlond.kdm.org>
References: <1740288370.14765302.1456790548437.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <1740288370.14765302.1456790548437.JavaMail.zimbra@uoguelph.ca>

On Mon, Feb 29, 2016 at 19:02:28 -0500, Rick Macklem wrote:
> Ken Merry wrote:
> > I have patches for FreeBSD's FUSE filesystem kernel module to support
> > extended attributes:
> >
> > https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
> >
> > The patch implements the get/set/delete/list extended attribute methods.
> > The listing code also converts extended attribute lists from the Linux/FUSE
> > format to the FreeBSD format.
> I also have patches, although my list didn't work. (I didn't know that there was
> a difference between what Linux/FUSE returns vs what FreeBSD wanted. So now I know
> why my "list" didn't work.)
> Btw, when I discussed what to do w.r.t. extended attribute namespace, he seemed to
> think just considering all the fuse ones as User was ok.

Ahh. Who is "he" in this context?

It was easy enough to allow access to the user and system namespace. FUSE
and Linux use the same names, so might as well pass them through. The
other issue as far as FUSE goes, is that it is expecting the "user." or
"system." prefix on the attribute name.

FreeBSD passes those as a separate, numeric, argument in the VFS layer at
least, and expects to not see the namespace as a prefix when listing
attributes.

> I also have patched FreeBSD's Fuse for the other stuff needed to export the fuse
> mount via an NFS server. (VFS_FHTOVP() etc) These aren't quite ready for "prime time"
> but if anyone wants to try them, just email me and I'll send you a copy.
>
> Btw, I have a couple of patches related to direct I/O and buffered I/O. You can
> find 2 of these at PR#206238. I also patched fuse_vnop_inactive() to flush/write
> dirty pages. (I think this is needed when an application mmaps a file and then
> modifies it after closing the file descriptor. I haven't actually tested to see
> if this fix is needed, so I haven't put it anywhere yet.)

Ahh, cool! Now I can export a tape via NFS! :)

Seriously, though, that will be helpful for some filesystems.
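
The prefix-versus-numeric-namespace point above can be sketched as a pair of
small mapping helpers. This is an illustration only, not the patch's code:
the helper names are made up, and the numeric values are assumed to match
EXTATTR_NAMESPACE_USER and EXTATTR_NAMESPACE_SYSTEM in FreeBSD's
sys/extattr.h.

```python
# Hypothetical helpers for illustration only; not code from the FUSE patch.
# Numeric values assumed to match FreeBSD's sys/extattr.h.
EXTATTR_NAMESPACE_USER = 1
EXTATTR_NAMESPACE_SYSTEM = 2

_PREFIX = {
    EXTATTR_NAMESPACE_USER: "user.",
    EXTATTR_NAMESPACE_SYSTEM: "system.",
}


def to_fuse_name(namespace: int, name: str) -> str:
    """FreeBSD (namespace, bare name) -> prefixed name sent to FUSE."""
    return _PREFIX[namespace] + name


def from_fuse_name(fuse_name: str):
    """Prefixed FUSE name -> (namespace, bare name), or None if unprefixed."""
    for namespace, prefix in _PREFIX.items():
        if fuse_name.startswith(prefix) and len(fuse_name) > len(prefix):
            return namespace, fuse_name[len(prefix):]
    return None  # name carries neither the "user." nor the "system." prefix
```

Under this mapping, "setextattr user testattr1 ..." would reach the FUSE
daemon as user.testattr1, and a name that has neither prefix has no FreeBSD
namespace to land in, which is the case such a scheme has to decide how to
handle.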

> > For example:
> >
> > # touch foo
> > # ls -la foo
> > -rwxrwxrwx 1 root wheel 0 Feb 29 21:40 foo
> > # lsextattr user foo
> > foo
> > # setextattr user testattr1 "12345678" foo
> > # lsextattr user foo
> > foo testattr1
> > # getextattr user testattr1 foo
> > foo 12345678
> > # setextattr user testattr2 "87654321" foo
> > # lsextattr user foo
> > foo testattr2 testattr1
> > # rmextattr user testattr1 foo
> > # lsextattr user foo
> > foo testattr2
> > # getextattr user testattr1 foo
> > getextattr: foo: failed: Attribute not found
> > # getextattr user testattr2 foo
> > foo 87654321
> >
> >
> > Just to be clear on what this does, it only provides extended attribute
> > support to FreeBSD applications if the underlying FUSE filesystem implements
> > FUSE extended attribute support. Many FUSE filesystems don't support the
> > extended attribute VFS operations.
> >
> > I have tested this out on IBM's LTFS implementation, but I have not yet found
> > another FUSE filesystem that supports extended attributes. If anyone knows
> > of one, please let me know so I can try it out. (I looked through a number
> > of the filesystems in sysutils/fusefs* in the ports tree.)
> >
> The FreeBSD GlusterFS port includes a fuse interface that supports extended
> attributes. That is how I tested what I have. (I think the port is now in
> svn, but if not you can find the GlusterFS port at PR#194409.)
> The glusterfs.org doc doesn't mention that you can create/test a volume of
> only one brick, but that works for trivial testing. If you decide to use it
> for testing and have trouble getting it going, just email. I know diddly about
> GlusterFS, but I have fired it up a few times.

Ahh, that would be helpful. I'll give it a try.

> > Any feedback is welcome. I'm planning to check this into FreeBSD/head in the
> > next week or so.
> >
> I'll try to get around to taking a look to see if there is anything different
> than what I did (other than the above "list" case I didn't get right;-).

Yes, it would be good to get another set of eyes. Perhaps the namespace
handling (user versus system) is different as well.

Ken
--
Kenneth Merry
ken@FreeBSD.ORG

From owner-freebsd-fs@freebsd.org Tue Mar 1 17:17:58 2016
Message-ID: <56D5CEC1.2070000@quip.cz>
Date: Tue, 01 Mar 2016 18:17:53 +0100
From: Miroslav Lachman <000.fbsd@quip.cz>
To: Steven Hartland, freebsd-fs@freebsd.org
Subject: Re: abnormally high CPU load after zfs destroy
References: <56D4964D.3010604@quip.cz> <56D4B66D.4070007@multiplay.co.uk>
In-Reply-To: <56D4B66D.4070007@multiplay.co.uk>

Steven Hartland wrote on 02/29/2016 22:21:
> It's likely churning through the actual delete, show system processes in
> top and you'll see it.

Yes, there are about 300 kernel threads doing ZFS work, but should it
really bomb the system that way? The system is heavily lagging for about
10 minutes. Are there any sysctls to control this behavior?

Miroslav Lachman

From owner-freebsd-fs@freebsd.org Wed Mar 2 01:04:01 2016
Date: Tue, 1 Mar 2016 20:03:51 -0500 (EST)
From: Rick Macklem
To: "Kenneth D. Merry"
Cc: fs@freebsd.org, scsi@freebsd.org
Message-ID: <2079334495.1177596.1456880631768.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <20160301151109.GA79912@mithlond.kdm.org>
References: <1740288370.14765302.1456790548437.JavaMail.zimbra@uoguelph.ca> <20160301151109.GA79912@mithlond.kdm.org>
Subject: Re: FUSE extended attribute patches available

Kenneth D. Merry wrote:
> On Mon, Feb 29, 2016 at 19:02:28 -0500, Rick Macklem wrote:
> > Ken Merry wrote:
> > > I have patches for FreeBSD's FUSE filesystem kernel module to support
> > > extended attributes:
> > >
> > > https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
> > >
> > > The patch implements the get/set/delete/list extended attribute methods.
> > > The listing code also converts extended attribute lists from the
> > > Linux/FUSE format to the FreeBSD format.
> > I also have patches, although my list didn't work. (I didn't know that
> > there was a difference between what Linux/FUSE returns vs what FreeBSD
> > wanted. So now I know why my "list" didn't work.)
> > Btw, when I discussed what to do w.r.t. extended attribute namespace, he
> > seemed to think just considering all the fuse ones as User was ok.
>
> Ahh. Who is "he" in this context?
>
rwatson@
> It was easy enough to allow access to the user and system namespace. FUSE
> and Linux use the same names, so might as well pass them through.
The > other issue as far as FUSE goes, is that it is expecting the "user." or > "system." prefix on the attribute name. > Well, the extended attributes I am interested in are generated automagically by GlusterFS and they don't have a "user." or "system." prefix. For example: glusterfs.gfid This wouldn't work if fuse prepended the "user." or "system.". > FreeBSD passes those as a separate, numeric, argument in the VFS layer at > least, and expects to not see the namespace as a prefix when listing > attributes. > > > I also have patched FreeBSD's Fuse for the other stuff needed to export the > > fuse > > mount via an NFS server. (VFS_FHTOVP() etc) These aren't quite ready for > > "prime time" > > but if anyone wants to try them, just email me and I'll send you a copy. > > > > Btw, I have a couple of patches related to direct I/O and buffered I/O. You > > can > > find 2 of these at PR#206238. I also patched fuse_vnop_inactive() to > > flush/write > > dirty pages. (I think this is needed when an application mmaps a file and > > the > > modifies it after closing the file descriptor. I haven't actually tested to > > see > > if this fix is needed, so I haven't put it anywhere yet.) > > Ahh, cool! Now I can export a tape via NFS! :) > > Seriously, though, that will be helpful for some filesystems. > Btw, I have also done VOP_ADVLOCK(), since GlusterFS supports that. 
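
For anyone following the list-format discussion above, here is a rough
userland sketch of the conversion being described. It assumes the Linux/FUSE
side hands back NUL-terminated names that carry a "user." or "system." prefix,
and that FreeBSD wants each entry as a one-byte length followed by the bare
name. The function name xattr_list_linux_to_bsd and its arguments are made up
for illustration; this is not the code from either set of patches.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patches discussed in this thread).
 *
 * Convert a Linux/FUSE-style attribute list (NUL-terminated names, each
 * carrying a namespace prefix such as "user.") into the FreeBSD list
 * format (one length byte followed by the name, no terminator and no
 * prefix).  Names that do not start with the requested prefix are
 * skipped.  Returns the number of bytes written to out, or -1 if out is
 * too small or a stripped name is longer than 255 bytes.
 */
static long
xattr_list_linux_to_bsd(const char *in, size_t in_len, const char *prefix,
    unsigned char *out, size_t out_len)
{
	size_t plen = strlen(prefix);
	size_t pos = 0, used = 0;

	while (pos < in_len) {
		const char *name = in + pos;
		/* Bound the scan so a missing final NUL cannot overrun. */
		const char *end = memchr(name, '\0', in_len - pos);
		size_t nlen = (end != NULL) ? (size_t)(end - name) :
		    in_len - pos;

		pos += nlen + 1;

		/* Skip empty names and names from another namespace. */
		if (nlen <= plen || strncmp(name, prefix, plen) != 0)
			continue;

		size_t blen = nlen - plen;
		if (blen > 255 || used + 1 + blen > out_len)
			return (-1);
		out[used++] = (unsigned char)blen;
		memcpy(out + used, name + plen, blen);
		used += blen;
	}
	return ((long)used);
}
```

Under these assumptions, a "user." pass over a list holding user.testattr1
and user.testattr2 would produce the length byte 9 followed by "testattr1",
then 9 followed by "testattr2". A name with no namespace prefix at all, such
as glusterfs.gfid, is simply skipped by this sketch, which is the namespace
problem mentioned above.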
> > > For example:
> > >
> > > # touch foo
> > > # ls -la foo
> > > -rwxrwxrwx 1 root wheel 0 Feb 29 21:40 foo
> > > # lsextattr user foo
> > > foo
> > > # setextattr user testattr1 "12345678" foo
> > > # lsextattr user foo
> > > foo testattr1
> > > # getextattr user testattr1 foo
> > > foo 12345678
> > > # setextattr user testattr2 "87654321" foo
> > > # lsextattr user foo
> > > foo testattr2 testattr1
> > > # rmextattr user testattr1 foo
> > > # lsextattr user foo
> > > foo testattr2
> > > # getextattr user testattr1 foo
> > > getextattr: foo: failed: Attribute not found
> > > # getextattr user testattr2 foo
> > > foo 87654321
> > >
> > >
> > > Just to be clear on what this does, it only provides extended attribute
> > > support to FreeBSD applications if the underlying FUSE filesystem
> > > implements
> > > FUSE extended attribute support. Many FUSE filesystems don't support
> > > the
> > > extended attribute VFS operations.
> > >
> > > I have tested this out on IBM's LTFS implementation, but I have not yet
> > > found
> > > another FUSE filesystem that supports extended attributes. If anyone
> > > knows
> > > of one, please let me know so I can try it out. (I looked through a
> > > number
> > > of the filesystems in sysutils/fusefs* in the ports tree.)
> > >
> > The FreeBSD GlusterFS port includes a fuse interface that supports extended
> > attributes. That is how I tested what I have. (I think the port is now in
> > svn, but if not you can find the GlusterFS port at PR#194409.)
> > The glusterfs.org doc doesn't mention that you can create/test a volume of
> > only one brick, but that works for trivial testing. If you decide to use it
> > for testing and have trouble getting it going, just email. I know diddly
> > about
> > GlusterFS, but I have fired it up a few times.
>
> Ahh, that would be helpful. I'll give it a try.
>
> > > Any feedback is welcome. I'm planning to check this into FreeBSD/head
> > > in the
> > > next week or so.
> > >
> > I'll try to get around to taking a look to see if there is anything
> > different
> > than what I did (other than the above "list" case I didn't get right;-).
>
> Yes, it would be good to get another set of eyes. Perhaps the namespace
> handling (user versus system) is different as well.
>
Yep, as above, the "namespace" is an issue.

Have fun with it, rick

> Ken
> --
> Kenneth Merry
> ken@FreeBSD.ORG
>

From owner-freebsd-fs@freebsd.org Wed Mar 2 01:36:35 2016
From: Pedro Giffuni
Subject: Re: FUSE extended attribute patches available
To: Ken Merry, fs@freebsd.org, Rick Macklem
Message-ID: <56D6424D.7090104@FreeBSD.org>
Date: Tue, 1 Mar 2016 20:30:53 -0500

Hello guys

Ken Merry wrote:
> I have patches for FreeBSD's FUSE filesystem kernel module to
> support extended attributes:
>
> https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
>
> The patch implements the get/set/delete/list extended attribute
> methods.
> The listing code also converts extended attribute lists from the
> Linux/FUSE format to the FreeBSD format.

Very interesting, I was unaware we could convert attributes with linux.
That would be useful for ext2fs.

Please update the fuse/fuse-kernel.h to 7.9 [1]. That version added the
attributes stuff and not having the same headers as the linux versions
causes trouble.

FWIW, it was an event that I try to forget, but I tried to bump the
level to 7.10 without implementing the API and it broke a lot of stuff.

Pedro.

[1] The fuse-kernel.h header is under a BSD license
https://github.com/libfuse/libfuse/commit/b20d88bbbc6e5ae67f0c99595859fd653949a3aa

From owner-freebsd-fs@freebsd.org Wed Mar 2 02:51:03 2016
Date: Tue, 1 Mar 2016 20:50:37 -0600 (CST)
From: Bob Friesenhahn
To: Miroslav Lachman <000.fbsd@quip.cz>
Cc: Steven Hartland, freebsd-fs@freebsd.org
Subject: Re: abnormally high CPU load after zfs destroy
In-Reply-To: <56D5CEC1.2070000@quip.cz>
References: <56D4964D.3010604@quip.cz> <56D4B66D.4070007@multiplay.co.uk> <56D5CEC1.2070000@quip.cz>

On Tue, 1 Mar 2016, Miroslav Lachman wrote:

> Steven Hartland wrote on 02/29/2016 22:21:
>> It's likely churning through the actual delete, show system processes in
>> top and you'll see it.
>
> Yes, there are about 300 kernel threads doing ZFS work, but should it really
> bomb the system that way? The system is heavily lagging for about 10
> minutes. Is there any sysctl to control this behavior?

Is it possible that you enabled the dedup feature?

Bob
--
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/

From owner-freebsd-fs@freebsd.org Wed Mar 2 09:12:34 2016
Date: Wed, 2 Mar 2016 01:12:31 -0800
Subject: Process stuck in "vnread"
From: Maxim Sobolev
To: stable@freebsd.org, freebsd-fs@freebsd.org
Cc: Kirk McKusick, kib@freebsd.org

Hi, I've encountered cp(1) process stuck in the vnread state on one of my
build machines that got recently upgraded to 10.3.

0 79596     1 0 20 0 17092  1396 wait   I  1 0:00.00 /bin/sh /usr/local/bin/autoreconf -f -i
0 79602 79596 0 52 0 41488  9036 wait   I  1 0:00.07 /usr/local/bin/perl -w /usr/local/bin/autoreconf-2.69 -f -i
0 79639 79602 0 72 0     0     0 -      Z  1 0:00.27
0 79762 79602 0 20 0 17092  1396 wait   I  1 0:00.00 /bin/sh /usr/local/bin/automake --add-missing --copy --force-missing
0 79768 79762 0 52 0 49736 13936 wait   I  1 0:00.11 /usr/local/bin/perl -w /usr/local/bin/automake-1.15 --add-missing --copy --force-missing
0 79962 79768 0 20 0 12368  1024 vnread DL 1 0:00.00 cp /usr/local/share/automake-1.15/compile ./compile

I am not sure if it's related to that OS version upgrade, but I have not
seen any such issues on the same machine in 2-3 years running essentially
the same build process with version 9.x, 10.0, 10.1 and 10.2.

$ uname -a
FreeBSD van01.sippysoft.com 10.3-PRERELEASE FreeBSD 10.3-PRERELEASE #1 80de3e2(master)-dirty: Tue Feb 2 12:19:57 PST 2016 sobomax@abc.sippysoft.com:/usr/obj/usr/home/sobomax/projects/freebsd103/sys/ABC amd64

The kernel stack trace is:

(kgdb) thread 360
[Switching to thread 360 (Thread 100515)]#0 0xffffffff8095244e in sched_switch ()
(kgdb) bt
#0 0xffffffff8095244e in sched_switch ()
#1 0xffffffff809313b1 in mi_switch ()
#2 0xffffffff8097089a in sleepq_wait ()
#3 0xffffffff80930dd7 in _sleep ()
#4 0xffffffff809b230e in bwait ()
#5 0xffffffff80b511f3 in vnode_pager_generic_getpages ()
#6 0xffffffff80dd1607 in VOP_GETPAGES_APV ()
#7 0xffffffff80b4f59a in vnode_pager_getpages ()
#8 0xffffffff80b30031 in vm_fault_hold ()
#9 0xffffffff80b2f797 in vm_fault ()
#10 0xffffffff80cb5a75 in trap_pfault ()
#11 0xffffffff80cb51dd in trap ()
#12 0xffffffff80c9b122 in calltrap ()
#13 0xffffffff80cb36f1 in copyin ()
#14 0xffffffff80977ddf in uiomove_faultflag ()

The FS stack configuration is somewhat unique, so I am not sure if I am
hitting some rare race condition or lock ordering issues specific to that.
It's basically ZFS (ZRAID) on top of pair of SATA SSDs with big file on
that FS attached via md(4) and UFS2 on that md(4). The build itself runs in
chroot with that UFS2 fs as its primary root.

Just maybe additional bit of info, attempting to list the directory with
that UFS image also got my bash process stuck in "zfs" state, backtrace
from that is:

(kgdb) thread 353
[Switching to thread 353 (Thread 100508)]#0 0xffffffff8095244e in sched_switch ()
(kgdb) bt
#0 0xffffffff8095244e in sched_switch ()
#1 0xffffffff809313b1 in mi_switch ()
#2 0xffffffff8097089a in sleepq_wait ()
#3 0xffffffff809069ad in sleeplk ()
#4 0xffffffff809060e0 in __lockmgr_args ()
#5 0xffffffff809b8b7c in vop_stdlock ()
#6 0xffffffff80dd0a3b in VOP_LOCK1_APV ()
#7 0xffffffff809d6d23 in _vn_lock ()
#8 0xffffffff81a8c9cd in ?? ()
#9 0x0000000000000000 in ?? ()

From owner-freebsd-fs@freebsd.org Wed Mar 2 09:32:22 2016
To: stable@freebsd.org, freebsd-fs@freebsd.org, "Maxim Sobolev"
Cc: "Kirk McKusick", kib@freebsd.org
Subject: Re: Process stuck in "vnread"
Date: Wed, 02 Mar 2016 10:32:05 +0100
From: "Ronald Klop"

Hello,

Would it be possible this has to do with the resolved 'system hangs when
using ZFS caused by VFS' in 10.3-BETA3?
https://lists.freebsd.org/pipermail/freebsd-stable/2016-February/084238.html

Regards,
Ronald.

On Wed, 02 Mar 2016 10:12:31 +0100, Maxim Sobolev wrote:

> Hi, I've encountered cp(1) process stuck in the vnread state on one of my
> build machines that got recently upgraded to 10.3.
>
> 0 79596 1 0 20 0 17092 1396 wait I 1
> 0:00.00
> /bin/sh /usr/local/bin/autoreconf -f -i
> 0 79602 79596 0 52 0 41488 9036 wait I 1
> 0:00.07
> /usr/local/bin/perl -w /usr/local/bin/autoreconf-2.69 -f -i
> 0 79639 79602 0 72 0 0 0 - Z 1
> 0:00.27
>
> 0 79762 79602 0 20 0 17092 1396 wait I 1
> 0:00.00
> /bin/sh /usr/local/bin/automake --add-missing --copy --force-missing
> 0 79768 79762 0 52 0 49736 13936 wait I 1
> 0:00.11
> /usr/local/bin/perl -w /usr/local/bin/automake-1.15 --add-missing --copy
> --force-missing
> 0 79962 79768 0 20 0 12368 1024 vnread DL 1
> 0:00.00
> cp /usr/local/share/automake-1.15/compile ./compile
>
> I am not sure if it's related to that OS version upgrade, but I have not
> seen any such issues on the same machine in 2-3 years running essentially
> the same build process with version 9.x, 10.0, 10.1 and 10.2.
>
> $ uname -a
> FreeBSD van01.sippysoft.com 10.3-PRERELEASE FreeBSD 10.3-PRERELEASE #1
> 80de3e2(master)-dirty: Tue Feb 2 12:19:57 PST 2016
> sobomax@abc.sippysoft.com:/usr/obj/usr/home/sobomax/projects/freebsd103/sys/ABC
> amd64
>
> The kernel stack trace is:
>
> (kgdb) thread 360
> [Switching to thread 360 (Thread 100515)]#0 0xffffffff8095244e in
> sched_switch ()
> (kgdb) bt
> #0 0xffffffff8095244e in sched_switch ()
> #1 0xffffffff809313b1 in mi_switch ()
> #2 0xffffffff8097089a in sleepq_wait ()
> #3 0xffffffff80930dd7 in _sleep ()
> #4 0xffffffff809b230e in bwait ()
> #5 0xffffffff80b511f3 in vnode_pager_generic_getpages ()
> #6 0xffffffff80dd1607 in VOP_GETPAGES_APV ()
> #7 0xffffffff80b4f59a in vnode_pager_getpages ()
> #8 0xffffffff80b30031 in vm_fault_hold ()
> #9 0xffffffff80b2f797 in vm_fault ()
> #10 0xffffffff80cb5a75 in trap_pfault ()
> #11 0xffffffff80cb51dd in trap ()
> #12 0xffffffff80c9b122 in calltrap ()
> #13 0xffffffff80cb36f1 in copyin ()
> #14 0xffffffff80977ddf in uiomove_faultflag ()
>
> The FS stack configuration is somewhat unique, so I am not sure if I am
> hitting some rare race condition or lock ordering issues specific to
> that.
> It's basically ZFS (ZRAID) on top of pair of SATA SSDs with big file on
> that FS attached via md(4) and UFS2 on that md(4). The build itself runs
> in
> chroot with that UFS2 fs as its primary root.
>
> Just maybe additional bit of info, attempting to list the directory with
> that UFS image also got my bash process stuck in "zfs" state, backtrace
> from that is:
>
> (kgdb) thread 353
> [Switching to thread 353 (Thread 100508)]#0 0xffffffff8095244e in
> sched_switch ()
> (kgdb) bt
> #0 0xffffffff8095244e in sched_switch ()
> #1 0xffffffff809313b1 in mi_switch ()
> #2 0xffffffff8097089a in sleepq_wait ()
> #3 0xffffffff809069ad in sleeplk ()
> #4 0xffffffff809060e0 in __lockmgr_args ()
> #5 0xffffffff809b8b7c in vop_stdlock ()
> #6 0xffffffff80dd0a3b in VOP_LOCK1_APV ()
> #7 0xffffffff809d6d23 in _vn_lock ()
> #8 0xffffffff81a8c9cd in ?? ()
> #9 0x0000000000000000 in ?? ()
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@freebsd.org Wed Mar 2 09:43:49 2016
Message-ID: <56D6B5C9.7090409@quip.cz>
Date: Wed, 02 Mar 2016 10:43:37 +0100
From: Miroslav Lachman <000.fbsd@quip.cz>
To: Bob Friesenhahn
Cc: Steven Hartland, freebsd-fs@freebsd.org
Subject: Re: abnormally high CPU load after zfs destroy
References: <56D4964D.3010604@quip.cz> <56D4B66D.4070007@multiplay.co.uk> <56D5CEC1.2070000@quip.cz>

Bob Friesenhahn wrote on 03/02/2016 03:50:
> On Tue, 1 Mar 2016, Miroslav Lachman wrote:
>
>> Steven Hartland wrote on 02/29/2016 22:21:
>>> It's likely churning through the actual delete, show system processes in
>>> top and you'll see it.
>>
>> Yes, there are about 300 kernel threads doing ZFS work, but should it
>> really bomb the system that way? The system is heavily lagging for
>> about 10 minutes. Is there any sysctl to control this behavior?
>
> Is it possible that you enabled the dedup feature?

No, dedup is disabled and I never tried it on this machine.

Miroslav Lachman

From owner-freebsd-fs@freebsd.org Wed Mar 2 09:53:46 2016
Date: Wed, 2 Mar 2016 11:53:39 +0200
From: Konstantin Belousov
To: Maxim Sobolev
Cc: stable@freebsd.org, freebsd-fs@freebsd.org, Kirk McKusick
Subject: Re: Process stuck in "vnread"
Message-ID: <20160302095339.GB67250@kib.kiev.ua>

On Wed, Mar 02, 2016 at 01:12:31AM -0800, Maxim Sobolev wrote:
> Hi, I've encountered cp(1) process stuck in the vnread state on one of my
> build machines that got recently upgraded to 10.3.
>
> 0 79596 1 0 20 0 17092 1396 wait I 1 0:00.00
> /bin/sh /usr/local/bin/autoreconf -f -i
> 0 79602 79596 0 52 0 41488 9036 wait I 1 0:00.07
> /usr/local/bin/perl -w /usr/local/bin/autoreconf-2.69 -f -i
> 0 79639 79602 0 72 0 0 0 - Z 1 0:00.27
>
> 0 79762 79602 0 20 0 17092 1396 wait I 1 0:00.00
> /bin/sh /usr/local/bin/automake --add-missing --copy --force-missing
> 0 79768 79762 0 52 0 49736 13936 wait I 1 0:00.11
> /usr/local/bin/perl -w /usr/local/bin/automake-1.15 --add-missing --copy
> --force-missing
> 0 79962 79768 0 20 0 12368 1024 vnread DL 1 0:00.00
> cp /usr/local/share/automake-1.15/compile ./compile
>
> I am not sure if it's related to that OS version upgrade, but I have not
> seen any such issues on the same machine in 2-3 years running essentially
> the same build process with version 9.x, 10.0, 10.1 and 10.2.
>
> $ uname -a
> FreeBSD van01.sippysoft.com 10.3-PRERELEASE FreeBSD 10.3-PRERELEASE #1
> 80de3e2(master)-dirty: Tue Feb 2 12:19:57 PST 2016
> sobomax@abc.sippysoft.com:/usr/obj/usr/home/sobomax/projects/freebsd103/sys/ABC
> amd64
>
> The kernel stack trace is:
>
> (kgdb) thread 360
> [Switching to thread 360 (Thread 100515)]#0 0xffffffff8095244e in
> sched_switch ()
> (kgdb) bt
> #0 0xffffffff8095244e in sched_switch ()
> #1 0xffffffff809313b1 in mi_switch ()
> #2 0xffffffff8097089a in sleepq_wait ()
> #3 0xffffffff80930dd7 in _sleep ()
> #4 0xffffffff809b230e in bwait ()
> #5 0xffffffff80b511f3 in vnode_pager_generic_getpages ()
> #6 0xffffffff80dd1607 in VOP_GETPAGES_APV ()
> #7 0xffffffff80b4f59a in vnode_pager_getpages ()
> #8 0xffffffff80b30031 in vm_fault_hold ()
> #9 0xffffffff80b2f797 in vm_fault ()
> #10 0xffffffff80cb5a75 in trap_pfault ()
> #11 0xffffffff80cb51dd in trap ()
> #12 0xffffffff80c9b122 in calltrap ()
> #13 0xffffffff80cb36f1 in copyin ()
> #14 0xffffffff80977ddf in uiomove_faultflag ()
The backtrace indicates, with 99% certainty that the issue is in the
requested read never finishing. But the backtrace is obviously not
complete, and there might be something more happening. At least,
we do not handle page-ins during uiomove() on user io for quite
some time.

If the vnode which io hung is UFS over md, you should look at the md
worker thread state.

>
> The FS stack configuration is somewhat unique, so I am not sure if I am
> hitting some rare race condition or lock ordering issues specific to that.
> It's basically ZFS (ZRAID) on top of pair of SATA SSDs with big file on
> that FS attached via md(4) and UFS2 on that md(4). The build itself runs in
> chroot with that UFS2 fs as its primary root.
>
> Just maybe additional bit of info, attempting to list the directory with
> that UFS image also got my bash process stuck in "zfs" state, backtrace
> from that is:
A deadlock in the underlying io layer is consistent with this (secondary)
observation.
>
> (kgdb) thread 353
> [Switching to thread 353 (Thread 100508)]#0 0xffffffff8095244e in
> sched_switch ()
> (kgdb) bt
> #0 0xffffffff8095244e in sched_switch ()
> #1 0xffffffff809313b1 in mi_switch ()
> #2 0xffffffff8097089a in sleepq_wait ()
> #3 0xffffffff809069ad in sleeplk ()
> #4 0xffffffff809060e0 in __lockmgr_args ()
> #5 0xffffffff809b8b7c in vop_stdlock ()
> #6 0xffffffff80dd0a3b in VOP_LOCK1_APV ()
> #7 0xffffffff809d6d23 in _vn_lock ()
> #8 0xffffffff81a8c9cd in ?? ()
> #9 0x0000000000000000 in ?? ()

From owner-freebsd-fs@freebsd.org Wed Mar 2 11:02:05 2016
In-Reply-To: <20160302095339.GB67250@kib.kiev.ua>
References: <20160302095339.GB67250@kib.kiev.ua>
Date: Wed, 2 Mar 2016 03:02:02 -0800
Subject: Re: Process stuck in "vnread"
From: Maxim Sobolev
To: Konstantin Belousov
Cc: stable@freebsd.org, freebsd-fs@freebsd.org, Kirk McKusick

Thanks, Konstantin. Re: md(4) state:

0 88688 0 0 -8 0 0 16 tx->tx_s DL - 0:45.43 [md0]

Its backtrace:

About the backtrace, indeed, looks like you are right and some portion of
it is not decoded properly, as it's loaded as a kernel module. The setup is
somewhat even more complicated, the /usr/ports is mounted via NULLFS, so in
this command:

cp /usr/local/share/automake-1.15/compile ./compile

The target (i.e. ./compile) here is a path on ZFS that is exported via
NULLFS, while the source is a file on UFS2->md->ZFS.
This is probably the reason stack trace is incomplete, both zfs.ko and nullfs.ko are loaded as modules and the next few frames point towards those. Unfortunately I cannot beat kgdb to read symbols from those .ko's and decode them. #13 0xffffffff80cb36f1 in copyin () #14 0xffffffff80977ddf in uiomove_faultflag () #15 0xffffffff819f699c in ?? () #16 0xfffffe0468a861a0 in ?? () #17 0xfffff80000000000 in ?? () #18 0xfffffe0468a861a0 in ?? () #19 0xfffff80176b39420 in ?? () #20 0x0000000000000001 in ?? () $ kldstat | grep 0xffffffff819 2 1 0xffffffff819bd000 aef8 nullfs.ko 3 1 0xffffffff819c8000 2fd2f0 zfs.ko On Wed, Mar 2, 2016 at 1:53 AM, Konstantin Belousov wrote: > On Wed, Mar 02, 2016 at 01:12:31AM -0800, Maxim Sobolev wrote: > > Hi, I've encountered cp(1) process stuck in the vnread state on one of my > > build machines that got recently upgraded to 10.3. > > > > 0 79596 1 0 20 0 17092 1396 wait I 1 > 0:00.00 > > /bin/sh /usr/local/bin/autoreconf -f -i > > 0 79602 79596 0 52 0 41488 9036 wait I 1 > 0:00.07 > > /usr/local/bin/perl -w /usr/local/bin/autoreconf-2.69 -f -i > > 0 79639 79602 0 72 0 0 0 - Z 1 > 0:00.27 > > > > 0 79762 79602 0 20 0 17092 1396 wait I 1 > 0:00.00 > > /bin/sh /usr/local/bin/automake --add-missing --copy --force-missing > > 0 79768 79762 0 52 0 49736 13936 wait I 1 > 0:00.11 > > /usr/local/bin/perl -w /usr/local/bin/automake-1.15 --add-missing --copy > > --force-missing > > 0 79962 79768 0 20 0 12368 1024 vnread DL 1 > 0:00.00 > > cp /usr/local/share/automake-1.15/compile ./compile > > > > I am not sure if it's related to that OS version upgrade, but I have not > > seen any such issues on the same machine in 2-3 years running essentially > > the same build process with version 9.x, 10.0, 10.1 and 10.2. 
> > > > $ uname -a > > FreeBSD van01.sippysoft.com 10.3-PRERELEASE FreeBSD 10.3-PRERELEASE #1 > > 80de3e2(master)-dirty: Tue Feb 2 12:19:57 PST 2016 > > sobomax@abc.sippysoft.com: > /usr/obj/usr/home/sobomax/projects/freebsd103/sys/ABC > > amd64 > > > > The kernel stack trace is: > > > > (kgdb) thread 360 > > [Switching to thread 360 (Thread 100515)]#0 0xffffffff8095244e in > > sched_switch () > > (kgdb) bt > > #0 0xffffffff8095244e in sched_switch () > > #1 0xffffffff809313b1 in mi_switch () > > #2 0xffffffff8097089a in sleepq_wait () > > #3 0xffffffff80930dd7 in _sleep () > > #4 0xffffffff809b230e in bwait () > > #5 0xffffffff80b511f3 in vnode_pager_generic_getpages () > > #6 0xffffffff80dd1607 in VOP_GETPAGES_APV () > > #7 0xffffffff80b4f59a in vnode_pager_getpages () > > #8 0xffffffff80b30031 in vm_fault_hold () > > #9 0xffffffff80b2f797 in vm_fault () > > #10 0xffffffff80cb5a75 in trap_pfault () > > #11 0xffffffff80cb51dd in trap () > > #12 0xffffffff80c9b122 in calltrap () > > #13 0xffffffff80cb36f1 in copyin () > > #14 0xffffffff80977ddf in uiomove_faultflag () > The backtrace indicates, with 99% certainity that the issue is in the > requested read never finishing. But the backtrace is obviously not > complete, and there might be something more happening. At least, > we do not handle page-ins during uiomove() on user io for quite > some time. > > If the vnode which io hung is UFS over md, you should look at the md > worker thread state. > > > > > The FS stack configuration is somewhat unique, so I am not sure if I am > > hitting some rare race condition or lock ordering issues specific to > that. > > It's basically ZFS (ZRAID) on top of pair or SATA SSDs with big file on > > that FS attached via md(4) and UFS2 on that md(4). The build itself runs > in > > chroot with that UFS2 fs as its primary root. 
> > > > Just maybe additional bit of info, attempting to list the directory with > > that UFS image also got my bash process stuck in "zfs" state, backtrace > > from that is: > A deadlock in the underlying io layer is consistent with this (secondary) > observation. > > > > > (kgdb) thread 353 > > [Switching to thread 353 (Thread 100508)]#0 0xffffffff8095244e in > > sched_switch () > > (kgdb) bt > > #0 0xffffffff8095244e in sched_switch () > > #1 0xffffffff809313b1 in mi_switch () > > #2 0xffffffff8097089a in sleepq_wait () > > #3 0xffffffff809069ad in sleeplk () > > #4 0xffffffff809060e0 in __lockmgr_args () > > #5 0xffffffff809b8b7c in vop_stdlock () > > #6 0xffffffff80dd0a3b in VOP_LOCK1_APV () > > #7 0xffffffff809d6d23 in _vn_lock () > > #8 0xffffffff81a8c9cd in ?? () > > #9 0x0000000000000000 in ?? () > > -- Maksym Sobolyev Sippy Software, Inc. Internet Telephony (VoIP) Experts Tel (Canada): +1-778-783-0474 Tel (Toll-Free): +1-855-747-7779 Fax: +1-866-857-6942 Web: http://www.sippysoft.com MSN: sales@sippysoft.com Skype: SippySoft From owner-freebsd-fs@freebsd.org Wed Mar 2 11:04:30 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 43306AC1DCC; Wed, 2 Mar 2016 11:04:30 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D1B571EAB; Wed, 2 Mar 2016 11:04:29 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u22B4Pt8066627 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 2 Mar 2016 13:04:25 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua 
u22B4Pt8066627
Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u22B4Poj066626; Wed, 2 Mar 2016 13:04:25 +0200 (EET) (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f
Date: Wed, 2 Mar 2016 13:04:25 +0200
From: Konstantin Belousov
To: Maxim Sobolev
Cc: stable@freebsd.org, freebsd-fs@freebsd.org, Kirk McKusick
Subject: Re: Process stuck in "vnread"
Message-ID: <20160302110425.GE67250@kib.kiev.ua>
References: <20160302095339.GB67250@kib.kiev.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 02 Mar 2016 11:04:30 -0000

On Wed, Mar 02, 2016 at 03:02:02AM -0800, Maxim Sobolev wrote:
> Thanks, Konstantin.
>
> Re: md(4) state:
>
> 0 88688 0 0 -8 0 0 16 tx->tx_s DL - 0:45.43
> [md0]
>
> Its backtrace:

So md is stuck in ZFS.
From owner-freebsd-fs@freebsd.org Wed Mar 2 11:04:42 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7F53DAC1E3C for ; Wed, 2 Mar 2016 11:04:42 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: from mail-wm0-x22b.google.com (mail-wm0-x22b.google.com [IPv6:2a00:1450:400c:c09::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0B21A1F70 for ; Wed, 2 Mar 2016 11:04:42 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: by mail-wm0-x22b.google.com with SMTP id l68so72402037wml.1 for ; Wed, 02 Mar 2016 03:04:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sippysoft-com.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc; bh=mRl4W3t7jcWA3apZrRKwxTafm78sJOdSTGznSzZzs+g=; b=FJT9ggbyM9Vrkhuo8MbFx0rk3JHL+61m7P06T67JXMWMMLcVZw9L0Lpd+RCBGFLTob RWnOIgpXTTOuwYTmlW2XAfh4dM3YryPFc2jS3Es7FproVGbmXOC1232LclCQ8PSlHq2M hmRHs7j9ssBiDR0AyNGWIWwWQuYF2DzkS3I6r0M0ZyawB+fC8YklTFCEbvPMIQSu22aC w3y2CMXKvVK5Q3MBfngX3TEnPV1SUG764Ofd/nLZd8bawN04NteL0tiOwv/KUBzcwUfS 3Si/LtqU1jw3atMfmoKjlXt2YB9UnE2CqbMhsaXqmA1RgtCEy0Puu9QEzR+RnLaW/FLK 00Vw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc; bh=mRl4W3t7jcWA3apZrRKwxTafm78sJOdSTGznSzZzs+g=; b=OY01PKpEF50W/KCeoP4QnSb5BgLBnrQ+cgEcbE7RTWW+PkNZnAgNew0itZFetWOpSc cyVjQiqLPjlITDJ3lxFvH+XzELMRA4OPYl7mrvV9bY09pNIhCifTqnPgsY3+flHHYTuB MXSiAFU1lZS/ZB5jTzYx0vYjOUvVEuCisoiBpOGr7kZHVrSE9mFfQbdPZDMwee+LEzbE SCEY8/nHT/0JZODc0HvFQnsHi+q+sUPqxq1iSHblNvBSINxNLNFop66TItpsq+ZB+VC1 
LdT6fLi3kFX+ckPnoz1PNV0oZpPfytJzuRNRK2x4Qko3TF2uDxaEK78AO3U5KLFTJra8 6fKA== X-Gm-Message-State: AD7BkJJGB5JQSIBiiRHVBpZ+L95iQ5wBBH73NXBqZMH6O7TQ0QXaczgHjM4wbSF9M3KXsieqRh8LB0rA0d1QkL5Q MIME-Version: 1.0 X-Received: by 10.28.92.195 with SMTP id q186mr3773417wmb.37.1456916680546; Wed, 02 Mar 2016 03:04:40 -0800 (PST) Sender: sobomax@sippysoft.com Received: by 10.27.218.12 with HTTP; Wed, 2 Mar 2016 03:04:40 -0800 (PST) In-Reply-To: References: <20160302095339.GB67250@kib.kiev.ua> Date: Wed, 2 Mar 2016 03:04:40 -0800 X-Google-Sender-Auth: ak_zCT0PyHsSvzbEEfCRL7TGNUg Message-ID: Subject: Re: Process stuck in "vnread" From: Maxim Sobolev To: Konstantin Belousov Cc: stable@freebsd.org, freebsd-fs@freebsd.org, Kirk McKusick Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 Mar 2016 11:04:42 -0000 Sorry gmail hit set too early. Backtrace from the md worker: [Switching to thread 357 (Thread 101131)]#0 0xffffffff8095244e in sched_switch () (kgdb) bt #0 0xffffffff8095244e in sched_switch () #1 0xffffffff809313b1 in mi_switch () #2 0xffffffff8097089a in sleepq_wait () #3 0xffffffff808d344d in _cv_wait () #4 0xffffffff81a42185 in ?? () #5 0xfffff803096d3960 in ?? () #6 0x0000000000000000 in ?? () On Wed, Mar 2, 2016 at 3:02 AM, Maxim Sobolev wrote: > Thanks, Konstantin. > > Re: md(4) state: > > 0 88688 0 0 -8 0 0 16 tx->tx_s DL - 0:45.43 > [md0] > > Its backtrace: > > > About the backtrace, indeed, looks like you are right and some portion of > it is not decoded properly, as it's loaded as a kernel module. The setup is > somewhat even more complicated, the /usr/ports is mounted via NULLFS, so in > this command: > > cp /usr/local/share/automake-1.15/compile ./compile > > The target (i.e. 
./compile) here is a path on ZFS that is exported via > NULLFS, while the source is a file on UFS2->md->ZFS. This is probably the > reason stack trace is incomplete, both zfs.ko and nullfs.ko are loaded as > modules and the next few frames point towards those. Unfortunately I cannot > beat kgdb to read symbols from those .ko's and decode them. > > #13 0xffffffff80cb36f1 in copyin () > #14 0xffffffff80977ddf in uiomove_faultflag () > #15 0xffffffff819f699c in ?? () > #16 0xfffffe0468a861a0 in ?? () > #17 0xfffff80000000000 in ?? () > #18 0xfffffe0468a861a0 in ?? () > #19 0xfffff80176b39420 in ?? () > #20 0x0000000000000001 in ?? () > > $ kldstat | grep 0xffffffff819 > 2 1 0xffffffff819bd000 aef8 nullfs.ko > 3 1 0xffffffff819c8000 2fd2f0 zfs.ko > > > > > On Wed, Mar 2, 2016 at 1:53 AM, Konstantin Belousov > wrote: > >> On Wed, Mar 02, 2016 at 01:12:31AM -0800, Maxim Sobolev wrote: >> > Hi, I've encountered cp(1) process stuck in the vnread state on one of >> my >> > build machines that got recently upgraded to 10.3. >> > >> > 0 79596 1 0 20 0 17092 1396 wait I 1 >> 0:00.00 >> > /bin/sh /usr/local/bin/autoreconf -f -i >> > 0 79602 79596 0 52 0 41488 9036 wait I 1 >> 0:00.07 >> > /usr/local/bin/perl -w /usr/local/bin/autoreconf-2.69 -f -i >> > 0 79639 79602 0 72 0 0 0 - Z 1 >> 0:00.27 >> > >> > 0 79762 79602 0 20 0 17092 1396 wait I 1 >> 0:00.00 >> > /bin/sh /usr/local/bin/automake --add-missing --copy --force-missing >> > 0 79768 79762 0 52 0 49736 13936 wait I 1 >> 0:00.11 >> > /usr/local/bin/perl -w /usr/local/bin/automake-1.15 --add-missing --copy >> > --force-missing >> > 0 79962 79768 0 20 0 12368 1024 vnread DL 1 >> 0:00.00 >> > cp /usr/local/share/automake-1.15/compile ./compile >> > >> > I am not sure if it's related to that OS version upgrade, but I have not >> > seen any such issues on the same machine in 2-3 years running >> essentially >> > the same build process with version 9.x, 10.0, 10.1 and 10.2. 
>> > >> > $ uname -a >> > FreeBSD van01.sippysoft.com 10.3-PRERELEASE FreeBSD 10.3-PRERELEASE #1 >> > 80de3e2(master)-dirty: Tue Feb 2 12:19:57 PST 2016 >> > sobomax@abc.sippysoft.com: >> /usr/obj/usr/home/sobomax/projects/freebsd103/sys/ABC >> > amd64 >> > >> > The kernel stack trace is: >> > >> > (kgdb) thread 360 >> > [Switching to thread 360 (Thread 100515)]#0 0xffffffff8095244e in >> > sched_switch () >> > (kgdb) bt >> > #0 0xffffffff8095244e in sched_switch () >> > #1 0xffffffff809313b1 in mi_switch () >> > #2 0xffffffff8097089a in sleepq_wait () >> > #3 0xffffffff80930dd7 in _sleep () >> > #4 0xffffffff809b230e in bwait () >> > #5 0xffffffff80b511f3 in vnode_pager_generic_getpages () >> > #6 0xffffffff80dd1607 in VOP_GETPAGES_APV () >> > #7 0xffffffff80b4f59a in vnode_pager_getpages () >> > #8 0xffffffff80b30031 in vm_fault_hold () >> > #9 0xffffffff80b2f797 in vm_fault () >> > #10 0xffffffff80cb5a75 in trap_pfault () >> > #11 0xffffffff80cb51dd in trap () >> > #12 0xffffffff80c9b122 in calltrap () >> > #13 0xffffffff80cb36f1 in copyin () >> > #14 0xffffffff80977ddf in uiomove_faultflag () >> The backtrace indicates, with 99% certainity that the issue is in the >> requested read never finishing. But the backtrace is obviously not >> complete, and there might be something more happening. At least, >> we do not handle page-ins during uiomove() on user io for quite >> some time. >> >> If the vnode which io hung is UFS over md, you should look at the md >> worker thread state. >> >> > >> > The FS stack configuration is somewhat unique, so I am not sure if I am >> > hitting some rare race condition or lock ordering issues specific to >> that. >> > It's basically ZFS (ZRAID) on top of pair or SATA SSDs with big file on >> > that FS attached via md(4) and UFS2 on that md(4). The build itself >> runs in >> > chroot with that UFS2 fs as its primary root. 
>> > >> > Just maybe additional bit of info, attempting to list the directory with >> > that UFS image also got my bash process stuck in "zfs" state, backtrace >> > from that is: >> A deadlock in the underlying io layer is consistent with this (secondary) >> observation. >> >> > >> > (kgdb) thread 353 >> > [Switching to thread 353 (Thread 100508)]#0 0xffffffff8095244e in >> > sched_switch () >> > (kgdb) bt >> > #0 0xffffffff8095244e in sched_switch () >> > #1 0xffffffff809313b1 in mi_switch () >> > #2 0xffffffff8097089a in sleepq_wait () >> > #3 0xffffffff809069ad in sleeplk () >> > #4 0xffffffff809060e0 in __lockmgr_args () >> > #5 0xffffffff809b8b7c in vop_stdlock () >> > #6 0xffffffff80dd0a3b in VOP_LOCK1_APV () >> > #7 0xffffffff809d6d23 in _vn_lock () >> > #8 0xffffffff81a8c9cd in ?? () >> > #9 0x0000000000000000 in ?? () >> >> > > > -- > Maksym Sobolyev > Sippy Software, Inc. > Internet Telephony (VoIP) Experts > Tel (Canada): +1-778-783-0474 > Tel (Toll-Free): +1-855-747-7779 > Fax: +1-866-857-6942 > Web: http://www.sippysoft.com > MSN: sales@sippysoft.com > Skype: SippySoft > From owner-freebsd-fs@freebsd.org Wed Mar 2 11:57:22 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 26BAAAC03FA for ; Wed, 2 Mar 2016 11:57:22 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 0FA9F1B84 for ; Wed, 2 Mar 2016 11:57:22 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: by mailman.ysv.freebsd.org (Postfix) id 0D613AC03F9; Wed, 2 Mar 2016 11:57:22 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0CFB9AC03F7 for ; Wed, 2 Mar 2016 11:57:22 +0000 
(UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 81D231B83 for ; Wed, 2 Mar 2016 11:57:21 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u22Bv8Vv082661 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 2 Mar 2016 13:57:08 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u22Bv8Vv082661 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u22Bv8Ag082660; Wed, 2 Mar 2016 13:57:08 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 2 Mar 2016 13:57:07 +0200 From: Konstantin Belousov To: Maxim Sobolev Cc: Kirk McKusick , peter@holm.cc, fs@freebsd.org Subject: Re: Process stuck in "vnread" Message-ID: <20160302115707.GF67250@kib.kiev.ua> References: <20160302095339.GB67250@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 Mar 2016 11:57:22 -0000 On Wed, Mar 02, 2016 at 03:02:02AM -0800, Maxim Sobolev wrote: > About the backtrace, indeed, looks like you are right and some portion of > it is not decoded properly, as it's loaded as a kernel module. 
The setup is
> somewhat even more complicated, the /usr/ports is mounted via NULLFS, so in
> this command:
>
> cp /usr/local/share/automake-1.15/compile ./compile
>
> The target (i.e. ./compile) here is a path on ZFS that is exported via
> NULLFS, while the source is a file on UFS2->md->ZFS. This is probably the
> reason stack trace is incomplete, both zfs.ko and nullfs.ko are loaded as
> modules and the next few frames point towards those. Unfortunately I cannot
> beat kgdb to read symbols from those .ko's and decode them.

Is nullfs mount put over ZFS only ? The backtrace you shown cannot
happen for ZFS, since ZFS has its own pager vop. In fact, I would
agree that the backtrace is reasonable for nullfs over UFS upper vnode.
The following patch should fix the 'paging while faulting on uiomove'
issue for nullfs over UFS.

Peter, could you, please, test the patch ? It is purely nullfs change,
and the most interesting situation is the ups' deadlock, but the whole
set of nullfs tests would be good to check.
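
For readers following the patch below: it changes which kernel flags the nullfs mount copies from the lower mount, by widening the mask applied to the lower mount's mnt_kern_flag. A minimal user-space sketch of that masking idea follows. The DEMO_* names and their values are hypothetical stand-ins chosen for illustration, not the real MNTK_* definitions, and inherit_flags() is not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for kernel mount flags. The names and values are
 * illustrative assumptions and are not copied from sys/mount.h.
 */
#define DEMO_USES_BCACHE	0x1u
#define DEMO_NO_IOPF		0x2u
#define DEMO_UNMAPPED_BUFS	0x4u
#define DEMO_UNRELATED		0x8u	/* a lower-mount flag that must not propagate */

/* OR into `upper` only those bits of `lower` that `mask` selects. */
uint32_t
inherit_flags(uint32_t upper, uint32_t lower, uint32_t mask)
{
	return (upper | (lower & mask));
}

/* Compare the narrow mask (one flag) with the widened mask (three flags). */
void
inherit_flags_demo(void)
{
	uint32_t lower = DEMO_USES_BCACHE | DEMO_NO_IOPF | DEMO_UNRELATED;
	uint32_t narrow = DEMO_USES_BCACHE;
	uint32_t wide = DEMO_USES_BCACHE | DEMO_NO_IOPF | DEMO_UNMAPPED_BUFS;

	/* With the narrow mask the upper mount never sees DEMO_NO_IOPF. */
	assert(inherit_flags(0, lower, narrow) == DEMO_USES_BCACHE);
	/* With the wide mask it does; unset or unrelated bits stay clear. */
	assert(inherit_flags(0, lower, wide) ==
	    (DEMO_USES_BCACHE | DEMO_NO_IOPF));
}
```

Calling inherit_flags_demo() from a test program exercises both masks. Only flags that the lower mount actually has set are passed up, so widening the mask cannot set a flag on the upper mount that the lower file system lacks.
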
diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c
index 64e1e29..49bae28 100644
--- a/sys/fs/nullfs/null_vfsops.c
+++ b/sys/fs/nullfs/null_vfsops.c
@@ -199,7 +199,7 @@ nullfs_mount(struct mount *mp)
 	}
 	mp->mnt_kern_flag |= MNTK_LOOKUP_EXCL_DOTDOT;
 	mp->mnt_kern_flag |= lowerrootvp->v_mount->mnt_kern_flag &
-	    MNTK_USES_BCACHE;
+	    (MNTK_USES_BCACHE | MNTK_NO_IOPF | MNTK_UNMAPPED_BUFS);
 	MNT_IUNLOCK(mp);
 	mp->mnt_data = xmp;
 	vfs_getnewfsid(mp);

From owner-freebsd-fs@freebsd.org Wed Mar 2 15:14:23 2016
Return-Path:
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 95141AC1A42 for ; Wed, 2 Mar 2016 15:14:23 +0000 (UTC) (envelope-from pho@holm.cc)
Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 818D4160E for ; Wed, 2 Mar 2016 15:14:23 +0000 (UTC) (envelope-from pho@holm.cc)
Received: by mailman.ysv.freebsd.org (Postfix) id 7D65AAC1A41; Wed, 2 Mar 2016 15:14:23 +0000 (UTC)
Delivered-To: fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7CFECAC1A40 for ; Wed, 2 Mar 2016 15:14:23 +0000 (UTC) (envelope-from pho@holm.cc)
Received: from relay01.pair.com (relay01.pair.com [209.68.5.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5E3BD160D for ; Wed, 2 Mar 2016 15:14:23 +0000 (UTC) (envelope-from pho@holm.cc)
Received: from x2.osted.lan (87-58-223-204-dynamic.dk.customer.tdc.net [87.58.223.204]) by relay01.pair.com (Postfix) with ESMTP id 94F14D008F6; Wed, 2 Mar 2016 10:14:15 -0500 (EST)
Received: from x2.osted.lan (localhost [127.0.0.1]) by x2.osted.lan (8.14.9/8.14.9) with ESMTP id u22FEDfZ052457 (version=TLSv1/SSLv3
cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 2 Mar 2016 16:14:13 +0100 (CET) (envelope-from pho@x2.osted.lan) Received: (from pho@localhost) by x2.osted.lan (8.14.9/8.14.9/Submit) id u22FED04052456; Wed, 2 Mar 2016 16:14:13 +0100 (CET) (envelope-from pho) Date: Wed, 2 Mar 2016 16:14:13 +0100 From: Peter Holm To: Konstantin Belousov Cc: Maxim Sobolev , Kirk McKusick , fs@freebsd.org Subject: Re: Process stuck in "vnread" Message-ID: <20160302151413.GA52083@x2.osted.lan> References: <20160302095339.GB67250@kib.kiev.ua> <20160302115707.GF67250@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160302115707.GF67250@kib.kiev.ua> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 Mar 2016 15:14:23 -0000 On Wed, Mar 02, 2016 at 01:57:07PM +0200, Konstantin Belousov wrote: > On Wed, Mar 02, 2016 at 03:02:02AM -0800, Maxim Sobolev wrote: > > About the backtrace, indeed, looks like you are right and some portion of > > it is not decoded properly, as it's loaded as a kernel module. The setup is > > somewhat even more complicated, the /usr/ports is mounted via NULLFS, so in > > this command: > > > > cp /usr/local/share/automake-1.15/compile ./compile > > > > The target (i.e. ./compile) here is a path on ZFS that is exported via > > NULLFS, while the source is a file on UFS2->md->ZFS. This is probably the > > reason stack trace is incomplete, both zfs.ko and nullfs.ko are loaded as > > modules and the next few frames point towards those. Unfortunately I cannot > > beat kgdb to read symbols from those .ko's and decode them. > > Is nullfs mount put over ZFS only ? The backtrace you shown cannot > happen for ZFS, since ZFS has its own pager vop. 
In fact, I would > agree that the backtrace is reasonable for nullfs over UFS upper vnode. > The following patch should fix the 'paging while faulting on uiomove' > issue for nullfs over UFS. > > Peter, could you, please, test the patch ? It is purely nullfs change, > and the most interesting situation is the ups' deadlock, but the whole > set of nullfs tests would be good to check. > > diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c > index 64e1e29..49bae28 100644 > --- a/sys/fs/nullfs/null_vfsops.c Sure. I'll try to reproduce the problem first. - Peter From owner-freebsd-fs@freebsd.org Wed Mar 2 17:06:39 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BE56AAC0B14 for ; Wed, 2 Mar 2016 17:06:39 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 9DD5B1AC3 for ; Wed, 2 Mar 2016 17:06:39 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: by mailman.ysv.freebsd.org (Postfix) id 9A94FAC0B13; Wed, 2 Mar 2016 17:06:39 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9A180AC0B12 for ; Wed, 2 Mar 2016 17:06:39 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: from mail-wm0-x236.google.com (mail-wm0-x236.google.com [IPv6:2a00:1450:400c:c09::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 2FA8C1AC1 for ; Wed, 2 Mar 2016 17:06:39 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: by mail-wm0-x236.google.com with SMTP id l68so86993470wml.1 for ; Wed, 02 
Mar 2016 09:06:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sippysoft-com.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc; bh=vE3/SbSn4GPw5yw19z4KMWCPS5b2RzR1FPlq8nXZwPg=; b=f77oG48xrHXYegV7dX32ycX38PixzL4XOVk/zMPdAqiLZvttu251LlVrAruJGzeigG hUd/hsUcpXBckeewqjmWB/EBqn6HfiO22+AEn4SKug91A3IhXpsxFRqie4K7iM2469G7 77tm9m6pGwV4wnZQLVAmLqSGNWRRlVqVchtk0quov4Z9XT7TTJKzjzGQPjU3YMGkg9f7 Cdv7sGKgawjfkL69sKPVL6/dH/QNd/TLbRmwWmM5giGWxjaLqVLPue19CNG1UkUV4SnK kGmFt3/16uAmCbBpeF6azANSxdQQC7GdA4BAYp5rcNadW9QHI7Bv+TKFg2/AMLvDJQ+4 E3vg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc; bh=vE3/SbSn4GPw5yw19z4KMWCPS5b2RzR1FPlq8nXZwPg=; b=P0jrw9jairZCw//diZJa7VNAZGydc/b3EI/ftOhHfxIY6ov12JnjkkZNCFGReeTtEF bYlWjGdzy3ZoL8bPg+cL6WVL82XpMpQTglf9nQv85ZD5OaV9m6CYsi9wyNqR85MWnCEX /8GqERa5m+VoqH+FuVTabSHzUeiH62HtVkNHrEfQFKB5A67WbfbkSwl75VbzPUZVTuW6 fJ2VAPxzr0mrSOpEKW5nyl/koPmiYKHgquoG9vWetPGGms+6nZDLBR9JtvGTfNbb4tEH EEnjyiGYz3Zmx9nlBbvjecE/8zvcJxZEy7JJlYGTbOUUXStG0tUxK4JTKSWZJ4wY/KdH U/oA== X-Gm-Message-State: AD7BkJLKRjVZItb1dhg+pwGxJoW663f4k0asHTWgQdLeyasgPPhgwgDpZJQImOoNeH/gLZeVJG0IMZcIJMHMfyUF MIME-Version: 1.0 X-Received: by 10.28.148.16 with SMTP id w16mr1018990wmd.90.1456938397611; Wed, 02 Mar 2016 09:06:37 -0800 (PST) Sender: sobomax@sippysoft.com Received: by 10.27.218.12 with HTTP; Wed, 2 Mar 2016 09:06:37 -0800 (PST) In-Reply-To: <20160302115707.GF67250@kib.kiev.ua> References: <20160302095339.GB67250@kib.kiev.ua> <20160302115707.GF67250@kib.kiev.ua> Date: Wed, 2 Mar 2016 09:06:37 -0800 X-Google-Sender-Auth: D9JK8-J6l3nQ0qjjHKZ6cMr5PvQ Message-ID: Subject: Re: Process stuck in "vnread" From: Maxim Sobolev To: Konstantin Belousov Cc: Kirk McKusick , peter@holm.cc, fs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 
2.1.20
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 02 Mar 2016 17:06:39 -0000

Konstantin, this is nullfs mounted over UFS, and the nullfs points to part of the ZFS tree. I am not sure whether that is what you are asking about.

storage/builder on /builder (zfs, local, nfsv4acls)
md0 vnode 3200M /builder/tmp/sspicd_tmp.ufs
/dev/md0 on /builder/mnt (ufs, asynchronous, local, noatime)
/builder/usr/ports-bitbucket on /builder/mnt/usr/ports (nullfs, local)

So the stuck process is effectively copying a file from /builder/mnt/usr/local/share/automake-1.15/compile to /builder/usr/ports-bitbucket/SOMETHING/./compile, run by a process chrooted into /builder/mnt, and it could be stuck in either the read path or the write path. However, looking at the full kernel side of the stack trace of that cp(1), I'd say it's probably the latter, as the write would have to traverse the top-level vfs/ufs first, then the nullfs layer, and then zfs. Neither of the last two is compiled into the kernel, so there is no proper traceback for them. The nullfs mount is used to give the chroot access to the ZFS tree on the upper level, i.e. /builder/usr.

Unfortunately I cannot find a way to figure out the specific system call that cp got stuck in. Attempting to attach gdb causes gdb to hang in turn. So unless somebody has other ideas on how to get some useful post-mortem debug information out of this situation, I'll have to restart the box soon to recover it.

I will put your patch in and see if it helps. I'll also compile nullfs statically, so that if it hits again we at least have some post-mortem evidence to work with.
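
Even without module symbols, an undecoded frame address can be attributed to a module by comparing it with the load ranges from the kldstat output quoted earlier in the thread. The following is a minimal sketch of that range check, not a replacement for symbol decoding in kgdb. The base addresses and sizes are copied from that kldstat output; the struct and function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One loaded kernel module: name, load address, and size in bytes. */
struct kld_range {
	const char *name;
	uint64_t base;
	uint64_t size;
};

/* Values copied from the kldstat output quoted earlier in this thread. */
static const struct kld_range klds[] = {
	{ "nullfs.ko", 0xffffffff819bd000ULL, 0xaef8ULL },
	{ "zfs.ko",    0xffffffff819c8000ULL, 0x2fd2f0ULL },
};

/*
 * Return the module whose [base, base + size) range contains addr, or NULL
 * if none does. When offset is not NULL it receives addr - base.
 */
const struct kld_range *
kld_lookup(uint64_t addr, uint64_t *offset)
{
	for (size_t i = 0; i < sizeof(klds) / sizeof(klds[0]); i++) {
		if (addr >= klds[i].base &&
		    addr - klds[i].base < klds[i].size) {
			if (offset != NULL)
				*offset = addr - klds[i].base;
			return (&klds[i]);
		}
	}
	return (NULL);
}

/* Check frame #15 of the cp(1) trace and one frame inside the base kernel. */
void
kld_lookup_demo(void)
{
	uint64_t off = 0;
	const struct kld_range *r = kld_lookup(0xffffffff819f699cULL, &off);

	assert(r != NULL && strcmp(r->name, "zfs.ko") == 0);
	assert(off == 0x2e99cULL);
	/* sched_switch() lives in the kernel proper, so no module matches. */
	assert(kld_lookup(0xffffffff8095244eULL, NULL) == NULL);
}
```

Under these ranges, frame #15 (0xffffffff819f699c) falls inside zfs.ko at offset 0x2e99c. The check only says which module contains the address; naming the function at that offset still requires the module's symbols.
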
----
(kgdb) thread 362
[Switching to thread 362 (Thread 100515)]#0 0xffffffff8095244e in sched_switch ()
(kgdb) bt
#0 0xffffffff8095244e in sched_switch ()
#1 0xffffffff809313b1 in mi_switch ()
#2 0xffffffff8097089a in sleepq_wait ()
#3 0xffffffff80930dd7 in _sleep ()
#4 0xffffffff809b230e in bwait ()
#5 0xffffffff80b511f3 in vnode_pager_generic_getpages ()
#6 0xffffffff80dd1607 in VOP_GETPAGES_APV ()
#7 0xffffffff80b4f59a in vnode_pager_getpages ()
#8 0xffffffff80b30031 in vm_fault_hold ()
#9 0xffffffff80b2f797 in vm_fault ()
#10 0xffffffff80cb5a75 in trap_pfault ()
#11 0xffffffff80cb51dd in trap ()
#12 0xffffffff80c9b122 in calltrap ()
#13 0xffffffff80cb36f1 in copyin ()
#14 0xffffffff80977ddf in uiomove_faultflag ()
#15 0xffffffff819f699c in ?? ()
#16 0xfffffe0468a861a0 in ?? ()
#17 0xfffff80000000000 in ?? ()
#18 0xfffffe0468a861a0 in ?? ()
#19 0xfffff80176b39420 in ?? ()
#20 0x0000000000000001 in ?? ()
#21 0xfffff801ee76f500 in ?? ()
#22 0xfffffe0468a86960 in ?? ()
#23 0x00000001e3a72d80 in ?? ()
#24 0xfffff80176b39420 in ?? ()
#25 0xfffff803e3a72d80 in ?? ()
#26 0xfffffe0468a86960 in ?? ()
#27 0xfffff801881130e8 in ?? ()
#28 0xfffff801ee76f500 in ?? ()
#29 0x0000000000001ca5 in ?? ()
#30 0xfffffe0468a86200 in ?? ()
#31 0xffffffff819f68b2 in ?? ()
#32 0x0000000000001ca5 in ?? ()
#33 0x0000000000001ca5 in ?? ()
#34 0xfffff80188113000 in ?? ()
#35 0xfffffe0468a86960 in ?? ()
#36 0xfffffe0468a86440 in ?? ()
#37 0xffffffff81a90a77 in ?? ()
#38 0xfffff80100000002 in ?? ()
#39 0x0000000181a6c5c2 in ?? ()
#40 0x0000000000000000 in ?? ()

On Wed, Mar 2, 2016 at 3:57 AM, Konstantin Belousov wrote:

> On Wed, Mar 02, 2016 at 03:02:02AM -0800, Maxim Sobolev wrote:
> > About the backtrace, indeed, looks like you are right and some portion of
> > it is not decoded properly, as it's loaded as a kernel module.
The setup > is > > somewhat even more complicated, the /usr/ports is mounted via NULLFS, so > in > > this command: > > > > cp /usr/local/share/automake-1.15/compile ./compile > > > > The target (i.e. ./compile) here is a path on ZFS that is exported via > > NULLFS, while the source is a file on UFS2->md->ZFS. This is probably the > > reason stack trace is incomplete, both zfs.ko and nullfs.ko are loaded as > > modules and the next few frames point towards those. Unfortunately I > cannot > > beat kgdb to read symbols from those .ko's and decode them. > > Is nullfs mount put over ZFS only ? The backtrace you shown cannot > happen for ZFS, since ZFS has its own pager vop. In fact, I would > agree that the backtrace is reasonable for nullfs over UFS upper vnode. > The following patch should fix the 'paging while faulting on uiomove' > issue for nullfs over UFS. > > Peter, could you, please, test the patch ? It is purely nullfs change, > and the most interesting situation is the ups' deadlock, but the whole > set of nullfs tests would be good to check. 
>
> diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c
> index 64e1e29..49bae28 100644
> --- a/sys/fs/nullfs/null_vfsops.c
> +++ b/sys/fs/nullfs/null_vfsops.c
> @@ -199,7 +199,7 @@ nullfs_mount(struct mount *mp)
>      }
>      mp->mnt_kern_flag |= MNTK_LOOKUP_EXCL_DOTDOT;
>      mp->mnt_kern_flag |= lowerrootvp->v_mount->mnt_kern_flag &
> -        MNTK_USES_BCACHE;
> +        (MNTK_USES_BCACHE | MNTK_NO_IOPF | MNTK_UNMAPPED_BUFS);
>      MNT_IUNLOCK(mp);
>      mp->mnt_data = xmp;
>      vfs_getnewfsid(mp);
>

From owner-freebsd-fs@freebsd.org Wed Mar 2 17:39:57 2016
Return-Path:
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C845CAC0F98 for ; Wed, 2 Mar 2016 17:39:57 +0000 (UTC) (envelope-from kostikbel@gmail.com)
Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id B11F319C7 for ; Wed, 2 Mar 2016 17:39:57 +0000 (UTC) (envelope-from kostikbel@gmail.com)
Received: by mailman.ysv.freebsd.org (Postfix) id B05EEAC0F97; Wed, 2 Mar 2016 17:39:57 +0000 (UTC)
Delivered-To: fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AFF41AC0F96 for ; Wed, 2 Mar 2016 17:39:57 +0000 (UTC) (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3013A19C6; Wed, 2 Mar 2016 17:39:56 +0000 (UTC) (envelope-from kostikbel@gmail.com)
Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u22Hdkvv029701 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 2 Mar 2016 19:39:47 +0200 (EET) (envelope-from kostikbel@gmail.com)
DKIM-Filter:
OpenDKIM Filter v2.10.3 kib.kiev.ua u22Hdkvv029701 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u22Hdk9S029700; Wed, 2 Mar 2016 19:39:46 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 2 Mar 2016 19:39:46 +0200 From: Konstantin Belousov To: Maxim Sobolev Cc: Kirk McKusick , peter@holm.cc, fs@freebsd.org Subject: Re: Process stuck in "vnread" Message-ID: <20160302173946.GL67250@kib.kiev.ua> References: <20160302095339.GB67250@kib.kiev.ua> <20160302115707.GF67250@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 Mar 2016 17:39:58 -0000 On Wed, Mar 02, 2016 at 09:06:37AM -0800, Maxim Sobolev wrote: > Konstantin, this is nullfs mounted over UFS and nullfs is pointing over to > the part of the ZFS tree. I am not sure if it's what you are talking about > or not. > > > storage/builder on /builder (zfs, local, nfsv4acls) > > md0 vnode 3200M /builder/tmp/sspicd_tmp.ufs > > /dev/md0 on /builder/mnt (ufs, asynchronous, local, noatime) > /builder/usr/ports-bitbucket on /builder/mnt/usr/ports (nullfs, local) > > So, stuck process refers to file effectively being copied over from > /builder/mnt/usr/local/share/automake-1.15/compile to > /builder/usr/ports-bitbucket/SOMETHING/./compile by the process chrooted > into /builder/mnt, and it could be either in the read path or in the write > path. 
However looking at the full kernel side of stack trace of that cp(1), > I'd say it's probably the latter, as this would have to traverse through > top level vfs/ufs first, to nullfs layer and then via zfs, none of the last > two is compiled in so that there is no proper traceback. The nullfs mount > is used to allow it accessing ZFS tree on the upper level, i.e. > /builder/usr. > > Unfortunately I cannot find a way to figure out specific system call that > cp got stuck in. Attempting to attach gdb causes gdb to hang in turn. So > unless somebody got any other ideas on how to get some useful post-mortem > debug out of this situation I'll have to restart the box soon to recover it. > > I will put your patch in and see if it helps. I'd also compile nullfs > statically, so at least if it hits again we have some post-mortem evidence > to work with. No, my patch would not help you. Your problem is hung ZFS. From owner-freebsd-fs@freebsd.org Wed Mar 2 22:54:56 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 77640AC127F for ; Wed, 2 Mar 2016 22:54:56 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: from mail-wm0-x22b.google.com (mail-wm0-x22b.google.com [IPv6:2a00:1450:400c:c09::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1273A1E7D for ; Wed, 2 Mar 2016 22:54:56 +0000 (UTC) (envelope-from sobomax@sippysoft.com) Received: by mail-wm0-x22b.google.com with SMTP id l68so9295182wml.1 for ; Wed, 02 Mar 2016 14:54:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sippysoft-com.20150623.gappssmtp.com; s=20150623; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc; bh=nU3osJWbyQOM357JpDpoOF6Va6vl07ADZaiAHO+nnck=; 
b=mytrZUcSgfp7C4F2856sRQWppVsPiJSquT65Z/EFWUjA2mlZfZalfIoeZ5xeRUdedg xredgRu+44IWQtWqxcMvv2+20B4dLPiocLV0FJVLjGW4Xb6cxMLDHcqJUPbH977wg5PI 37+bnYxGc1qI0PLXtNqExNUysDrjBhuIlUU+nB/aAlYWFinJAsusPoPA56LBWuNr2J+k VXDKTPDPelEIbKYLy7LR4YHH1pCW8Lzb0R+LWf55xmQrCdZ2GwNA1vZkAPYYqQxky+y+ 9vSgIlAn2nizkC4Tl125zxWEjz1QoTmy9Yc+dzhnKywc7gg0DAGSGXzwH2E6sPATFxxJ dsLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc; bh=nU3osJWbyQOM357JpDpoOF6Va6vl07ADZaiAHO+nnck=; b=i+Vu6txvr7hwc28L7XgOPdjNAxw5SVbribxHHuGxkuZQEVp8gG2yBLjEpF+RegIGMP FRRsvfGZ865w6NQMJI5FCQGPW4BsGLLFMxFBBbLLVPmojGJFGHMInjasMOEY/YJESpxi Joq/d+9OzcEpoe8UQWz/l630toCXg4zJ330weuavWF7fBRrsfyo1b1b0cMNoq/HletLm BqzGFKzCZIjd0BfLO6RulCQThZmiztPR/xb21+BFIwhWBk+TbiK7GC6uP/0ZQS+MyT6K VksDiRbsc13LN5GK/JjNQMXskoJw9DC0J/5scvKJQpdTObqojdRjSVw1y46I8powP2x/ /Iow== X-Gm-Message-State: AD7BkJKa2s4jdSP0s2L6sFpcIva9hIPT5HMIH51eHxYDKJ8FAFqaE9jM16dSOXWkdBzNToKCN9/xwnCth5SduiLM MIME-Version: 1.0 X-Received: by 10.28.148.16 with SMTP id w16mr2339385wmd.90.1456959294472; Wed, 02 Mar 2016 14:54:54 -0800 (PST) Sender: sobomax@sippysoft.com Received: by 10.27.218.12 with HTTP; Wed, 2 Mar 2016 14:54:54 -0800 (PST) In-Reply-To: References: Date: Wed, 2 Mar 2016 14:54:54 -0800 X-Google-Sender-Auth: vlc9Hpm8VViwLMGy3RyDLj1nYAo Message-ID: Subject: Re: Process stuck in "vnread" From: Maxim Sobolev To: Ronald Klop Cc: stable@freebsd.org, freebsd-fs@freebsd.org, Kirk McKusick , kib@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 Mar 2016 22:54:56 -0000 Thanks Ronald, I think this is at least a possibility that it's related, I've bumped myself up to the latest FreeBSD 
10.3-BETA3 (svn revision 296326) and compiled nullfs statically, so I'll run
those builds and see how it goes.

-Max

On Wed, Mar 2, 2016 at 1:32 AM, Ronald Klop wrote:

> Hello,
>
> Would it be possible this has to do with the resolved 'system hangs when
> using ZFS caused by VFS' in 10.3-BETA3?
>
> https://lists.freebsd.org/pipermail/freebsd-stable/2016-February/084238.html
>
> Regards,
> Ronald.
>
>
> On Wed, 02 Mar 2016 10:12:31 +0100, Maxim Sobolev wrote:
>
>> Hi, I've encountered cp(1) process stuck in the vnread state on one of my
>> build machines that got recently upgraded to 10.3.
>>
>> 0 79596 1 0 20 0 17092 1396 wait I 1 0:00.00
>> /bin/sh /usr/local/bin/autoreconf -f -i
>> 0 79602 79596 0 52 0 41488 9036 wait I 1 0:00.07
>> /usr/local/bin/perl -w /usr/local/bin/autoreconf-2.69 -f -i
>> 0 79639 79602 0 72 0 0 0 - Z 1 0:00.27
>>
>> 0 79762 79602 0 20 0 17092 1396 wait I 1 0:00.00
>> /bin/sh /usr/local/bin/automake --add-missing --copy --force-missing
>> 0 79768 79762 0 52 0 49736 13936 wait I 1 0:00.11
>> /usr/local/bin/perl -w /usr/local/bin/automake-1.15 --add-missing --copy
>> --force-missing
>> 0 79962 79768 0 20 0 12368 1024 vnread DL 1 0:00.00
>> cp /usr/local/share/automake-1.15/compile ./compile
>>
>> I am not sure if it's related to that OS version upgrade, but I have not
>> seen any such issues on the same machine in 2-3 years running essentially
>> the same build process with version 9.x, 10.0, 10.1 and 10.2.
>>
>> $ uname -a
>> FreeBSD van01.sippysoft.com 10.3-PRERELEASE FreeBSD 10.3-PRERELEASE #1
>> 80de3e2(master)-dirty: Tue Feb 2 12:19:57 PST 2016
>> sobomax@abc.sippysoft.com:
>> /usr/obj/usr/home/sobomax/projects/freebsd103/sys/ABC
>> amd64
>>
>> The kernel stack trace is:
>>
>> (kgdb) thread 360
>> [Switching to thread 360 (Thread 100515)]#0 0xffffffff8095244e in
>> sched_switch ()
>> (kgdb) bt
>> #0 0xffffffff8095244e in sched_switch ()
>> #1 0xffffffff809313b1 in mi_switch ()
>> #2 0xffffffff8097089a in sleepq_wait ()
>> #3 0xffffffff80930dd7 in _sleep ()
>> #4 0xffffffff809b230e in bwait ()
>> #5 0xffffffff80b511f3 in vnode_pager_generic_getpages ()
>> #6 0xffffffff80dd1607 in VOP_GETPAGES_APV ()
>> #7 0xffffffff80b4f59a in vnode_pager_getpages ()
>> #8 0xffffffff80b30031 in vm_fault_hold ()
>> #9 0xffffffff80b2f797 in vm_fault ()
>> #10 0xffffffff80cb5a75 in trap_pfault ()
>> #11 0xffffffff80cb51dd in trap ()
>> #12 0xffffffff80c9b122 in calltrap ()
>> #13 0xffffffff80cb36f1 in copyin ()
>> #14 0xffffffff80977ddf in uiomove_faultflag ()
>>
>> The FS stack configuration is somewhat unique, so I am not sure if I am
>> hitting some rare race condition or lock ordering issues specific to that.
>> It's basically ZFS (ZRAID) on top of a pair of SATA SSDs with a big file on
>> that FS attached via md(4) and UFS2 on that md(4). The build itself runs in
>> chroot with that UFS2 fs as its primary root.
>> >> Just maybe additional bit of info, attempting to list the directory with >> that UFS image also got my bash process stuck in "zfs" state, backtrace >> from that is: >> >> (kgdb) thread 353 >> [Switching to thread 353 (Thread 100508)]#0 0xffffffff8095244e in >> sched_switch () >> (kgdb) bt >> #0 0xffffffff8095244e in sched_switch () >> #1 0xffffffff809313b1 in mi_switch () >> #2 0xffffffff8097089a in sleepq_wait () >> #3 0xffffffff809069ad in sleeplk () >> #4 0xffffffff809060e0 in __lockmgr_args () >> #5 0xffffffff809b8b7c in vop_stdlock () >> #6 0xffffffff80dd0a3b in VOP_LOCK1_APV () >> #7 0xffffffff809d6d23 in _vn_lock () >> #8 0xffffffff81a8c9cd in ?? () >> #9 0x0000000000000000 in ?? () >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> https://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >> > > From owner-freebsd-fs@freebsd.org Thu Mar 3 02:20:31 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5AB2AAC2DDA for ; Thu, 3 Mar 2016 02:20:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4B1CC180E for ; Thu, 3 Mar 2016 02:20:31 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id u232KVrS017267 for ; Thu, 3 Mar 2016 02:20:31 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 207667] Reproducable kernel panics whilst scrubbing a zvol pool... 
Date: Thu, 03 Mar 2016 02:20:31 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 9.3-RELEASE
X-Bugzilla-Keywords:
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: michelle@sorbs.net
X-Bugzilla-Status: New
X-Bugzilla-Resolution:
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-bugs@FreeBSD.org
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: cc
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 03 Mar 2016 02:20:31 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=207667

Michelle Sullivan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |freebsd-fs@FreeBSD.org

-- 
You are receiving this mail because:
You are on the CC list for the bug.

From owner-freebsd-fs@freebsd.org Thu Mar 3 20:48:00 2016
Return-Path:
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9F9A4A9495A for ; Thu, 3 Mar 2016 20:48:00 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 922E4A17 for ; Thu, 3 Mar 2016 20:48:00 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org)
Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with
    ESMTP id u23Km0k3050012 for ; Thu, 3 Mar 2016 20:48:00 GMT (envelope-from bugzilla-noreply@freebsd.org)
From: bugzilla-noreply@freebsd.org
To: freebsd-fs@FreeBSD.org
Subject: [Bug 207464] Panic when destroying ZFS snapshot on boot filesystem
Date: Thu, 03 Mar 2016 20:48:00 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-STABLE
X-Bugzilla-Keywords:
X-Bugzilla-Severity: Affects Many People
X-Bugzilla-Who: dustinwenz@ebureau.com
X-Bugzilla-Status: New
X-Bugzilla-Resolution:
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 03 Mar 2016 20:48:00 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=207464

--- Comment #4 from dustinwenz@ebureau.com ---
I think this should be easy for anyone to reproduce at this point. I've
confirmed the panic occurs consistently even after a clean install from the
official 10.2 memstick image, using both UFS and ZFS boot on any new zpool I've
created.

-- 
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-fs@freebsd.org Fri Mar 4 00:51:10 2016
Return-Path:
Delivered-To: freebsd-fs@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2E57AA94C93 for ; Fri, 4 Mar 2016 00:51:10 +0000 (UTC) (envelope-from zbeeble@gmail.com)
Received: from mail-yk0-x231.google.com (mail-yk0-x231.google.com [IPv6:2607:f8b0:4002:c07::231]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E50C2B1D for ; Fri, 4 Mar 2016 00:51:09 +0000 (UTC) (envelope-from zbeeble@gmail.com)
Received: by mail-yk0-x231.google.com with SMTP id v186so16929296yke.2 for ; Thu, 03 Mar 2016 16:51:09 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to; bh=gnHPHUHFlMX2+jaMsGYByrdfCProdfOWJ/lO1tkmWnI=; b=ZvOhxISDfYnLOjcj0hplzYA6Oa0v3luVIn7VWfmxeKx+gZ5NREfQAbNSHHqhu5GPW+ AZuFRWADvMNT6MBpaMJhaLkRgxX6e+HfytPk2ulB5GjSOZ7aq2yod1hPy7Ivs+w+N5Zb yMXuCkpx1dFOz05ossl9RqSul9ohWxkd1hME9oFO5Ug4samw7xQxjptyFNegoXlmqjid +SzMSdPpVOq2Mq09TbcTSsc+URYooaPSg3kuFFXLEmnUwwGhfzH1aG6DRhhG3wtVPAvY BsUsAbVd7vEY5siURBnYAh0jyYoXtrwpEjclkTC0IPei8LWjH7zQg3IxJOlkQi+pEJI0 /Cnw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:date:message-id:subject:from:to; bh=gnHPHUHFlMX2+jaMsGYByrdfCProdfOWJ/lO1tkmWnI=; b=l8jFSbEESfdpVB50+A49PxXzj8Gdz7rbIcYw+VM+GgmhnP1g19uHoV3O2wxlzribbt Ta1qEg/9b5p+69UAZez1ELsqLvoz1foqL8O8PL9m0NjGuc3HionO/ekN7v73DnJ8hSdn 3c6sKZmVhbJSkIqE1j/HFIB3QPGgz7oEpLYf8eBsBg3ulSlh0UW0K3cSWTEulh04Z8DI nhEtlrEIbgiYeVyE+ftmsmbYfNn/zTr/S82Ynfzaw+5mYnVQt2nRrHl+zHcSlhFREDir bxOfoTeXt3KasIyScfPST1qubV+BFCvRMP/RxqBBon7tePivRLgdEao2EMBIsKYzu+to sxPw==
X-Gm-Message-State: AD7BkJI8hrgAR9gpr5R0XIoX0lb6HQps4W/LGDOX3e+Bq2cWITrVODMfvakMKcnXk7/2tY1HNRDLPembln1r1A==
MIME-Version: 1.0
X-Received: by 10.37.47.76 with SMTP id v73mr3174764ybv.42.1457052669069; Thu, 03 Mar 2016 16:51:09 -0800 (PST)
Received: by 10.37.27.130 with HTTP; Thu, 3 Mar 2016 16:51:09 -0800 (PST)
Date: Thu, 3 Mar 2016 19:51:09 -0500
Message-ID:
Subject: hw.mfi.allow_cam_disk_passthrough missing?
From: Zaphod Beeblebrox
To: freebsd-fs
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.21
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 04 Mar 2016 00:51:10 -0000

so... I see in my source (10.2-STABLE):

static int mfi_allow_disks = 0;
TUNABLE_INT("hw.mfi.allow_cam_disk_passthrough", &mfi_allow_disks);
SYSCTL_INT(_hw_mfi, OID_AUTO, allow_cam_disk_passthrough, CTLFLAG_RD,
&mfi_allow_disks, 0, "event message locale");

... sysctl hw.mfi.allow_cam_disk_passthrough:

[1:6:306]root@virtual:~> sysctl hw.mfi.allow_cam_disk_passthrough
sysctl: unknown oid 'hw.mfi.allow_cam_disk_passthrough': No such file or
directory

even though this is a generic kernel that contains mfi and the card in
question probes up just fine.

What gives?
From owner-freebsd-fs@freebsd.org Fri Mar 4 00:55:02 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AA5B3A94D07 for ; Fri, 4 Mar 2016 00:55:02 +0000 (UTC) (envelope-from chris@stankevitz.com) Received: from mango.stankevitz.com (mango.stankevitz.com [208.79.93.194]) by mx1.freebsd.org (Postfix) with ESMTP id 9C902C54 for ; Fri, 4 Mar 2016 00:55:02 +0000 (UTC) (envelope-from chris@stankevitz.com) Received: from Chriss-MacBook-Pro.local (209-203-101-124.static.twtelecom.net [209.203.101.124]) (using TLSv1.2 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mango.stankevitz.com (Postfix) with ESMTPSA id 1CA995C43D for ; Thu, 3 Mar 2016 16:54:56 -0800 (PST) From: Chris Stankevitz Subject: Processes freeze when they try to access particular ZFS directory To: FreeBSD Filesystems Message-ID: <56D8DCDF.9050600@stankevitz.com> Date: Thu, 3 Mar 2016 16:54:55 -0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Mar 2016 00:55:02 -0000 I have a FreeBSD 10.1-RELEASE-p13 machine on an air-gapped network. When I perform an 'ls -l' on a particular directory (with ~5 files), the ls process freezes and does not return. kill -9 and closing the shell doesn't make the process go away either. Samba has some of these files open and I suspect the samba processes are stuck too. Something similar happened a month ago that required a "pull the plug" reboot as the FS was in such a state that it would not unmount. 
Please [quickly -- i'll probably have to reboot soon] let me know if you would like to see any debug output. Last line of 'truss ls -l' (before the freeze) is: lpathconf(...) = 1 (0x1) The frozen 'ls -l' process is in the DX+ state. zpool status is happy The pool has a 24 TB capacity. The problem FS is using 1TB with ~1.5 million files. Chris From owner-freebsd-fs@freebsd.org Fri Mar 4 09:16:05 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 142D19DA3AB for ; Fri, 4 Mar 2016 09:16:05 +0000 (UTC) (envelope-from killing@multiplay.co.uk) Received: from mail-wm0-x229.google.com (mail-wm0-x229.google.com [IPv6:2a00:1450:400c:c09::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B91698B5 for ; Fri, 4 Mar 2016 09:16:04 +0000 (UTC) (envelope-from killing@multiplay.co.uk) Received: by mail-wm0-x229.google.com with SMTP id n186so25449965wmn.1 for ; Fri, 04 Mar 2016 01:16:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=multiplay-co-uk.20150623.gappssmtp.com; s=20150623; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-transfer-encoding; bh=gJXCOfEFic7krLnsWGWwSyM/6puaZnJXN2N+MYjiqCs=; b=mqvUSv6cJ3a7ri+kI8XhiyNe7+wFZ8lbuXY6ySO2bsw/z09PzfOG2ev6/yOaguBnzX H/1AYh7GmmWDbmllohMj8PoVsD+serjMa0eXquoyq70twi8sE/X8Ubd5HK04YHDgnEgy IIOy8IMjFj7VsU7kncGn0GDZD2WlK/7OSdcreP5kdTDAl5kNApgq5Udrpbo6m+t8j+fn sUU29eGGrhJxcJ0MlX5n7N+88WLtkEvAwXjNcX0Fb2COWJauf+8AB2J5KzvzUaciH5ox SvNEMQ4Sn4zfgvDEnp+bFzqMW6AXrzBpvjWa7ytwRcFsT/Oue4h1x87PuK9mNDIWuTzI t2VQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:from:message-id:date 
:user-agent:mime-version:in-reply-to:content-transfer-encoding; bh=gJXCOfEFic7krLnsWGWwSyM/6puaZnJXN2N+MYjiqCs=; b=hrdV5Do5undOoTSBH4hsfKjMisV3qWAndvVIJCt87nZoUeCM8OrQT4M07Uv+R4rvdL 6NpSjqhwBe8cTs/3//eAgNpaarj79+V/5lyyQi41MzsEgdu9MPiIjScjB6RfA8GyeDXL 0Ee6tzlaoBJkmzY3C7rXfw/t+VQP7uNmspCoDdjizIJlPPsj3J5367TTGIAM3FfPhm8L tkIARq+Ot8seNxvJfUeGu/4LQ282lNKdhrhtIiZPZjPvB4m3nUnZx+SgVMeO8+Kzhxpt suYRnIxc3pkxvFkGTJMP+iw6Gfd3x+3ui2XQkoUAVA0OtJ9bBTUGWNatWBOHZ82dvoHD ypWw== X-Gm-Message-State: AD7BkJIQe5zaMjAIGLdO/jKnSc4JsYr+D9U+jN2NdzFl3NOjPCUFs640wdFrCSnDcCYWz0mU X-Received: by 10.28.87.65 with SMTP id l62mr4102370wmb.102.1457082963208; Fri, 04 Mar 2016 01:16:03 -0800 (PST) Received: from [10.10.1.58] (liv3d.labs.multiplay.co.uk. [82.69.141.171]) by smtp.gmail.com with ESMTPSA id 3sm2371763wmp.14.2016.03.04.01.16.01 for (version=TLSv1/SSLv3 cipher=OTHER); Fri, 04 Mar 2016 01:16:01 -0800 (PST) Subject: Re: hw.mfi.allow_cam_disk_passthrough missing? To: freebsd-fs@freebsd.org References: From: Steven Hartland Message-ID: <56D95251.5080800@multiplay.co.uk> Date: Fri, 4 Mar 2016 09:16:01 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Mar 2016 09:16:05 -0000 Do you have the mfip module loaded? This is separate from mfi. If you want direct drives you may be better off using mrsas by adding the following to /boot/loader.conf: hw.mfi.mrsas_enable="1" On 04/03/2016 00:51, Zaphod Beeblebrox wrote: > so... 
I see in my source (10.2-STABLE): > > static int mfi_allow_disks = 0; > TUNABLE_INT("hw.mfi.allow_cam_disk_passthrough", &mfi_allow_disks); > SYSCTL_INT(_hw_mfi, OID_AUTO, allow_cam_disk_passthrough, CTLFLAG_RD, > &mfi_allow_disks, 0, "event message locale"); > > ... sysctl hw.mfi.allow_cam_disk_passthrough: > > [1:6:306]root@virtual:~> sysctl hw.mfi.allow_cam_disk_passthrough > sysctl: unknown oid 'hw.mfi.allow_cam_disk_passthrough': No such file or > directory > > even though this is a generic kernel that contains mfi and the card in > question probes up just fine. > > What gives? > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@freebsd.org Fri Mar 4 11:51:56 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 621259DA26E for ; Fri, 4 Mar 2016 11:51:56 +0000 (UTC) (envelope-from artem@artem.ru) Received: from smtp38.i.mail.ru (smtp38.i.mail.ru [94.100.177.98]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C38F778F for ; Fri, 4 Mar 2016 11:51:55 +0000 (UTC) (envelope-from artem@artem.ru) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Transfer-Encoding:Content-Type:MIME-Version:Date:Message-ID:Subject:From:To; bh=uYGcPUGgc+tj19ej4WsOVEx/SZDkjoX+nmhhcYelFCY=; b=Fty19pmr0u9UvQMkLwvQEz//Hl4Z6GBFUBt711crqaEny9t+vG0GvRRRte/pkD25T6f8QyG80t0nB8Cm0u8W9uZNYwO4SQ8nbUCfM5z7RZIGEtdRdtgBRcfmZPSNW+kNC51LwTKEmBudQ+19wbWj08INqWfMi11Nc6PvBa3glvU=; Received: from [109.188.127.16] (port=57774 helo=[192.168.0.12]) by smtp38.i.mail.ru with esmtpa (envelope-from ) id 1aboGn-0000LL-KI for freebsd-fs@freebsd.org; 
    Fri, 04 Mar 2016 14:51:46 +0300
To: freebsd-fs@freebsd.org
From: Artem Kuchin
Subject: Huge quota.group file
Message-ID: <56D976F5.4090605@artem.ru>
Date: Fri, 4 Mar 2016 14:52:21 +0300
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Mras: Ok
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 04 Mar 2016 11:51:56 -0000

Hello!

Today I noticed for the second time that quotarep was running at 100% of
CPU for a crazy amount of time. I killed it, then checked the /quota.group
file. It is over 40GB in size.

I tried googling this problem and found only that quota.group becomes
crazy huge when there are files with an unknown group, and that I should
find such files using

find / -nogroup >filelist_group.txt

fix the group on them and rerun checkquota.

But I cannot do that, because I am running several jails and using quota
from the host, while the jails have their own groups and users.

Still, I don't understand why it behaves so crazily. Any comments or
suggestions on this?

Also, how do I reinit quota.group without rebooting?
FreeBSD XXX 10.2-STABLE FreeBSD 10.2-STABLE #1 r293034M: Sat Jan 2 14:55:41 MSK 2016 XXX@XXX:/usr/obj/usr/src/sys/OMNI amd64 Artem From owner-freebsd-fs@freebsd.org Fri Mar 4 05:14:43 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C30469DAEF3; Fri, 4 Mar 2016 05:14:43 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-ig0-x231.google.com (mail-ig0-x231.google.com [IPv6:2607:f8b0:4001:c05::231]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 892AAEC1; Fri, 4 Mar 2016 05:14:43 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: by mail-ig0-x231.google.com with SMTP id g6so3646657igt.1; Thu, 03 Mar 2016 21:14:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc; bh=5Z1gJ4aEiPZcVdlW3BJaif1Mvu3T9EWGyCLZ7Gbzpp4=; b=N1raiAZ975lVT1wzgSxmNLnCs7Gs+dwVFMaWoOahNGdype7l5PioOwwv9Z2opSYMe+ YSfyZEhz2L+OXrlGp/Fsxh6CZ84sF7yZTn8aNs5DcXwdRbphzeczvqc5pC4XNbytQpqx lQdaqtKz2W06xdFY8Dk5LAqez15FJhXqMEdIohB8BVKO+5lbJs7FwPz6N5UGf2YEGlKx 3G7J/bDspPj/l61e/Te2ttggP6vU/r4Y1a9nC+cjOweW4HOSXNMTxKhQYhii4eDUyq0p t7yrV8xRvwyTBe5i5GRHMO0HroyjJEm2MBwC1mep6xigzNiBKgfghcKB9KQjrPFSrAKi GOrQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc; bh=5Z1gJ4aEiPZcVdlW3BJaif1Mvu3T9EWGyCLZ7Gbzpp4=; b=BeSdAkvHVEcuJOHa97ubZU5vAa2v95fc8TTiTVk06cKC/DNRx4eKsLdE7Y9mMAHbvj 36vyzrB4/wnV8tM6Ooa8T1JLopTa8Py2knZUBuHRTsJn+qmxpkQO2dGY9T2MX5Qv+/iT OUkvjuxTKWUAzyfDMEI66ffGKymTENOIVmgMiBWu8lCJRiKRmpFjBk0fSdd9SJIcS4zR 
    o6vS0jKboaS2IRcmvD7GRw8UvCL9AC5VPEh7NJMDZkB4Uk5Chrcl4tZhe+LUZmo3Bbxn nynY9FfNgquTxTDI5LXf/w8qW/ecr25JLdmiKhIPi9SekeU/a2Bn06TVnUnWBUMdeKlJ ZrKQ==
X-Gm-Message-State: AD7BkJKbJCtJKFGeP12mCnxu/0XcUSyRal9MvwRvn0SEeHVnyMp708XnW8YQltv0vF3z/sJFTZDUC2H90XVaYA==
MIME-Version: 1.0
X-Received: by 10.50.70.39 with SMTP id j7mr2851072igu.40.1457068482858; Thu, 03 Mar 2016 21:14:42 -0800 (PST)
Received: by 10.107.140.129 with HTTP; Thu, 3 Mar 2016 21:14:42 -0800 (PST)
Received: by 10.107.140.129 with HTTP; Thu, 3 Mar 2016 21:14:42 -0800 (PST)
In-Reply-To:
References: <95563acb-d27b-4d4b-b8f3-afeb87a3d599@me.com> <56D87784.4090103@broken.net>
Date: Thu, 3 Mar 2016 21:14:42 -0800
Message-ID:
Subject: Re: [zfs] an interesting survey -- the zpool with most disks you have ever built
From: Freddie Cash
To: zfs@lists.illumos.org
Cc: omnios-discuss , illumos-dev , "freebsd-fs@FreeBSD.org" , "zfs-devel@freebsd.org" , Discussion list for OpenIndiana , developer , "developer@lists.open-zfs.org" , "zfs-discuss@list.zfsonlinux.org" , "smartos-discuss@lists.smartos.org"
X-Mailman-Approved-At: Fri, 04 Mar 2016 12:12:21 +0000
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.21
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 04 Mar 2016 05:14:44 -0000

On Mar 3, 2016 8:36 PM, "Fred Liu" wrote:
>
> Hi,
>
> Today when I was reading Jeff's new nuclear weapon -- DSSD D5's CUBIC RAID introduction,
> the interesting survey -- the zpool with most disks you have ever built popped in my brain.

We have two backup servers with 90 drives each (2x 45-bay JBODs connected
to a head unit using LSI 9211-8e controllers) configured into 6-disk raidz2
vdevs in a single storage pool.

They'll support 4x JBODs each without daisy-chaining, although we don't
have plans for doing that for another couple years.
From owner-freebsd-fs@freebsd.org Fri Mar 4 13:47:33 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6BCAC9DA104 for ; Fri, 4 Mar 2016 13:47:33 +0000 (UTC) (envelope-from chris@stankevitz.com) Received: from mango.stankevitz.com (mango.stankevitz.com [208.79.93.194]) by mx1.freebsd.org (Postfix) with ESMTP id 5CD66E6B for ; Fri, 4 Mar 2016 13:47:32 +0000 (UTC) (envelope-from chris@stankevitz.com) Received: from Chriss-MacBook-Pro.local (209-203-101-124.static.twtelecom.net [209.203.101.124]) (using TLSv1.2 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mango.stankevitz.com (Postfix) with ESMTPSA id C6EFC5C4BB for ; Fri, 4 Mar 2016 05:47:31 -0800 (PST) Subject: Re: Processes freeze when they try to access particular ZFS directory To: FreeBSD Filesystems References: <56D8DCDF.9050600@stankevitz.com> From: Chris Stankevitz Message-ID: <56D991F3.7000408@stankevitz.com> Date: Fri, 4 Mar 2016 05:47:31 -0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <56D8DCDF.9050600@stankevitz.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Mar 2016 13:47:33 -0000 On 3/3/16 4:54 PM, Chris Stankevitz wrote: > I have a FreeBSD 10.1-RELEASE-p13 machine on an air-gapped network. When > I perform an 'ls -l' on a particular directory (with ~5 files), the ls > process freezes and does not return. kill -9 and closing the shell > doesn't make the process go away either. The file was in a nullfs mount of a ZFS in a jail. Apparently that is an issue so I stopped doing it. 
Chris From owner-freebsd-fs@freebsd.org Fri Mar 4 15:23:40 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 396389DB739 for ; Fri, 4 Mar 2016 15:23:40 +0000 (UTC) (envelope-from pho@holm.cc) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 24C8FB12 for ; Fri, 4 Mar 2016 15:23:40 +0000 (UTC) (envelope-from pho@holm.cc) Received: by mailman.ysv.freebsd.org (Postfix) id 1FD6F9DB736; Fri, 4 Mar 2016 15:23:40 +0000 (UTC) Delivered-To: fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1F6D49DB735 for ; Fri, 4 Mar 2016 15:23:40 +0000 (UTC) (envelope-from pho@holm.cc) Received: from relay01.pair.com (relay01.pair.com [209.68.5.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EDDE7B11 for ; Fri, 4 Mar 2016 15:23:39 +0000 (UTC) (envelope-from pho@holm.cc) Received: from x2.osted.lan (87-58-223-204-dynamic.dk.customer.tdc.net [87.58.223.204]) by relay01.pair.com (Postfix) with ESMTP id 498F6D00621; Fri, 4 Mar 2016 10:23:32 -0500 (EST) Received: from x2.osted.lan (localhost [127.0.0.1]) by x2.osted.lan (8.14.9/8.14.9) with ESMTP id u24FNTim075307 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Fri, 4 Mar 2016 16:23:29 +0100 (CET) (envelope-from pho@x2.osted.lan) Received: (from pho@localhost) by x2.osted.lan (8.14.9/8.14.9/Submit) id u24FNSQm075306; Fri, 4 Mar 2016 16:23:28 +0100 (CET) (envelope-from pho) Date: Fri, 4 Mar 2016 16:23:28 +0100 From: Peter Holm To: Konstantin Belousov Cc: Maxim Sobolev , Kirk McKusick , fs@freebsd.org Subject: Re: Process stuck in "vnread" Message-ID: 
<20160304152328.GA71951@x2.osted.lan> References: <20160302095339.GB67250@kib.kiev.ua> <20160302115707.GF67250@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160302115707.GF67250@kib.kiev.ua> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Mar 2016 15:23:40 -0000 On Wed, Mar 02, 2016 at 01:57:07PM +0200, Konstantin Belousov wrote: > On Wed, Mar 02, 2016 at 03:02:02AM -0800, Maxim Sobolev wrote: > > About the backtrace, indeed, looks like you are right and some portion of > > it is not decoded properly, as it's loaded as a kernel module. The setup is > > somewhat even more complicated, the /usr/ports is mounted via NULLFS, so in > > this command: > > > > cp /usr/local/share/automake-1.15/compile ./compile > > > > The target (i.e. ./compile) here is a path on ZFS that is exported via > > NULLFS, while the source is a file on UFS2->md->ZFS. This is probably the > > reason stack trace is incomplete, both zfs.ko and nullfs.ko are loaded as > > modules and the next few frames point towards those. Unfortunately I cannot > > beat kgdb to read symbols from those .ko's and decode them. > > Is nullfs mount put over ZFS only ? The backtrace you shown cannot > happen for ZFS, since ZFS has its own pager vop. In fact, I would > agree that the backtrace is reasonable for nullfs over UFS upper vnode. > The following patch should fix the 'paging while faulting on uiomove' > issue for nullfs over UFS. > > Peter, could you, please, test the patch ? It is purely nullfs change, > and the most interesting situation is the ups' deadlock, but the whole > set of nullfs tests would be good to check. 
> > diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c > index 64e1e29..49bae28 100644 > --- a/sys/fs/nullfs/null_vfsops.c I have tested this patch with all of the nullfs scenarios I have, multiple times. I also tested the UPS' scenario on nullfs without finding any problems. - Peter From owner-freebsd-fs@freebsd.org Fri Mar 4 21:12:33 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 92E339DA486 for ; Fri, 4 Mar 2016 21:12:33 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-yk0-x22a.google.com (mail-yk0-x22a.google.com [IPv6:2607:f8b0:4002:c07::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 52BBBD19 for ; Fri, 4 Mar 2016 21:12:33 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: by mail-yk0-x22a.google.com with SMTP id n145so1042587ykb.1 for ; Fri, 04 Mar 2016 13:12:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc; bh=pwUYM3so1cavIUn4yVVFPGSba1++ELDcapV1AJCxtfY=; b=z67lRNAJPYIMnkKFaHnPqyyVFNMshV8fA+LT1HNA8tXZHzAeV3nSxF4qkIBLjTcRQZ tV2NvcOPdg+Xk41W9dHs36QcPCqX4RiFM/R0DtOSaPwOUrR7hgsVJx9w97F32afwamDN s/vwLtAapKxnyLSl7MUkA6QD6uW+38vghPyW/fmXJkb0kebuNp/Td9k4FcURLK0v3Oe0 CEcVkxmtHRdt5ARw2ZX0chP+V/pWTmHm7QErhRKknlwSRAj7j2xepE0mt4bTZajY/DGY ATd2VSPEw7zA/r9Ns77RNS3HZlIcEIjHbQvD9zrSmFSS/fuy+TdN/dIDf75JCUoCcSrB 5qeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc; bh=pwUYM3so1cavIUn4yVVFPGSba1++ELDcapV1AJCxtfY=; b=kNG+vq4vKltsCR6V9gR+w5/+KJY15gN6txk/90cnZOzkmyjOZeMW+Cu1YsamBMplor 
05ZDMlqCfwJYpH1eHyWR+VRaN5/EOpFENef8nZu9EnHFQAyFpIhFPWhH6CPI7BDstZf6 0r5PtrmrqCDis+qbRLT4isN6B5dsBaSLISSz7ogfeyBjTF8hLdz5FAMYaZzjo/gmaArB 0kHcS4+cEeemNBuRGAcvGpe+d7AGyRRqwZGmyxD4QTYU+2S4H7vcqFIAcd66QZhwObvb GmZvW0zvdnnZ8pLmuBuv4J9pfVLAXnKItEfS/QTO8qJCWRfPvwUdfdRvDkkSeGHxrYSi AXYg== X-Gm-Message-State: AD7BkJK/XGadg4CK+VwU/cGAWsXDy9bdGicrYgk6tHRSFSTfAQy61dD5tScx5hOQf92eM7BxCkFkq8PQDvd/Sw== MIME-Version: 1.0 X-Received: by 10.37.64.207 with SMTP id n198mr6067506yba.12.1457125952504; Fri, 04 Mar 2016 13:12:32 -0800 (PST) Received: by 10.37.27.130 with HTTP; Fri, 4 Mar 2016 13:12:32 -0800 (PST) In-Reply-To: <56D95251.5080800@multiplay.co.uk> References: <56D95251.5080800@multiplay.co.uk> Date: Fri, 4 Mar 2016 16:12:32 -0500 Message-ID: Subject: Re: hw.mfi.allow_cam_disk_passthrough missing? From: Zaphod Beeblebrox To: Steven Hartland Cc: freebsd-fs Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.21 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Mar 2016 21:12:33 -0000 OK. Good call on mfip ---> but where's the documentation on mfip? mrsas is not an option as this is a 1078 based card (not supported AFAICT by mrsas). Since I'm here, what's the opinion on mfip? On Fri, Mar 4, 2016 at 4:16 AM, Steven Hartland wrote: > Do you have the mfip module loaded? This is separate from mfi. > > If you want direct drives you may be better off using mrsas by adding the > following to /boot/loader.conf: > hw.mfi.mrsas_enable="1" > > > On 04/03/2016 00:51, Zaphod Beeblebrox wrote: > >> so... I see in my source (10.2-STABLE): >> >> static int mfi_allow_disks = 0; >> TUNABLE_INT("hw.mfi.allow_cam_disk_passthrough", &mfi_allow_disks); >> SYSCTL_INT(_hw_mfi, OID_AUTO, allow_cam_disk_passthrough, CTLFLAG_RD, >> &mfi_allow_disks, 0, "event message locale"); >> >> ... 
sysctl hw.mfi.allow_cam_disk_passthrough: >> >> [1:6:306]root@virtual:~> sysctl hw.mfi.allow_cam_disk_passthrough >> sysctl: unknown oid 'hw.mfi.allow_cam_disk_passthrough': No such file or >> directory >> >> even though this is a generic kernel that contains mfi and the card in >> question probes up just fine. >> >> What gives? >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> https://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >> > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@freebsd.org Fri Mar 4 21:31:45 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8D3699DAAE4 for ; Fri, 4 Mar 2016 21:31:45 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2.sentex.ca [IPv6:2607:f3e0:80:80::2]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "smarthost.sentex.ca", Issuer "smarthost.sentex.ca" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 1D1FE7EF for ; Fri, 4 Mar 2016 21:31:45 +0000 (UTC) (envelope-from mike@sentex.net) Received: from lava.sentex.ca (lava.sentex.ca [IPv6:2607:f3e0:0:5::11]) by smarthost2.sentex.ca (8.15.2/8.15.2) with ESMTPS id u24LVhkL031000 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 4 Mar 2016 16:31:43 -0500 (EST) (envelope-from mike@sentex.net) Received: from [IPv6:2607:f3e0:0:4:5c30:ed1b:e203:c55c] ([IPv6:2607:f3e0:0:4:5c30:ed1b:e203:c55c]) by lava.sentex.ca (8.14.9/8.14.9) with ESMTP id u24LVgJd096217; Fri, 4 Mar 2016 16:31:42 -0500 (EST) (envelope-from mike@sentex.net) Subject: Re: 
hw.mfi.allow_cam_disk_passthrough missing? To: Zaphod Beeblebrox , Steven Hartland References: <56D95251.5080800@multiplay.co.uk> Cc: freebsd-fs From: Mike Tancsa Organization: Sentex Communications Message-ID: <56D9FEB2.9080904@sentex.net> Date: Fri, 4 Mar 2016 16:31:30 -0500 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.78 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Mar 2016 21:31:45 -0000 On 3/4/2016 4:12 PM, Zaphod Beeblebrox wrote: > > Since I'm here, what's the opinion on mfip? I have been using it on a couple of boxes with the 9240-8i controller for some time with very good results. You see the disks via cam. In the box below, I use the disks as JBODs for a large ZFS pool. 
# camcontrol devlist at scbus0 target 8 lun 0 (pass0,da0) at scbus0 target 9 lun 0 (pass1,da1) at scbus0 target 10 lun 0 (pass2,da2) at scbus0 target 11 lun 0 (pass3,da3) at scbus0 target 13 lun 0 (pass4,da4) at scbus0 target 15 lun 0 (pass5,da5) at scbus1 target 0 lun 0 (pass6,da6) at scbus1 target 1 lun 0 (pass7,da7) at scbus1 target 2 lun 0 (pass8,da8) at scbus1 target 3 lun 0 (pass9,da9) at scbus1 target 4 lun 0 (pass10,da10) at scbus1 target 5 lun 0 (pass11,da11) at scbus1 target 6 lun 0 (pass12,da12) smartctl works too smartctl -a /dev/da0 -d sat smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-BETA3 amd64] (local build) Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Western Digital Red Device Model: WDC WD40EFRX-68WT0N0 Serial Number: WD-WCC4E0911515 LU WWN Device Id: 5 0014ee 209be76c0 Firmware Version: 80.00A80 User Capacity: 4,000,787,030,016 bytes [4.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate: 5400 rpm Device is: In smartctl database [for details use: -P show] ATA Version is: ACS-2 (minor revision not indicated) SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is: Fri Mar 4 16:30:04 2016 EST SMART support is: Available - device has SMART capability. 
SMART support is: Enabled # mfiutil show adapter mfi0 Adapter: Product Name: LSI MegaRAID SAS 9240-8i Serial Number: SP34906493 Firmware: 20.13.1-0208 RAID Levels: JBOD, RAID0, RAID1, RAID10 Battery Backup: not present NVRAM: 32K Onboard Memory: 0M Minimum Stripe: 8K Maximum Stripe: 64K ---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-fs@freebsd.org Sat Mar 5 09:53:33 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C4BB1A0A138 for ; Sat, 5 Mar 2016 09:53:33 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B585D262 for ; Sat, 5 Mar 2016 09:53:33 +0000 (UTC) (envelope-from bugzilla-noreply@freebsd.org) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id u259rXOb009386 for ; Sat, 5 Mar 2016 09:53:33 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-fs@FreeBSD.org Subject: [Bug 167109] [zfs] [panic] zfs diff kernel panic Fatal trap 9: general protection fault Date: Sat, 05 Mar 2016 09:53:33 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: 9.0-RELEASE X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: claudius@ambtec.de X-Bugzilla-Status: Closed X-Bugzilla-Resolution: Unable to Reproduce X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: freebsd-fs@FreeBSD.org 
X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: resolution bug_status Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 Mar 2016 09:53:33 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=167109 claudius@ambtec.de changed: What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |Unable to Reproduce Status|In Progress |Closed --- Comment #2 from claudius@ambtec.de --- Never occurred again; I'm running 10.2 now and everything is fine. Bug can be closed. -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-fs@freebsd.org Sat Mar 5 19:54:34 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E7D9C9DBAB9 for ; Sat, 5 Mar 2016 19:54:34 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from dg.fsn.hu (dg.fsn.hu [84.2.225.196]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "dg.fsn.hu", Issuer "dg.fsn.hu" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id A9934BA0 for ; Sat, 5 Mar 2016 19:54:34 +0000 (UTC) (envelope-from bra@fsn.hu) Received: by dg.fsn.hu (Postfix, from userid 1003) id 2E4212E7E; Sat, 5 Mar 2016 20:46:08 +0100 (CET) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR: 4.9824] X-CRM114-CacheID: sfid-20160305_20460_81529E6B X-CRM114-Status: UNSURE (4.9824) This message is 'unsure'; please train it!
X-DSPAM-Result: Whitelisted X-DSPAM-Processed: Sat Mar 5 20:46:08 2016 X-DSPAM-Confidence: 0.9899 X-DSPAM-Probability: 0.0000 X-DSPAM-Signature: 56db3780242171590725600 X-DSPAM-Factors: 27, just, 0.01000, Date*20+46, 0.01000, Received*online.co.hu+[195.228.243.99]), 0.01000, or, 0.01000, Subject*at, 0.01000, an, 0.01000, from, 0.01000, From*"Nagy, Attila" , 0.01000, User-Agent*Mozilla/5.0, 0.01000, says, 0.01000, million, 0.01000, hard, 0.01000, hard, 0.01000, times, 0.01000, a+limitation, 0.01000, Received*Mar+2016, 0.01000, Received*by+dg.fsn.hu, 0.01000, Received*, 0.01000, Date*46+07, 0.01000, Received*Sat+5, 0.01000, To*fs+freebsd.org, 0.01000, Received*from+[IPv6, 0.01000, X-Spambayes-Classification: ham; 0.01 Received: from [IPv6:::1] (japan.t-online.co.hu [195.228.243.99]) by dg.fsn.hu (Postfix) with ESMTPSA id B49F12E7C for ; Sat, 5 Mar 2016 20:46:07 +0100 (CET) To: freebsd-fs@freebsd.org From: "Nagy, Attila" Subject: zfs and st_nlink limit at 32767 Message-ID: <56DB377F.9020205@fsn.hu> Date: Sat, 5 Mar 2016 20:46:07 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 Mar 2016 19:54:35 -0000 Hi, If I create a million hard links to a file, stat -s says it has 32767: $ stat -s 900402.24.t st_dev=1709683738 st_ino=719745 st_mode=0100644 st_nlink=32767 st_uid=1001 st_gid=0 st_rdev=4294967295 st_size=81688 st_atime=1455881393 st_mtime=1455881393 st_ctime=1457206643 st_birthtime=1457206536 st_blksize=81920 st_blocks=67 st_flags=2048 Is this a limitation somewhere which is hard to remove, or just an easily fixable "legacy" from the times, when all filesystems contained this limit? 
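The observation above can be reproduced with a small C program instead of a shell loop. This is a minimal sketch under stated assumptions: `reported_link_count()` is a hypothetical helper, and the `/tmp/nlinktest.XXXXXX` scratch directory is only an example location; to test a particular file system the template would have to point into a directory on that file system.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a scratch file, add `extra` hard links to it, and return the
 * st_nlink value that stat(2) reports for it. Returns -1 if any step
 * fails (for example, link(2) failing with EMLINK). All scratch files
 * are removed before returning.
 */
long
reported_link_count(int extra)
{
	char dir[] = "/tmp/nlinktest.XXXXXX";	/* example scratch location */
	char base[64], name[64];
	struct stat sb;
	long result = -1;
	int fd, i, made = 0;

	if (mkdtemp(dir) == NULL)
		return (-1);
	snprintf(base, sizeof(base), "%s/base", dir);
	fd = open(base, O_CREAT | O_EXCL | O_WRONLY, 0644);
	if (fd == -1) {
		rmdir(dir);
		return (-1);
	}
	close(fd);

	/* Each successful link(2) should raise the link count by one. */
	for (i = 0; i < extra; i++) {
		snprintf(name, sizeof(name), "%s/link.%d", dir, i);
		if (link(base, name) == -1)
			break;
		made++;
	}
	if (made == extra && stat(base, &sb) == 0)
		result = (long)sb.st_nlink;

	/* Remove everything that was created. */
	for (i = 0; i < made; i++) {
		snprintf(name, sizeof(name), "%s/link.%d", dir, i);
		unlink(name);
	}
	unlink(base);
	rmdir(dir);
	return (result);
}
```

With a small count the result should be `extra + 1` (the original name plus the links). On a system showing the behaviour described in this message, a very large count would presumably come back as 32767 instead.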
From owner-freebsd-fs@freebsd.org Sat Mar 5 21:23:19 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9AC00A13FAE for ; Sat, 5 Mar 2016 21:23:19 +0000 (UTC) (envelope-from sodynet1@gmail.com) Received: from mail-lb0-x22c.google.com (mail-lb0-x22c.google.com [IPv6:2a00:1450:4010:c04::22c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 25066211 for ; Sat, 5 Mar 2016 21:23:19 +0000 (UTC) (envelope-from sodynet1@gmail.com) Received: by mail-lb0-x22c.google.com with SMTP id x1so95094784lbj.3 for ; Sat, 05 Mar 2016 13:23:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc; bh=NlNSpn1w8HKcGhBcLxYa3N5NzGMD5pnqIq+ESAFdrAM=; b=FCduHOpwjPDNbvX/PbB3LRQ8GSTab5AWugNDtwVJ5m/Y2xH3qHKllqg0wAh8csTi4p iHeJqV66w5keJqZ7GtTdNxo6tduvxI0rqDanbZk8ZOcDkGQ/NpQoCRU939KsZIUHcDS4 ++PM/e4trR1tQ1xs4GRp6OsY6O2c7I6Etcq759V0khWO8RoqHqmRB2schKtuyYPMArll 3CCzIcrVWxBtpJidpLavOdPT5rPYZp09nx6cfKo/kjmXLtzcenfVW/X0dQ27ji/JIz1M pyda4UsQsKe6nhiQ7Fv62iRIM8cyFs+jgmpLXqUTeyUljaRSB9riovKj3o81DQMKZdAF 2Bcw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc; bh=NlNSpn1w8HKcGhBcLxYa3N5NzGMD5pnqIq+ESAFdrAM=; b=G5/8LtbP55KIMGKPJQ2oivdY0YoAXZ1vcsN2EblGM86Um95RvFkdQreN6PWTaNWZoP hk9NW88HVhxTU06yXHNi/9QGBxKzm2cj1r0tvOKLmo8ytGMzei6q3CbLjGzGZlgGNtDB qg6fXqd/19ousDfCeExE43TPLwPhHfwHzsybmAsOozimtZ6DSvZNbi6RrIc9wf3oU7a2 hZhQz5y8LOV5WI8z9n2Bb0fHelBQVUkRMKbMKXWF0pQxx8DsyKC9IhoumubWzeKrOoeJ 0txFMtthmlAr7s2LKmQqI3hqKF6DJ+vl4xIfDOUTvatNNpOZQzEPUfSZ7rCJnw4cXkhQ X5JQ== X-Gm-Message-State: 
AD7BkJJKWwVxzPD5qdvbq/rPvM5Y821PvmkA+CAkFWB5woLd77yy8fFaB5LJSgDdvKpBDhj7LEw9SDK49EMtfg== MIME-Version: 1.0 X-Received: by 10.112.139.104 with SMTP id qx8mr5203289lbb.88.1457212996710; Sat, 05 Mar 2016 13:23:16 -0800 (PST) Received: by 10.112.44.68 with HTTP; Sat, 5 Mar 2016 13:23:16 -0800 (PST) Received: by 10.112.44.68 with HTTP; Sat, 5 Mar 2016 13:23:16 -0800 (PST) In-Reply-To: <56DB377F.9020205@fsn.hu> References: <56DB377F.9020205@fsn.hu> Date: Sat, 5 Mar 2016 23:23:16 +0200 Message-ID: Subject: Re: zfs and st_nlink limit at 32767 From: Sami Halabi To: "Nagy, Attila" Cc: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.21 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 Mar 2016 21:23:19 -0000 It seems to be an integer limit (32-bit). I guess it would be easy to fix by converting to unsigned int / 64-bit. sami On 5 March 2016 at 9:54 PM, "Nagy, Attila" wrote: > Hi, > > If I create a million hard links to a file, stat -s says it has 32767: > $ stat -s 900402.24.t > st_dev=1709683738 st_ino=719745 st_mode=0100644 st_nlink=32767 st_uid=1001 > st_gid=0 st_rdev=4294967295 st_size=81688 st_atime=1455881393 > st_mtime=1455881393 st_ctime=1457206643 st_birthtime=1457206536 > st_blksize=81920 st_blocks=67 st_flags=2048 > > Is this a limitation somewhere which is hard to remove, or just an easily > fixable "legacy" from the times, when all filesystems contained this limit?
> _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@freebsd.org Sat Mar 5 21:24:56 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 86470A920E8 for ; Sat, 5 Mar 2016 21:24:56 +0000 (UTC) (envelope-from killing@multiplay.co.uk) Received: from mail-wm0-x22e.google.com (mail-wm0-x22e.google.com [IPv6:2a00:1450:400c:c09::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 338BA2FF for ; Sat, 5 Mar 2016 21:24:56 +0000 (UTC) (envelope-from killing@multiplay.co.uk) Received: by mail-wm0-x22e.google.com with SMTP id n186so35773380wmn.1 for ; Sat, 05 Mar 2016 13:24:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=multiplay-co-uk.20150623.gappssmtp.com; s=20150623; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-transfer-encoding; bh=qlJDNlfs9rVdWJIWxqIuEqamECdZPBTYqc5V/IVod9Y=; b=1yUNsAjyvf1fIzoaHGsZjlhkM/AaaPfF2a2B621vYemTmbpLkpLYZu7GyufI8jjPfv KA4XnW5hdrSTMhV1a3ZCMFfN98ZRuXBJvKWMD3R1FnzXxjxmQBqMiHA2hv61p3BsF0YM n20jVTLlhKIpKvy58f4jg8AB/omiQVBAj2FUxpt4z7uJUYMw/zxcypCxAj7F9abNrcrj Za84JZugb/Qi7kQLH4ZxJIfDx0itp7yvAD3Yjz16aj+Cog51NefkMFUfKoGUE1qX0FkW 4/gGhmwiVnxOrB+axiYGcadrCkp8R0TOG6buecgOMhAGZ9486kXq4+XHOa2cefVz8uKV 4CpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding; bh=qlJDNlfs9rVdWJIWxqIuEqamECdZPBTYqc5V/IVod9Y=; 
b=TTt7bAy99FkbAMTUPnwZZyORuOKz7ugez1El9W5UUpT9dTAKheY7Jygt24MuZlliXH kqxq5ItmV+SrPPoNkTupw6fqE8vcJYi+gh7DXU1yUx7OZZGWw+qNydjrANUzSVyW2ExF 9dS0kvWfNMtQHyV3IgM7l0sGddf0wjrfHZE/G4HSOws6qbnutxgIUXXb6o2qsylaHRmY wc+quCEhSeTLBPDLKiiTUXvKU233QdfXy1olftbXPMZCXW2ErMuvIHnX816gsQBPaUAU 3PWx+YQIVEYj61tZhAxjFdiJ9KCAJw3eBy3IA60HvQlByRkrleYUjLVd1ri30hf/XRm3 FX3w== X-Gm-Message-State: AD7BkJL+OMIt5Yt1CJoIhhBoqobSuJQlqjpjlJk7+umasQHO+ssbjNaFWqHJNMarcL8RdNDz X-Received: by 10.194.123.102 with SMTP id lz6mr16884574wjb.2.1457213094536; Sat, 05 Mar 2016 13:24:54 -0800 (PST) Received: from [10.10.1.58] (liv3d.labs.multiplay.co.uk. [82.69.141.171]) by smtp.gmail.com with ESMTPSA id gt7sm9837044wjc.1.2016.03.05.13.24.53 for (version=TLSv1/SSLv3 cipher=OTHER); Sat, 05 Mar 2016 13:24:53 -0800 (PST) Subject: Re: zfs and st_nlink limit at 32767 To: freebsd-fs@freebsd.org References: <56DB377F.9020205@fsn.hu> From: Steven Hartland Message-ID: <56DB4EA7.7050002@multiplay.co.uk> Date: Sat, 5 Mar 2016 21:24:55 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: <56DB377F.9020205@fsn.hu> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 Mar 2016 21:24:56 -0000 Correct: stat's st_nlink is an nlink_t, which is defined as uint16_t; it's not clear why it's clamping at what looks like int16_t max. It looks like the kernel version in nstat is a uint32_t, so internally it should be correct. You may have some joy changing it to uint32_t, but it is likely everything will need rebuilding, and even then there may be some edge cases which break; one that sticks out is Linux compat support, which doesn't use nlink_t.
Regards Steve On 05/03/2016 19:46, Nagy, Attila wrote: > Hi, > > If I create a million hard links to a file, stat -s says it has 32767: > $ stat -s 900402.24.t > st_dev=1709683738 st_ino=719745 st_mode=0100644 st_nlink=32767 > st_uid=1001 st_gid=0 st_rdev=4294967295 st_size=81688 > st_atime=1455881393 st_mtime=1455881393 st_ctime=1457206643 > st_birthtime=1457206536 st_blksize=81920 st_blocks=67 st_flags=2048 > > Is this a limitation somewhere which is hard to remove, or just an > easily fixable "legacy" from the times, when all filesystems contained > this limit? > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@freebsd.org Sat Mar 5 21:44:05 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AA61EA92A23 for ; Sat, 5 Mar 2016 21:44:05 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from dg.fsn.hu (dg.fsn.hu [84.2.225.196]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "dg.fsn.hu", Issuer "dg.fsn.hu" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 6A902EAD for ; Sat, 5 Mar 2016 21:44:05 +0000 (UTC) (envelope-from bra@fsn.hu) Received: by dg.fsn.hu (Postfix, from userid 1003) id 523792F6A; Sat, 5 Mar 2016 22:44:03 +0100 (CET) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR: 11.6696] X-CRM114-CacheID: sfid-20160305_22440_1E5F7734 X-CRM114-Status: Good ( pR: 11.6696 ) X-DSPAM-Result: Whitelisted X-DSPAM-Processed: Sat Mar 5 22:44:03 2016 X-DSPAM-Confidence: 0.9899 X-DSPAM-Probability: 0.0000 X-DSPAM-Signature: 56db5323260621374911608 X-DSPAM-Factors: 27, could, 0.01000, could, 0.01000, but, 
0.01000, but, 0.01000, %e2%80%8f, 0.01000, %e2%80%8f, 0.01000, just, 0.01000, just, 0.01000, st_mode=0100644+st_nlink=32767, 0.01000, st_mode=0100644+st_nlink=32767, 0.01000, by+converting, 0.01000, 32767+$, 0.01000, org/mailman/listinfo/freebsd+fs, 0.01000, org/mailman/listinfo/freebsd+fs, 0.01000, has+32767, 0.01000, has+32767, 0.01000, look+into, 0.01000, look+into, 0.01000, "freebsd, 0.01000, "freebsd, 0.01000, Received*online.co.hu+[195.228.243.99]), 0.01000, or, 0.01000, or, 0.01000, Subject*at, 0.01000, an, 0.01000, an, 0.01000, X-Spambayes-Classification: ham; 0.00 Received: from [IPv6:::1] (japan.t-online.co.hu [195.228.243.99]) by dg.fsn.hu (Postfix) with ESMTPSA id B7AAE2F68; Sat, 5 Mar 2016 22:44:02 +0100 (CET) Subject: Re: zfs and st_nlink limit at 32767 To: Sami Halabi References: <56DB377F.9020205@fsn.hu> Cc: freebsd-fs@freebsd.org From: "Nagy, Attila" Message-ID: <56DB5322.30804@fsn.hu> Date: Sat, 5 Mar 2016 22:44:02 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.21 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 Mar 2016 21:44:05 -0000 You mean a (signed) short. I haven't taken a look at the code; the fix may be anywhere from simple to difficult. UFS's on-disk data structure is where this limit comes from, and of course that must not change, but for other file systems (like ZFS) this could be raised to a ulong in the upper layers. Some conversion would then be needed, which may not be desirable. On 03/05/16 22:23, Sami Halabi wrote: > > it seems an integer limit (32bit).. I guess easily to fix by > converting to unsigned int / 64 bit.
> > sami > > בתאריך 5 במרץ 2016 9:54 PM,‏ "Nagy, Attila" > כתב: > > Hi, > > If I create a million hard links to a file, stat -s says it has 32767: > $ stat -s 900402.24.t > st_dev=1709683738 st_ino=719745 st_mode=0100644 st_nlink=32767 > st_uid=1001 st_gid=0 st_rdev=4294967295 st_size=81688 > st_atime=1455881393 st_mtime=1455881393 st_ctime=1457206643 > st_birthtime=1457206536 st_blksize=81920 st_blocks=67 st_flags=2048 > > Is this a limitation somewhere which is hard to remove, or just an > easily fixable "legacy" from the times, when all filesystems > contained this limit? > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to > "freebsd-fs-unsubscribe@freebsd.org > " > From owner-freebsd-fs@freebsd.org Sat Mar 5 23:16:31 2016 Return-Path: Delivered-To: freebsd-fs@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8E811A13A9B for ; Sat, 5 Mar 2016 23:16:31 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (unknown [IPv6:2602:304:b010:ef20::f2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "gw.catspoiler.org", Issuer "gw.catspoiler.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 77607AD5 for ; Sat, 5 Mar 2016 23:16:31 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.15.2/8.15.2) with ESMTP id u25NGOaT079417; Sat, 5 Mar 2016 15:16:27 -0800 (PST) (envelope-from truckman@FreeBSD.org) Message-Id: <201603052316.u25NGOaT079417@gw.catspoiler.org> Date: Sat, 5 Mar 2016 15:16:23 -0800 (PST) From: Don Lewis Subject: Re: zfs and st_nlink limit at 32767 To: killing@multiplay.co.uk cc: freebsd-fs@freebsd.org In-Reply-To: <56DB4EA7.7050002@multiplay.co.uk> MIME-Version: 1.0 
Content-Type: TEXT/plain; charset=us-ascii X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 Mar 2016 23:16:31 -0000 On 5 Mar, Steven Hartland wrote: > Correct stat st_nlink is a nlink_t which is defined as uint16_t, its not > clear why its clamping at what looks like int16_t max. > > It looks like the kernel version in nstat is a uint32_t so internally it > should be correct. > > You may have some joy changing it to uint32_t but is likely everything > will rebuilding and even then there may be some edge cases which break > one that sticks out is linux compat support which doesn't use nlink_t. Yeah, changing it would change the stat() ABI, so you would have to recompile everything that calls stat(). Also the syscall would have to be versioned so that executables built on previous FreeBSD versions would still get the old version of struct stat. Something else to look out for is archive formats. It's possible that nlinks is embedded in them. Breaking the ability to read your old backup tapes would be a real bummer.
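The type-width question raised in this thread can be checked directly on a given system. The sketch below is illustrative only: `max_for_width()`, `nlink_width()` and `saturate_links()` are hypothetical helpers, not FreeBSD or ZFS functions. In particular, `saturate_links()` merely shows how a wider internal counter would appear if it were capped at a fixed ceiling before being reported; whether the kernel actually does this, and where, is not established in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Largest value an unsigned field of `bytes` bytes can hold. */
uint64_t
max_for_width(size_t bytes)
{
	if (bytes >= sizeof(uint64_t))
		return (UINT64_MAX);
	return ((UINT64_C(1) << (bytes * 8)) - 1);
}

/* Width in bytes of the userland link-count type on the build platform. */
size_t
nlink_width(void)
{
	return (sizeof(nlink_t));
}

/*
 * Hypothetical saturating conversion: report `links` unless it exceeds
 * `ceiling`, in which case report the ceiling. With ceiling == INT16_MAX
 * a million links would be reported as 32767.
 */
uint64_t
saturate_links(uint64_t links, uint64_t ceiling)
{
	return (links > ceiling ? ceiling : links);
}
```

If `nlink_width()` returns 2, `max_for_width(2)` gives 65535 as the most an unsigned 16-bit field could report, which is why a ceiling of 32767 looks like a signed limit rather than a limit of the type itself. Widening the type changes `sizeof(struct stat)`, which is the ABI concern described above.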