Date: Fri, 22 Jan 2010 22:37:49 +0200
From: Mikolaj Golub <to.my.trociny@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org
Subject: Re: FreeBSD NFS client/Linux NFS server issue
Message-ID: <86my05x4de.fsf@kopusha.onet>
In-Reply-To: <Pine.GSO.4.63.1001221400590.29868@muncher.cs.uoguelph.ca> (Rick Macklem's message of "Fri, 22 Jan 2010 14:37:48 -0500 (EST)")
References: <86ocl272mb.fsf@kopusha.onet> <86tyuqnz9x.fsf@zhuzha.ua1> <86zl4awmon.fsf@zhuzha.ua1> <86vdeywmha.fsf@zhuzha.ua1> <86vdeuuo2y.fsf@zhuzha.ua1> <Pine.GSO.4.63.1001221400590.29868@muncher.cs.uoguelph.ca>

On Fri, 22 Jan 2010 14:37:48 -0500 (EST) Rick Macklem wrote:

>> --- nfs_bio.c.orig      2010-01-22 15:38:02.000000000 +0000
>> +++ nfs_bio.c   2010-01-22 15:39:58.000000000 +0000
>> @@ -1385,7 +1385,7 @@ again:
>>          */
>>         if (!gotiod) {
>>                 iod = nfs_nfsiodnew();
>> -               if (iod != -1)
>> +               if ((iod != -1) && (nfs_iodwant[iod] == NULL))
>>                         gotiod = TRUE;
>>         }
>>
>
> Unfortunately, I don't think the above fixes the problem.
> If another thread that called nfs_asyncio() has "stolen" this "iod",
> it will have set nfs_iodwant[iod] == NULL (set non-NULL at #238)
> and it will remain NULL until the other thread is done with it.

I see. I missed this. Thanks.

>
> There should probably be some sort of 3 way handshake between
> the code in nfs_asyncio() after calling nfs_nfsiodnew() and the
> code near the beginning of nfssvc_iod(), but I think the following
> somewhat cheesy fix might do the trick:
>
>       if (!gotiod) {
>               iod = nfs_nfsiodnew();
>               if (iod != -1) {
>                       if (nfs_iodwant[iod] == NULL) {
>                               /*
>                                * Either another thread has acquired this
>                                * iod or I acquired the nfs_iod_mtx mutex
>                                * before the new iod thread did in
>                                * nfssvc_iod(). To be safe, go back and
>                                * try again after allowing another thread
>                                * to acquire the nfs_iod_mtx mutex.
>                                */
>                               mtx_unlock(&nfs_iod_mtx);
>                               /*
>                                * So long as mtx_lock() implements some
>                                * sort of fairness, nfssvc_iod() should
>                                * get nfs_iod_mtx here and set
>                                * nfs_iodwant[iod] != NULL for the case
>                                * where the iod has not been "stolen" by
>                                * another thread for a different mount
>                                * point.
>                                */
>                               mtx_lock(&nfs_iod_mtx);
>                               goto again;
>                       }
>                       gotiod = TRUE;
>               }
>       }
>
> Does anyone else have a better solution?
> (Mikolaj, could you by any chance test this? You can test yours, but I
> think it breaks.)

Unfortunately we observed this only on our production servers.
A week ago we made some configuration changes as a workaround: we
reconfigured cron not to run the scripts simultaneously, and added cron
scripts that periodically write a line to a file on the NFS share (to
"unlock" it if it is locked). We have not observed any problems since then,
and we would not like to experiment in production. If I manage to produce a
good test case in a test environment, I will be able to test the patch, but
I am not sure...

-- 
Mikolaj Golub
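
For readers following the thread, here is a minimal user-space sketch of the unlock/relock retry idea quoted above. It is an analogy only: POSIX threads stand in for the kernel mutex and the iod thread, and the names iod_mtx, iod_want, iod_marker, worker_start and claim_new_worker are hypothetical, not taken from nfs_bio.c. Unlike the quoted kernel fix, it loops with sched_yield() instead of relying on mutex fairness, since pthread mutexes do not promise any.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins: iod_mtx for nfs_iod_mtx, iod_want for one nfs_iodwant[] slot. */
static pthread_mutex_t iod_mtx = PTHREAD_MUTEX_INITIALIZER;
static void *iod_want = NULL;
static int iod_marker;          /* any non-NULL address marks the worker as idle */

/* New worker: announces that it is idle by setting the slot non-NULL under the lock. */
static void *worker_start(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&iod_mtx);
    iod_want = &iod_marker;
    pthread_mutex_unlock(&iod_mtx);
    return NULL;
}

/*
 * Dispatcher: creates the worker while holding the lock, then refuses to
 * claim it until the worker has registered itself. Returns the number of
 * retries needed, or -1 if the thread could not be created.
 */
int claim_new_worker(void)
{
    pthread_t tid;
    int retries = 0;

    pthread_mutex_lock(&iod_mtx);
    iod_want = NULL;
    if (pthread_create(&tid, NULL, worker_start, NULL) != 0) {
        pthread_mutex_unlock(&iod_mtx);
        return -1;
    }
    while (iod_want == NULL) {
        /* Slot still NULL: drop the lock so the worker can run, then recheck. */
        pthread_mutex_unlock(&iod_mtx);
        sched_yield();
        pthread_mutex_lock(&iod_mtx);
        retries++;
    }
    iod_want = NULL;            /* claim the worker: the slot goes back to NULL */
    pthread_mutex_unlock(&iod_mtx);
    pthread_join(tid, NULL);
    return retries;
}
```

Because the dispatcher holds the lock while it creates the worker, the first check always sees NULL, so at least one unlock/relock round is needed. That is the same window the quoted comment describes for nfssvc_iod(). The sketch has only one dispatcher, so it does not model the case where another thread "steals" the iod for a different mount point.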