Date:      Tue, 1 Jul 1997 13:23:59 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        lederer@bonn-online.com (Sebastian Lederer)
Cc:        terry@lambert.org, freebsd-hackers@FreeBSD.ORG
Subject:   Re: NFS locking, was: Re: NFS V3 is it stable?
Message-ID:  <199707012024.NAA27875@phaeton.artisoft.com>
In-Reply-To: <33B95B56.41C67EA6@bonn-online.com> from "Sebastian Lederer" at Jul 1, 97 09:32:38 pm
> Let me see if I understand all issues correctly:
>
> > 1) The POSIX semantics make it difficult for rpc.lockd
> > to have only one file handle per file regardless of
> > the number of clients with the file open. This is
> >[...]
>
> So the rpc.lockd (on the server) would have to keep a list of all active
> locks on a file and only close the file when all locks are cleared.

No. The rpc.lockd could not convert a handle into an fd, and
subsequently close the fd, until all fd's for that file had been
closed by all clients.

In general, you want to:

1)	Convert handle to fd
2)	fstat( fd, ...)
3)	Compare st_ino and st_dev to see if it's already open
4)	If it is, add a reference count to the already open struct
	and close the new fd

Step #4 destroys all locks with the new POSIX close semantics (there
is a short sketch of this bookkeeping further down).

See, you are assuming, incorrectly, that each FS will generate file
handles using the same schema. File handles are, in fact, FS-specific
opaque data. If you don't believe me, look at the ISOFS handle code.

> > 2) The assertion of a lock can not immediately result in
> > a coelesce if the operation may be backed out. But
> >[...]
>
> This should only affect the client, since if the lock fails on the
> server, nothing happens there, only a nlm_denied rpc is sent back. Then
> the client has to deal with the mess, because he had already set the
> lock locally.

Yes. Unless you are not using a veto interface (ie: if you are using
the current locking code). If you do not use a veto interface, you
could fail to assert the lock locally after asserting the lock
remotely. In that case, now it's a server problem. Either case only
occurs when there is client locking, though.

By delaying the coalesce, and doing it separately from the assert
(ie: having a "commit" phase), you resolve the race window (also
sketched further down).

> > allow the client to recover lock state in the event of
> > a transient server failure (ie: the server is rebooted,
> > etc.).
>
> For lock recovery, the lockd on the client would also keep a list of all
> active locks, and, in case of a server crash, would be notified by the
> rpc.statd and reissue all lock requests. If a lock request can't be
> reissued, the lockd should send a SIGLOST signal to the involved
> processes.
> Correct ?

No. NFS client locks are not forwarded through a local lockd; they
are reissued as a result of an rpc.statd server notification.

If the client crashes, the lock state on the server is destroyed. If
the server crashes, the clients have an idea of the lock state at the
time of the crash, and it must be restored.

> > 4) So that server locking works on all file systems, the
> > lock list must be hung off the vnode instead of the
> > inode; one consequence of this is that it drastically
>
> And I thought that this was already the case...

No. Look at the code. It is hung off the inode.
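To make the handle-to-fd bookkeeping in steps 1 through 4 above
concrete, here is a minimal user space sketch of it. The names
(struct openfile, openfile_ref) are made up for illustration, not
taken from any existing rpc.lockd, and the descriptor is assumed to
have already come out of the still-missing handle conversion call:

#include <sys/types.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <unistd.h>

struct openfile {
	dev_t		 of_dev;	/* st_dev of the underlying file */
	ino_t		 of_ino;	/* st_ino of the underlying file */
	int		 of_fd;		/* the single fd kept open */
	int		 of_refs;	/* client references to this file */
	struct openfile	*of_next;
};

static struct openfile *openlist;

/*
 * "newfd" is assumed to have come from the (not yet existing) handle
 * to fd conversion call. If the file is already on the list, bump
 * the reference count and close the duplicate descriptor -- and that
 * close is exactly where the new POSIX semantics throw away every
 * lock this process holds on the file.
 */
struct openfile *
openfile_ref(int newfd)
{
	struct stat sb;
	struct openfile *of;

	if (fstat(newfd, &sb) == -1)
		return (NULL);

	for (of = openlist; of != NULL; of = of->of_next) {
		if (of->of_dev == sb.st_dev && of->of_ino == sb.st_ino) {
			of->of_refs++;
			close(newfd);		/* step #4: locks are gone */
			return (of);
		}
	}

	/* first reference to this file: remember the fd we keep */
	if ((of = malloc(sizeof(*of))) == NULL)
		return (NULL);
	of->of_dev = sb.st_dev;
	of->of_ino = sb.st_ino;
	of->of_fd = newfd;
	of->of_refs = 1;
	of->of_next = openlist;
	openlist = of;
	return (of);
}

The reference count, not the lock list, is what decides when the one
real descriptor can finally be closed.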
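Here, likewise, is a stripped-down sketch of the delayed-coalesce
idea: the assert phase only records the range so a denial can be
backed out cheaply, and merging happens in a separate commit phase.
The structure and function names are again illustrative, and lock
type and owner bookkeeping are left out entirely:

#include <sys/types.h>
#include <stdlib.h>

struct lockrange {
	off_t		  lr_start;
	off_t		  lr_end;	/* inclusive */
	int		  lr_pending;	/* asserted, not yet committed */
	struct lockrange *lr_next;
};

/*
 * Phase 1: record the lock, but do not merge it with its neighbours,
 * so it can still be backed out cheaply if the assert is denied.
 */
struct lockrange *
lock_assert(struct lockrange **head, off_t start, off_t end)
{
	struct lockrange *lr;

	if ((lr = malloc(sizeof(*lr))) == NULL)
		return (NULL);
	lr->lr_start = start;
	lr->lr_end = end;
	lr->lr_pending = 1;
	lr->lr_next = *head;
	*head = lr;
	return (lr);
}

/* Failure path: the un-coalesced entry is simply unlinked again. */
void
lock_backout(struct lockrange **head, struct lockrange *lr)
{
	struct lockrange **p;

	for (p = head; *p != NULL; p = &(*p)->lr_next)
		if (*p == lr) {
			*p = lr->lr_next;
			free(lr);
			return;
		}
}

/*
 * Phase 2, the "commit": only now fold the new range into adjacent
 * or overlapping committed ranges. Coalescing here is safe because
 * nothing can force a back-out any more.
 */
void
lock_commit(struct lockrange **head, struct lockrange *lr)
{
	struct lockrange **p, *other;

	lr->lr_pending = 0;
	for (p = head; (other = *p) != NULL; ) {
		if (other != lr && !other->lr_pending &&
		    other->lr_end + 1 >= lr->lr_start &&
		    other->lr_start <= lr->lr_end + 1) {
			if (other->lr_start < lr->lr_start)
				lr->lr_start = other->lr_start;
			if (other->lr_end > lr->lr_end)
				lr->lr_end = other->lr_end;
			*p = other->lr_next;
			free(other);
		} else
			p = &other->lr_next;
	}
}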
> > Doug Rabson has the kernel patches for everything, minus the handle
> > conversion call, and minus the POSIX semantic override. There *IS*
> > a bug in the namei() code, which I was able to test everywhere but
> > the NFS client (I only have one FreeBSD box at this location). If
> > you are interested in helping locate this bug, I can send you a test
> > framework for kernel memory leak detection, and my test set for
> > the namei() buffers, specifically.
>
> Sure, go ahead. I don't have a SUN or something like that here, however,
> for testing with a "real" rpc.lockd. I do have two FreeBSD machines
> connected via ethernet, so I can do some non-local testing.

The testing in this case is collateral bugs in the namei() code
changes, which I didn't separate from the rpc.lockd changes.

If I can resolve the namei() leaks, Doug will commit the code to the
FreeBSD source repository, and it will show up on the next CVSup.

Are you running -current? I can probably make a set of diffs against
a -current source tree this coming weekend (sorry, that's the fastest
I can get to it; I'm currently job-hunting).

> As I have already pointed out, I would be willing to invest some time in
> implementing the rpc.lockd. The main problems (from my point of view)
> are:
>
> Details of the locking protocol:
>
> * How are blocking locks implemented?

Blocking locks are rather trivial, compared to some of the other
issues; I'd be happy to discuss strategy on them with you once you
get going.

> * On which side are the locks coalesced? On the client's or
>   on the server's rpc.lockd ?

The locks are coalesced on the client in the lock code, and on the
server in the lock code. The rpc.lockd only proxies the locking
calls.

One problem here: FreeBSD does not have NFS client locking code.
Andrew (the guy who wrote the rpc.lockd) has the ISO standard, and is
probably a good reference on that. Without a working client, you
won't be able to test the server.

> * What is the cookie in the nlm_lockargs struct ?
>   (probably used by the client to identify the
>   result messages)
> * What is the file handle in the nlm_lock struct ?
>   (seems to be device/inode/generation number)

Andrew knows both of these for sure. I don't, or I would have made
the handle conversion call already. The answer on the handle is
"it's opaque data; you are not allowed to interpret it as anything
other than a hash key, if that".

> * What is the owner handle in the nlm_lock struct ?
>   (IP Address of the client? process id ?)

There is a 32 bit system ID and a process ID for that system (there
is a small sketch of this below). This may be problematic at some
later time, if threads are supposed to be able to lock against each
other. 8-(.

> Converting the nfs file handle into an open file: This seems to me the
> most important point for the lockd implementation. Without this,
> I can't actually lock the file.

Yes, this is the conversion call. I need to know the wire handle
format so I can change it into the FS-specific parameter in the
kernel when it is passed to the fcntl() for the conversion (see the
second sketch below). This is Andrew's ball park here, since he has
the docs.

> Client side locking: The lock requests must somehow be communicated
> from the kernel to local lockd, which then forwards it
> to the server's lockd.

No. The calls are rpc'ed from the NFS FS module, which is the NFS
client. The only question here is how the connection state is
established and maintained by rpc.statd.

> If I know all these details, it should be possible for me to
> complete the rpc.lockd implementation.

Yes.

> So, if anybody has any knowledge on these issues, please contact
> me. It would be greatly appreciated. And of course, if someone else
> also wants to work on this, you are welcome.

We need the information from Andrew to proceed on the handle
conversion, and on the client code.

I suspect it will be a much better idea to get an interoperable
server before going after the client code. The only caveat here is
that the architecture of the kernel code must not preclude a later
client implementation.
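As a picture of the owner handle point above, something along these
lines; this is purely illustrative, not the real NLM XDR definition,
and the field names are invented:

#include <sys/types.h>

/*
 * A lock owner is a per-client system id plus a process id that only
 * means something on that client; two requests collide only when
 * both fields differ from the holder's.
 */
struct lockowner {
	u_int32_t	lo_sysid;	/* which client system */
	pid_t		lo_pid;		/* process on that system */
};

int
same_owner(const struct lockowner *a, const struct lockowner *b)
{
	return (a->lo_sysid == b->lo_sysid && a->lo_pid == b->lo_pid);
}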
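And for the conversion question: once a descriptor exists, the server
side can pin a granted lock down with an ordinary POSIX byte range
lock. The conversion call itself is still the missing piece, so this
sketch simply assumes the fd has already been produced by it:

#include <sys/types.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * "fd" is assumed to come from the (still missing) handle to fd
 * conversion call discussed above.
 */
int
nlm_lock_local(int fd, off_t start, off_t len, int exclusive)
{
	struct flock fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = exclusive ? F_WRLCK : F_RDLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = start;
	fl.l_len = len;		/* 0 means "to end of file" */

	/* F_SETLK, not F_SETLKW: a denial can become an nlm_denied reply */
	return (fcntl(fd, F_SETLK, &fl));
}

Using F_SETLK rather than F_SETLKW is what lets a denial turn into an
nlm_denied reply instead of stalling the daemon.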
> It may still possible that we end up with an at least basically
> working nfs locking implementation :-)

It should be pretty easy, in fact.

The hardest part, which is the adjustment to the kernel code, has
already been done (twice, now).


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.