Date:      Sat, 16 Dec 2000 09:57:38 -0800
From:      Don Coleman <don@coleman.org>
To:        "David E. Cross" <crossd@cs.rpi.edu>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: rpc.lockd and true NFS locks? 
Message-ID:  <200012161757.JAA22961@eozoon.coleman.org>
In-Reply-To: Your message of "Thu, 14 Dec 2000 19:40:48 EST."

David,

I wrote the NFS lockd code for BSD/OS (it's based on some user land
stuff Keith Bostic did, and then Kirk McKusick helped clean up my
basic design and the VFS layering for the server/kernel side).

It has passed the Connectathon tests and has been used by BSD/OS
customers for a while, but mostly for locking mail files (why they
don't use POP or IMAP, I've never been able to figure out) and for
locking files being edited by vi over an NFS export.
I think we've had one bug report/fix, at least that got
back to me ... it's been in BSD/OS for about two years now, but given
the lack of bug reports, I doubt its limits are being pushed.

The main feature the BSD/OS lockd code is missing is the client side of
server-side recovery... BSD/OS never crashes ;->, so our clients have never
reported this problem (we mention it in our man page).

If the lockd server crashes/reboots, we do go through a grace period,
and we notify the clients that they need to re-establish their locks, but
our client side doesn't track the current lock states (even though the
client kernel has that complete information, the user-mode lock daemon on
the client side doesn't keep a copy), so it cannot reclaim them.  A BSD/OS
server with non-BSD/OS clients is therefore fully functional.

This problem isn't hard to fix... it's a two-step fix, all in user land.

First, make the client-side user-level lockd a single process (under BSD/OS,
it is two processes)... the problem is that each process has only a piece of
the responsibility/knowledge you need to re-establish locks on your
server after it crashes.

Second, the client-side user-level lockd needs to be able to figure out which
lockd locks the client as a whole has been given...  the lockd client needs
to keep its own idea of which locks the client still holds (just keep it in
memory, ordered by server, simple and quick).
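
As a rough illustration of that second step, here is a minimal sketch of an
in-memory table of held locks, kept sorted by server name.  This is not the
BSD/OS code: the struct layout, the held_lock_* names, SERVER_MAX, and the
use of a plain integer in place of an NFS file handle are all assumptions
made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SERVER_MAX 64	/* hypothetical limit on server name length */

/* One lock this client believes it holds (all fields are illustrative). */
struct held_lock {
	struct held_lock *next;
	char server[SERVER_MAX];
	unsigned long file_id;	/* stand-in for an NFS file handle */
	long long offset;
	long long length;
	int exclusive;
};

static struct held_lock *held_locks;	/* kept sorted by server name */

/* Record a granted lock; returns 0 on success, -1 on failure. */
int
held_lock_add(const char *server, unsigned long file_id,
    long long offset, long long length, int exclusive)
{
	struct held_lock **pp, *l;

	assert(server != NULL);
	if (strlen(server) >= SERVER_MAX)
		return (-1);
	if ((l = calloc(1, sizeof(*l))) == NULL)
		return (-1);
	strcpy(l->server, server);
	l->file_id = file_id;
	l->offset = offset;
	l->length = length;
	l->exclusive = exclusive;

	/* Insert before the first entry whose server sorts after ours. */
	for (pp = &held_locks; *pp != NULL; pp = &(*pp)->next)
		if (strcmp((*pp)->server, server) > 0)
			break;
	l->next = *pp;
	*pp = l;
	return (0);
}

/* Forget a lock after it is released; returns -1 if it was not found. */
int
held_lock_remove(const char *server, unsigned long file_id,
    long long offset, long long length)
{
	struct held_lock **pp, *l;

	for (pp = &held_locks; (l = *pp) != NULL; pp = &l->next) {
		if (strcmp(l->server, server) == 0 && l->file_id == file_id &&
		    l->offset == offset && l->length == length) {
			*pp = l->next;
			free(l);
			return (0);
		}
	}
	return (-1);
}

/*
 * Call fn for every lock held on one server, e.g. to re-establish the
 * locks after that server reboots.  Returns how many calls returned 0.
 */
int
held_lock_foreach(const char *server,
    int (*fn)(const struct held_lock *, void *), void *arg)
{
	const struct held_lock *l;
	int c, n = 0;

	for (l = held_locks; l != NULL; l = l->next) {
		c = strcmp(l->server, server);
		if (c < 0)
			continue;
		if (c > 0)
			break;		/* sorted, so no more matches */
		if (fn(l, arg) == 0)
			n++;
	}
	return (n);
}

static int
count_one(const struct held_lock *l, void *arg)
{
	(void)l;
	(void)arg;
	return (0);
}

/* Number of locks currently recorded for one server. */
int
held_lock_count(const char *server)
{
	return (held_lock_foreach(server, count_one, NULL));
}
```

When a server announces that it has rebooted, the daemon would call
held_lock_foreach() with a callback that sends the reclaim request for each
entry during the server's grace period.  Because the list is sorted by
server, the walk can stop at the first entry past the server in question.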

I already have the first part done, it just hasn't been fully tested, and
since BSD/OS is customer driven, and no customers have been pushing us,
I haven't committed it.  Anyone who wants it, let me know, and I'll give it
to you.

There are also some improvements I'd suggest as you merge the code (a
pair of fresh eyes and fingers is always an opportunity!)

1) we use a FIFO to pass data from the kernel to the user land process.
   we should at least use a socket...

2) we use a private field in the proc structure, even though it is only
   used by the lockd process... this was so we could clean up all the
   server side data if the user level process of the server side lockd
   crashed or was killed (Solaris admins kill lockd all the time).  Using
   a kernel level on_exit(), or perhaps making a lockd vfs (so a lockd_close()
   would be called when the lockd process exits), are both possibilities.
   The general idea is that if you kill and restart the lockd process, it
   should behave just as if the server was rebooted.

And of course there is the little stuff: we don't back off and retry
as well as we should, errors can be dropped, and things like that.
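
For the back-off and retry point, a capped exponential back-off is one
common approach.  The sketch below is only an illustration of that idea and
not what BSD/OS lockd does; backoff_delay(), retry_with_backoff(), and the
1-second base and 30-second cap are assumptions chosen for the example.

```c
#include <unistd.h>

/*
 * Delay in seconds before retry number `attempt` (0-based): the base
 * delay doubled once per earlier attempt, never exceeding `cap`.
 */
unsigned
backoff_delay(unsigned attempt, unsigned base, unsigned cap)
{
	unsigned d = base;

	while (attempt-- > 0 && d < cap) {
		if (d > cap / 2) {	/* doubling would pass the cap */
			d = cap;
			break;
		}
		d *= 2;
	}
	return (d < cap ? d : cap);
}

/*
 * Call op until it returns 0 or max_tries attempts have been made,
 * sleeping between attempts.  The last error is returned to the
 * caller rather than being dropped.
 */
int
retry_with_backoff(int (*op)(void *), void *arg, unsigned max_tries)
{
	unsigned i;
	int rc = -1;

	for (i = 0; i < max_tries; i++) {
		if ((rc = op(arg)) == 0)
			return (0);
		if (i + 1 < max_tries)
			sleep(backoff_delay(i, 1, 30));
	}
	return (rc);
}
```

With a base of 1 and a cap of 30, the delays run 1, 2, 4, 8, 16, 30, 30, ...
seconds.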

I'd be happy to answer any questions, and help in any way I can.

don

ps: I'll be travelling in Southeast Asia (Bangkok, Myanmar, Bali)
from Jan 9 to Feb 15, so I will not be able to help during that time...







