Date: Wed, 21 Jan 2009 20:30:58 +0100
From: Matthias Schuendehuette <msch@snafu.de>
To: freebsd-net@freebsd.org
Subject: NFS-Locking problem with 6.4/7.1-RELEASE
Message-ID: <5523A7AD-6A64-4A40-B46F-8208BEA87C0A@snafu.de>

Hi,

one of our FreeBSD servers is acting as NFS server for $HOME for approx. 50 HP-UX workstations, since the workstations themselves and the disks in them have become quite old in the meantime.

That works quite well with FreeBSD 6.3-RELEASE-pxx but doesn't work with 6.4/7.1 any more.

I looked at the problem with 'wireshark' and it seems to be a locking problem, probably related to PR 'kern/130628', but I'm not sure.

Here is what I know so far:

Server-OS:      FreeBSD 6.4-RELEASE/7.1-RELEASE (same problems)
Workstation-OS: HP-UX 11iv1 (11.11)
NFS-Version:    V3/tcp or V3/udp (NFS-V2 works!)

I found no records of the problem on the client side (HP-UX), whereas on FreeBSD 'rpc.lockd -d 3' produces the following entries in /var/log/messages:

Jan 21 12:07:33 bsd1dw kernel: NLM: new host hp13 (sysid 5)
Jan 21 12:07:33 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:07:53 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:13 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:32 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:33 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:43 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:08:53 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:03 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:13 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:13 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:23 bsd1dw kernel: nlm_do_lock(): caller_name = hp13 (sysid = 5)
Jan 21 12:09:33 bsd1dw kernel: nlm_do_cancel(): caller_name = hp13 (sysid = 5)

What happens is as follows:

When logging in to an account with the home directory on the NFS server, the shell reads '.profile' and then tries to get a lock on '.sh_history'. From a FreeBSD 6.3 server the shell gets the lock, whereas a 6.4/7.1 server replies with "V4 LOCK_RES Call NLM_FAILED". Of course the HP-UX shell assumes the file is already locked, waits some time and tries again. This cycle leads to a complete lockup of the account... :-(

This does not happen if command-line history is disabled, but it is an error nonetheless.

I have recorded the network traffic for an NFSv2 session, an NFSv3/tcp session with a 6.3 server and an NFSv3/tcp session with a 7-STABLE server. If the wireshark dumps are of interest beyond what I have described here, they are available on request.

I hope my information helps those who are able to fix it...

Matthias

-- 
Ciao/BSD - Matthias

Matthias Schuendehuette <msch [at] snafu.de>, Berlin (Germany)
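
The lock request described above (the shell trying to lock '.sh_history' after reading '.profile') could probably be reproduced from any NFS client without a full login. The following is only a minimal sketch, assuming the shell takes a POSIX fcntl-style advisory lock on the history file; the function name try_lock and the path in the usage note are hypothetical and not part of the original report.

```python
import fcntl
import os


def try_lock(path):
    """Attempt an exclusive, non-blocking advisory lock on `path`.

    Returns True if the lock was granted (it is released again
    immediately), False if the lock request was refused.
    """
    # Create the file if needed, as a shell would for its history file.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            # On an NFSv3 mount with lockd running, this request is
            # forwarded to the server's lock manager (NLM).
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Lock denied or lock manager unreachable.
            return False
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)
```

Calling try_lock("/net/home/someuser/.sh_history") (a hypothetical path on the NFS mount) while 'rpc.lockd -d 3' is running on the server should show whether a single lock request is granted or refused. A False result only says the request failed; the wireshark trace is still needed to see the actual NLM reply.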