FreeBSD Mail Archives

Date:      Tue, 4 May 1999 11:42:56 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Doug Rabson <dfr@nlsystems.com>, Kevin Day <toasty@home.dragondata.com>, jso@research.att.com
Cc:        hackers@freebsd.org
Subject:   Re: kern/11470: V3 NFS problem (fwd)
Message-ID:  <199905041842.LAA18532@apollo.backplane.com>
References:   <Pine.BSF.4.05.9905040943490.637-100000@herring.nlsystems.com>

I'm going to move this to -hackers to allow people to pool their interest
in fixing the remaining NFS problems. I am also CCing the parties
involved.

Regarding kern/11470: this bug report is against FreeBSD-3.1. I haven't
had rm -rf problems myself, and a huge number of NFS bugs have been fixed
in FreeBSD-stable (3.x) since the 3.1 release.

I would ask jso@research.att.com to update his 3.1 machines to the latest
FreeBSD-stable and repeat his tests.

I will comment on Kevin's bug report below next to his itemized list.

-Matt
Matthew Dillon
<dillon@backplane.com>

:Did you see this one? It is almost certainly caused by the client being
:confused when we invalidate their directory cookies. The BSD nfs server is
:pretty fascist about cookies and invalidates them too often. Can you think
:of a better scheme than the directory timestamp for cookies?
:
:--
:Doug Rabson Mail: dfr@nlsystems.com
:Nonlinear Systems Ltd. Phone: +44 181 442 9037
:
:
:---------- Forwarded message ----------
:Date: Mon, 3 May 1999 14:23:47 -0700 (PDT)
:From: jso@research.att.com
:To: freebsd-gnats-submit@freebsd.org
:Subject: kern/11470: V3 NFS problem
:
:>Number: 11470
:>Category: kern
:>Synopsis: V3 NFS problem
:>Confidential: no
:>Severity: critical
:>Priority: high
:>Responsible: freebsd-bugs
:>State: open
:>Quarter:
:>Keywords:
:>Date-Required:
:>Class: sw-bug
:>Submitter-Id: current-users
:>Arrival-Date: Mon May 3 14:30:00 PDT 1999
:>Closed-Date:
:>Last-Modified:
:>Originator: Jerry So
:>Release: 3.1
:>Organization:
:AT&T Labs-Research
:>Environment:
:FreeBSD spaceless 3.1-RELEASE FreeBSD 3.1-RELEASE #0: Tue Apr 20 17:57:03 EDT 1999 root@spaceless:/usr/src/sys/compile/SPACELESS i386
:
:>Description:
:NFS client is solaris 2.6 or irix 6.4
:NFS server is freebsd 3.1
:
:For example:
:When doing a rm -Rf gcc-2.8.1 on NFS client
:rm: Unable to remove directory gcc-2.8.1/config/i386: File exists
:rm: Unable to remove directory gcc-2.8.1/config/m68k: File exists
:rm: Unable to remove directory gcc-2.8.1/config: File exists
:rm: Unable to remove directory gcc-2.8.1: File exists
:
:resulted.
:
:Only NFS v3 is having problem. Machines with V2 is OK.
:
:>How-To-Repeat:
:
:Repeat any time
:>Fix:
:
:
:>Release-Note:
:>Audit-Trail:
:>Unformatted:
:
:
:To Unsubscribe: send mail to majordomo@FreeBSD.org
:with "unsubscribe freebsd-bugs" in the body of the message
:
:

:From: Kevin Day <toasty@home.dragondata.com>
:
:Ok, I've been playing with your last patches (just before they were
:committed).
:
:I still see at least three outstanding things. :)
:
:
:1) if I 'sysctl -w vfs.nfs.async=1' on the server, the client will
:eventually get deadlocked, with most processes stuck in 'nfsrcvlk' or
:'nfsinval'(i think)

Yes, I'm sure there are still a couple of lockup situations that
we need to fix in this area. I need to know whether this is via
NFSV2 or NFSV3 and whether this is a UDP or TCP mount. And, if it is
a TCP mount, whether the problem also occurs with a UDP mount. A similar
situation occurred with TCP when I was doing makes; it turned out to be
a data corruption bug related to multiple RPCs winding up in the same
mbuf.

Note: If your *SERVER* is not running the latest -current, you have to
upgrade it. The TCP fix (for a server-side bug) has NOT yet been
committed to FreeBSD-stable, so a server running FreeBSD-stable does not
have it.

:2) If I set a cpu time limit for a process, and the executable file is being
:ran over NFS, if it exceeds the CPU limit, i get flooded with "vm_fault: pager
:error"'s

This is definitely a bug. I'll bet you are using an 'intr' or 'soft'
mount, yes? There are still some serious bugs with 'intr' mounts
interacting badly with the VM system, but they should be relatively easy
to fix.

:3) See PR 7728. NFS server is also a web server, dumping logs into user's
:home directories. Our FTP server is an NFS client. When clients try to
:download their log files, the ftpd process gets stuck (kill -9 won't kill
:it). This also happens when they try to upload over top of a file they just
:viewed on the web server.
:
:Processes seem to get stuck in 'sbwait' (which really doesn't seem like it's
:stuck), or 'nfsrcv'

What is occurring is that existing VM cache pages are being ripped out from
under the client and the client is getting confused. I'll need to work
up a reliable way to reproduce the problem between a client and server
in order to squash it. If someone else can come up with a simple script
to run on the client & the server that reproduces the problem, we will
be able to squash it more quickly.
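
A minimal sketch of what such a script might look like, assuming the
trigger is a file that the server keeps appending to while the client
reads it over NFS (as with the web logs in item 3). The function names,
the record format, and the idea that this workload reproduces the hang
are all assumptions for illustration; it has not been confirmed to
trigger the bug.

```python
import os
import time


def append_records(path, count, delay=0.0):
    """Server side (hypothetical helper): append fixed-format lines to a
    local file, roughly the way a web server appends to a log."""
    with open(path, "a") as f:
        for i in range(count):
            f.write("record %08d\n" % i)   # always 15 bytes plus newline
            f.flush()
            os.fsync(f.fileno())           # push each record to disk
            if delay:
                time.sleep(delay)


def read_records(path):
    """Client side (hypothetical helper): read the whole file through the
    NFS mount, as ftpd would for a download.

    Returns the number of complete lines and raises ValueError if any
    complete line does not match the expected record format.
    """
    with open(path, "rb") as f:
        data = f.read()
    # Text after the last newline may be a partially written record,
    # so only the lines before it are checked.
    complete = data.split(b"\n")[:-1]
    for line in complete:
        if not (line.startswith(b"record ") and len(line) == 15):
            raise ValueError("corrupt line: %r" % line)
    return len(complete)
```

On the server, append_records() would be called on the local path with a
small delay; on the client, read_records() would be called in a loop on
the same file via the NFS mount. A reader that stops returning (stuck in
'nfsrcv') or raises ValueError would be the interesting outcome.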

-Matt

:In all though, thanks a *lot* for your help with NFS. :) It seems much more
:stable now, i'm not afraid to compile things over nfs anymore. :)
:
:Kevin

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199905041842.LAA18532>
