Date:      Wed, 31 Mar 1999 00:58:26 -0500
From:      "David E. Cross" <crossd@cs.rpi.edu>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        "David E. Cross" <crossd@cs.rpi.edu>, freebsd-hackers@FreeBSD.ORG, schimken@cs.rpi.edu, crossd@cs.rpi.edu
Subject:   Re: More death to nfsiod 
Message-ID:  <199903310558.AAA29634@cs.rpi.edu>
In-Reply-To: Message from Matthew Dillon <dillon@apollo.backplane.com>  of "Sat, 27 Mar 1999 16:53:57 PST." <199903280053.QAA20231@apollo.backplane.com> 

:This bug was tickled by performing the following tasks on the NFS mounted
:directory (NFSv2/UDP from a server of the exact same build).
:First I did a "mkisofs112 -J -o netscape.iso .netscape".  .netscape contains
:about 70M of data, 12209 files, and 38 directories.  I tickled the bug before
:by using mkisofs112 to create an iso image of ie5.0 (around 60M, 50 files, no
:directories).  None of the mkisofs's have ever finished while nfsiod is
:running.  That caused the symptoms, but resolved after a short time.  Then I
:issued a
:...
:inc: [install-mh aborted]
:sh-2.02$ pwd
:/amd/stagger/home1/a/crossd
:sh-2.02$ ls -la /amd/stagger/home1/a/crossd
:ls: /amd/stagger/home1/a/crossd: Not a directory

    If any of your NFS mounts are running over AMD, please try running
    mkisofs using direct NFS mounts ( non-amd ) and see if that fixes
    your problem.

    I've so far been able to run mkisofs over NFS V3 mounts without any
    trouble, but I'll run my test script overnight and also try it with
    NFS V2 mounts.

    It would be helpful if you described what the bug was... all I know
    is that the mkisofs's don't finish.  Ok... is the rest of the system
    still operable?  Does it lock up?  Crash?  What does 'ps axl' say?
    etc.

Ok, this appears to have done the trick.  At least I have not been able to
reproduce it using a hand-mounted path.  This raises an interesting question
as to why it matters.  My understanding is that AMD sits on the local host as
a virtual NFS server and hands out mount(2) commands when it is needed.  Thus
when I enter the "/amd" hierarchy, amd is not involved at all.  In fact I can
"kill -9 amd" and still access /amd without any problems, whereas the virtual
root directories created by AMD then become a recipe for DiskWait.  I have tried
a hand mount with all the options that I believe AMD uses to mount the 
partitions (is there a way to query the OS for the options from a mounted
partition?).  Are the periodic umount() attempts that amd makes the cause
of these problems?
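
One rough way to compare the options on an amd-created mount with those on a
hand mount is to parse the output of mount(8), which on BSD systems prints
lines of the form "source on target (opt, opt, ...)".  The sketch below assumes
that output format; the helper names and the sample host and paths are made up
for illustration.  mount(8) only reports the flags the kernel exposes, so some
NFS-specific options may not show up at all.

```python
import re

# Matches one line of BSD-style `mount` output (assumed format), e.g.
#   server:/export on /mnt (nfs, nodev, nosuid)
MOUNT_LINE = re.compile(
    r"^(?P<source>\S+) on (?P<target>\S+) \((?P<opts>[^)]*)\)\s*$"
)


def parse_mount_line(line):
    """Return (source, target, set_of_options), or None if the line does not match."""
    m = MOUNT_LINE.match(line.strip())
    if m is None:
        return None
    opts = {o.strip() for o in m.group("opts").split(",") if o.strip()}
    return m.group("source"), m.group("target"), opts


def diff_options(amd_line, hand_line):
    """Return (options only on the amd mount, options only on the hand mount)."""
    amd = parse_mount_line(amd_line)
    hand = parse_mount_line(hand_line)
    if amd is None or hand is None:
        raise ValueError("unrecognized mount output line")
    return amd[2] - hand[2], hand[2] - amd[2]


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real system.
    amd_line = "stagger:/home1 on /amd/stagger/home1 (nfs, nodev, nosuid)"
    hand_line = "stagger:/home1 on /mnt/test (nfs, nosuid)"
    print(diff_options(amd_line, hand_line))
```

Feeding it the two relevant lines from "mount" on the client would show which
reported flags differ between the two mounts, which is a starting point for
narrowing down the option that matters.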

As an aside, we had problems with amd on SGI late last year; everything seemed
to point to amd, yet when we hand mounted the partition with the exact same
options we received the same error.  So I am a bit skeptical that AMD, by its
nature, is causing this; if I/we can find the options that tickle this I think
we will have nailed a real bug.

--
David Cross


