From owner-freebsd-current  Wed Aug  4 23:40:27 1999
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2])
	by hub.freebsd.org (Postfix) with ESMTP id C685514D69
	for <current@freebsd.org>; Wed,  4 Aug 1999 23:40:25 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id XAA33933;
	Wed, 4 Aug 1999 23:40:06 -0700 (PDT)
	(envelope-from dillon)
Date: Wed, 4 Aug 1999 23:40:06 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199908050640.XAA33933@apollo.backplane.com>
To: Stephen Hocking-Senior Programmer PGS Tensor Perth <shocking@prth.pgs.com>
Cc: current@freebsd.org
Subject: Re: Interesting NFS hangs under current 
References:  <199908050625.OAA20277@ariadne.tensor.pgs.com>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


:No sooner received than done....
:
:(kgdb) frame 18
:#18 0xc01ef2d6 in mmap (p=0xc5e49020, uap=0xc5f45f80) at ../../vm/vm_mmap.c:330
:330             error = vm_mmap(&p->p_vmspace->vm_map, &addr, size, prot, maxprot,
:(kgdb) print *p
:$2 = {p_procq = {tqe_next = 0xc0290ed0, tqe_prev = 0x0}, p_list = {
:...
:  p_comm = "rpc.rstatd\000\000\000\000\000\000", p_pgrp = 0xc0a42ae0, 
:...
:  p_sysent = 0xc025bbc0, p_rtprio = {type = 1, prio = 0}, p_prison = 0x0, 
:  p_addr = 0xc5f44000, p_md = {md_regs = 0xc5f45fa8}, p_xstat = 0, 
:---Type <return> to continue, or q <return> to quit---
:  p_acflag = 2, p_ru = 0x0, p_nthreads = 0, p_aioinfo = 0x0, p_wakeup = 0, 
:  p_peers = 0x0, p_leader = 0xc5e49020, p_asleep = {as_priority = 0, 
:    as_timo = 0}, p_emuldata = 0x0}
:...
:(kgdb) print *uap
:$3 = {addr = 0xc07a8180 "", addr_ = 0xc0a41744 "À\206zÀ\001", 
:  len = 3229255360, len_ = 0xc0a41748 "\001", prot = 65537, 
:  prot_ = 0xc0a4174c "\001", flags = 1, flags_ = 0xc0a41750 "\200>¢ÀÄ#&À\002", 
:  fd = -1063108992, fd_ = 0xc0a41754 "Ä#&À\002", pad = -1071242300, 
:  pad_ = 0xc0a41758 "\002", pos = 17592186044418, pos_ = 0xc0a41760 ""}
:(kgdb) 

    Uh oh, this is rpc.statd.

    I don't get it.  How can rpc.statd be in an mmap() call at this point?
    rpc.statd should have been running long before you even started the link.

    Are you running rpc.statd from inetd.conf by any chance?  If so, try
    removing it from inetd.conf and run it manually.

:# ps -axl -N /sys/compile/bleep/kernel.debug -M /var/crash/vmcore.2
:  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
:    0   259     1  32  10  0   508    0 wait   I    #C1-   0:00.00  (sh)
:   88   276   259   0   2  0 11132    0 -      R    #C1-   0:00.00  (mysqld)
:...

    The large number of processes in the "R" (run) state indicates that the
    machine is stuck in a supervisor-mode loop in the mmap() code.  That is,
    interrupts have woken these processes up, but they have been unable to
    actually get any CPU time.
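
    As a rough illustration of that check, the following Python sketch
    tallies the STAT column of `ps -axl` style output.  The names
    count_states and SAMPLE are made up for this example and are not part
    of ps or any FreeBSD tool; the rows are copied from the listing quoted
    in this message with the ":" quote prefix stripped, so they are only
    an excerpt of the full process table.

```python
from collections import Counter

# Rows copied from the quoted `ps -axl` listing (an excerpt, not the full table).
SAMPLE = """\
  UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
    0   259     1  32  10  0   508    0 wait   I    #C1-   0:00.00  (sh)
   88   276   259   0   2  0 11132    0 -      R    #C1-   0:00.00  (mysqld)
    0  4667  4659  14  -2  0   340    0 getblk I+   #C1    0:00.00  (install)
    0   182     1   0  -1  0   208    0 nfsrcv I     ??    0:00.00  (nfsiod)
    0   183     1   0   2  0   208    0 -      R     ??    0:00.00  (nfsiod)
    0   242     1   0   2  0  1452    0 -      Rs    ??    0:00.00  (ppp)
    0   351     1   1  -6  0   476    0 vinum  Ds    ??    0:00.00  (vinum)
    0  3486   292   0   2  0  1296    0 -      R     ??    0:00.00  (sshd1)
    0  4672   199 1502 105  0   868    0 -      Rs    ??    0:00.00  (rpc.rstatd
"""


def count_states(ps_text):
    """Count processes by the first letter of the STAT column.

    Hypothetical helper: assumes whitespace-separated columns with a
    header row that contains the word STAT, as in the sample above.
    """
    # Drop blank lines and any leading ":" mail-quote prefix.
    lines = [ln.lstrip(":") for ln in ps_text.splitlines() if ln.strip()]
    stat_col = lines[0].split().index("STAT")
    counts = Counter()
    for ln in lines[1:]:
        fields = ln.split()
        # Skip prompts, "..." elisions, and anything that is not a process row.
        if len(fields) <= stat_col or not fields[0].isdigit():
            continue
        # Only the first letter is the run state; the rest are flags (s, +).
        counts[fields[stat_col][0]] += 1
    return counts


if __name__ == "__main__":
    for state, n in count_states(SAMPLE).most_common():
        print(state, n)
```

    A tally dominated by "R" with almost no accumulated TIME is the
    pattern described above: processes that are runnable but never
    scheduled.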

    Nothing else looks wrong.  I see you are running vinum; I do not know
    whether that is a factor.

:    0  4667  4659  14  -2  0   340    0 getblk I+   #C1    0:00.00  (install)
:    0   182     1   0  -1  0   208    0 nfsrcv I     ??    0:00.00  (nfsiod)
:    0   183     1   0   2  0   208    0 -      R     ??    0:00.00  (nfsiod)
:    0   242     1   0   2  0  1452    0 -      Rs    ??    0:00.00  (ppp)
:    0   351     1   1  -6  0   476    0 vinum  Ds    ??    0:00.00  (vinum)
:    0  3486   292   0   2  0  1296    0 -      R     ??    0:00.00  (sshd1)
:    0  4672   199 1502 105  0   868    0 -      Rs    ??    0:00.00  (rpc.rstatd
:# 
:...

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

