From owner-freebsd-current Wed Aug 4 23:40:27 1999
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2])
	by hub.freebsd.org (Postfix) with ESMTP id C685514D69
	for ; Wed, 4 Aug 1999 23:40:25 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id XAA33933;
	Wed, 4 Aug 1999 23:40:06 -0700 (PDT)
	(envelope-from dillon)
Date: Wed, 4 Aug 1999 23:40:06 -0700 (PDT)
From: Matthew Dillon
Message-Id: <199908050640.XAA33933@apollo.backplane.com>
To: Stephen Hocking-Senior Programmer PGS Tensor Perth
Cc: current@freebsd.org
Subject: Re: Interesting NFS hangs under current
References: <199908050625.OAA20277@ariadne.tensor.pgs.com>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:No sooner received than done....
:
:(kgdb) frame 18
:#18 0xc01ef2d6 in mmap (p=0xc5e49020, uap=0xc5f45f80) at ../../vm/vm_mmap.c:330
:330 error = vm_mmap(&p->p_vmspace->vm_map, &addr, size, prot, maxprot,
:(kgdb) print *p
:$2 = {p_procq = {tqe_next = 0xc0290ed0, tqe_prev = 0x0}, p_list = {
:...
: p_comm = "rpc.rstatd\000\000\000\000\000\000", p_pgrp = 0xc0a42ae0,
:...
: p_sysent = 0xc025bbc0, p_rtprio = {type = 1, prio = 0}, p_prison = 0x0,
: p_addr = 0xc5f44000, p_md = {md_regs = 0xc5f45fa8}, p_xstat = 0,
:---Type to continue, or q to quit---
: p_acflag = 2, p_ru = 0x0, p_nthreads = 0, p_aioinfo = 0x0, p_wakeup = 0,
: p_peers = 0x0, p_leader = 0xc5e49020, p_asleep = {as_priority = 0,
: as_timo = 0}, p_emuldata = 0x0}
:...
:(kgdb) print *uap
:$3 = {addr = 0xc07a8180 "", addr_ = 0xc0a41744 "À\206zÀ\001",
: len = 3229255360, len_ = 0xc0a41748 "\001", prot = 65537,
: prot_ = 0xc0a4174c "\001", flags = 1, flags_ = 0xc0a41750 "\200>¢ÀÄ#&À\002",
: fd = -1063108992, fd_ = 0xc0a41754 "Ä#&À\002", pad = -1071242300,
: pad_ = 0xc0a41758 "\002", pos = 17592186044418, pos_ = 0xc0a41760 ""}
:(kgdb)

Uh oh, this is rpc.statd. I don't get it.
How can rpc.statd be in an mmap() call at this point? rpc.statd should
have been running long before you even started the link. Are you
running rpc.statd from inetd.conf by any chance? If so, try removing it
from inetd.conf and running it manually.

:# ps -axl -N /sys/compile/bleep/kernel.debug -M /var/crash/vmcore.2
:  UID   PID  PPID  CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
:    0   259     1   32  10  0   508    0 wait   I     #C1-  0:00.00 (sh)
:   88   276   259    0   2  0 11132    0 -      R     #C1-  0:00.00 (mysqld)
:...

The large number of processes in "R"un state indicates that the machine
is stuck in a supervisor loop in the mmap() code, i.e. interrupts have
woken up these processes but they have been unable to actually get any
CPU.

Nothing else seems to be an issue. I see you are running vinum. I do
not know if that is an issue.

:    0  4667  4659   14  -2  0   340    0 getblk I+    #C1   0:00.00 (install)
:    0   182     1    0  -1  0   208    0 nfsrcv I     ??    0:00.00 (nfsiod)
:    0   183     1    0   2  0   208    0 -      R     ??    0:00.00 (nfsiod)
:    0   242     1    0   2  0  1452    0 -      Rs    ??    0:00.00 (ppp)
:    0   351     1    1  -6  0   476    0 vinum  Ds    ??    0:00.00 (vinum)
:    0  3486   292    0   2  0  1296    0 -      R     ??    0:00.00 (sshd1)
:    0  4672   199 1502 105  0   868    0 -      Rs    ??    0:00.00 (rpc.rstatd
:#
:...

-Matt
Matthew Dillon

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
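
The observation about the number of processes in "R" state can be checked mechanically by tallying the STAT column of the ps listing. The following is only a rough sketch, not part of the original message: the function name tally_states is hypothetical, and the sample rows are just the ones visible in the quoted excerpt (rows elided with "..." are not included), so the counts describe the excerpt and not the whole crash dump.

```python
from collections import Counter

# Rows copied from the quoted ps excerpt. The header line is kept so the
# STAT column can be located by name instead of by a hard-coded index.
PS_OUTPUT = """\
  UID   PID  PPID  CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
    0   259     1   32  10  0   508    0 wait   I     #C1-  0:00.00 (sh)
   88   276   259    0   2  0 11132    0 -      R     #C1-  0:00.00 (mysqld)
    0  4667  4659   14  -2  0   340    0 getblk I+    #C1   0:00.00 (install)
    0   182     1    0  -1  0   208    0 nfsrcv I     ??    0:00.00 (nfsiod)
    0   183     1    0   2  0   208    0 -      R     ??    0:00.00 (nfsiod)
    0   242     1    0   2  0  1452    0 -      Rs    ??    0:00.00 (ppp)
    0   351     1    1  -6  0   476    0 vinum  Ds    ??    0:00.00 (vinum)
    0  3486   292    0   2  0  1296    0 -      R     ??    0:00.00 (sshd1)
    0  4672   199 1502 105  0   868    0 -      Rs    ??    0:00.00 (rpc.rstatd
"""


def tally_states(ps_text):
    """Count processes by the first letter of their STAT field.

    Hypothetical helper: the first letter is the main state (R, I, D, ...);
    trailing characters such as 's' or '+' are modifier flags and are ignored.
    """
    lines = [line for line in ps_text.splitlines() if line.strip()]
    stat_idx = lines[0].split().index("STAT")
    counts = Counter()
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= stat_idx:
            continue  # skip rows too short to contain a STAT field
        counts[fields[stat_idx][0]] += 1
    return counts


if __name__ == "__main__":
    for state, n in sorted(tally_states(PS_OUTPUT).items()):
        print(state, n)
```

Matching on the header name keeps the sketch independent of the exact column order, which varies with the ps flags used.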