From owner-freebsd-stable Thu Feb 15 15:36:41 2001
Delivered-To: freebsd-stable@freebsd.org
Received: from arg1.demon.co.uk (arg1.demon.co.uk [194.222.34.166])
    by hub.freebsd.org (Postfix) with ESMTP id 0080437B491
    for ; Thu, 15 Feb 2001 15:36:32 -0800 (PST)
Received: by arg1.demon.co.uk (Postfix, from userid 300)
    id 68A039B02; Thu, 15 Feb 2001 23:36:30 +0000 (GMT)
Received: from localhost (localhost [127.0.0.1])
    by arg1.demon.co.uk (Postfix) with ESMTP id 612BE5D16;
    Thu, 15 Feb 2001 23:36:30 +0000 (GMT)
Date: Thu, 15 Feb 2001 23:36:30 +0000 (GMT)
From: Andrew Gordon
X-Sender: arg@server.arg.sj.co.uk
To: "Morten A . Middelthon"
Cc: freebsd-stable@freebsd.org
Subject: Re: Portmap going berserk(!)
In-Reply-To: <20010215192135.A95579@freenix.no>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Thu, 15 Feb 2001, Morten A . Middelthon wrote:

> I got a FreeBSD 4.1.1-STABLE built Oct 8, which seems to have been running just
> fine for about 40 days. But now, all of a sudden, portmap forks off nnn
> processes, and the load on the box goes up to about 150 (not kidding). Running

I've seen over 400...

> portmap with -v doesn't give me anything, running it with -d starts spitting
> out thousands of 'server: about to do a switch' messages to my console. I
> tried rebooting the box, but it starts all over again.

You want to reboot the NIS client boxes which are bashing it, rather than
the machine itself.

> The box is running as a DHCP, NFS, Samba, NIS, Apache, named and printserver,
> so it's quite an important box in my network.
>
> Is there any known portmap-related problems? Right now I'm building with new
> updated sources, hoping desperately it will help.

I don't think it will help. This is a known problem (there's a few
messages in the archives).
I suspect the reason no one's fixed it is that it's hard to reproduce, and
in the kind of situation where you do reproduce it (NIS server in large
network with lots of traffic), there isn't much opportunity to investigate
because the 'phones are ringing like crazy with all the users that can't
log in...

The work-around is to list all your NIS servers explicitly (described as
"many-cast" in the ypbind manpage) rather than broadcasting.

The few times it's happened to me, I observed:

1) Each time it happened, it was preceded by some kind of network packet
   loss incident (one time it was a bad ethernet switch, another time a
   dud cable). This merely seems to trigger the problem: once the problem
   has started, you can mend the network but the problem won't go away
   until you reboot all the servers.

2) When this is going on, the symptoms are that one or more NIS clients
   are churning out vast quantities of broadcasts. The portmapper on the
   NIS server dutifully tries to reply to each one, but the client isn't
   listening to the replies. A tcpdump trace goes something like:

       client.xxx->broadcast.portmap    UDP
       server.portmap->client.xxx       UDP
       client->server                   ICMP port unreachable
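
For reference, a minimal sketch of the many-cast work-around mentioned
above, written as an /etc/rc.conf fragment for each NIS client. It assumes
the usual FreeBSD rc.conf variables (nisdomainname, nis_client_enable,
nis_client_flags) and the -S and -m flags documented in ypbind(8). The
domain name and server names are placeholders, not taken from this thread:

```shell
# /etc/rc.conf fragment (sketch only; domain and host names are hypothetical)
nisdomainname="example-domain"     # placeholder NIS domain name
nis_client_enable="YES"            # start ypbind at boot

# -S domain,server1,server2,... restricts ypbind to the listed servers.
# -m asks ypbind to many-cast to that list instead of broadcasting;
#    per ypbind(8) it only has an effect when used together with -S.
nis_client_flags="-S example-domain,nis1.example.org,nis2.example.org -m"
```

Check ypbind(8) on your own release before relying on this, and note that
ypbind has to be restarted on each client (or the client rebooted) before
the new flags take effect.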