Date: Mon, 11 Feb 2002 19:47:32 +0100 (CET)
From: Per Hedeland <per@bluetail.com>
To: FreeBSD-gnats-submit@freebsd.org
Subject: misc/34842: VmWare port + NIS causes "broadcast storm"
Message-ID: <200202111847.g1BIlWf14160@tordmule.bluetail.com>
>Number: 34842
>Category: misc
>Synopsis: VmWare port + NIS causes "broadcast storm"
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Mon Feb 11 10:50:01 PST 2002
>Closed-Date:
>Last-Modified:
>Originator: Per Hedeland
>Release: FreeBSD 4.5-RELEASE i386
>Organization:
none
>Environment:
System: FreeBSD tordmule.bluetail.com 4.5-RELEASE FreeBSD 4.5-RELEASE #0: Mon Feb 4 19:33:09 CET 2002 root@:/usr/src/sys/compile/TORDMULE i386
Vmware2 port: vmware2-2.0.4.1142
>Description:
If the vmware2 port is run in bridging mode on a system that is
a NIS client (with no specific server(s) given to ypbind), and
the NIS server is unavailable for some length of time while no
vmware host is running, the system will start sending UDP
broadcasts to port 111 at an extremely high rate. Observed with
an 800 MHz Pentium-III on 100 Mbit/s Ethernet: 200 broadcast
packets/sec sent - which in turn cause 200 response packets/sec,
which in turn cause 200 ICMP port-unreachable packets/sec from
the FreeBSD system, since nothing there is listening for the
responses - 600 packets/sec in total.
>How-To-Repeat:
See description. Killing the ypserv process on the (only) NIS
server will cause the problem to appear after some time (not
measured); it can be more quickly reproduced by killing and
restarting ypbind on the FreeBSD system while the NIS server is
down.
>Fix:
The above is actually caused by the interaction of a series of
problems:
1) When bridging is chosen at installation of the vmware2 port,
the vmnet1 interface is still configured with a "dummy" IP
address of its own (192.168.0.1), netmask (255.255.255.0),
and corresponding broadcast address (192.168.0.255). As far
as I understand, a set of bridged interfaces should have at
most one IP address total among them.
2) If packets are sent to *any* address in the "vmnet1 net"
besides the configured one (192.168.0.1) when no vmware host
is running, sendto() (or whatever) will soon return ENOBUFS,
since the "send queue" has filled up. (Needless to say,
nothing will ever really receive such packets - but they seem
to "disappear" if a vmware host is running.)
3) The RPC broadcast function (/usr/src/lib/libc/rpc/pmap_rmt.c/
clnt_broadcast()) gives up sending immediately (returning
RPC_CANTSEND), without even waiting for responses, if sending
to any one of the broadcast-capable interfaces fails for
whatever reason.
4) Ypbind, when getting any error back from clnt_broadcast(),
retries immediately, without any delay or backoff strategy.
So, in this scenario, ypbind calls clnt_broadcast(), which sends
a packet out the physical interface, then a packet on the vmnet1
interface, gets ENOBUFS and gives up, and ypbind starts the
process over again, ad infinitum.
The "storm" can be prevented by fixing any one of the problems
1)-4); a real fix (allowing ypbind to succeed) requires a fix to
one of 1)-3). Ideally all should be fixed, of course.
I worked around problem 1) in FreeBSD 4.2-RELEASE and another
version of the vmware2 port by modifying the vmware config and
startup scripts to simply not configure an IP address on vmnet1
when bridging is used - however this does not work in current
versions (vmware complains about not being able to get the
interface address), at least not the trivial way I did it.
Instead I now looked at problem 2), and came up with the first
patch below. It seems reasonable to me, solves problem 2), and
doesn't seem to affect the "normal" traffic to the vmware host
in any way - but I guess I could be missing something... I've
also enclosed what seems to me to be a reasonable fix for
problem 3); however, it is totally untested. I haven't looked
for a fix for problem 4), but someone probably should...
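For what it's worth, a fix for problem 4) could be as simple as a
capped exponential backoff between clnt_broadcast() attempts. The
sketch below is illustrative only - the function name and the 60
second cap are made up here, not taken from the ypbind sources:

```c
#include <assert.h>

/* Hypothetical helper: delay (in seconds) to sleep before the next
 * clnt_broadcast() attempt, given the number of consecutive failures
 * so far.  Doubles per failure, capped at 60 seconds. */
static unsigned int
backoff_delay(unsigned int failures)
{
	unsigned int delay = 1;

	while (failures-- > 0 && delay < 60)
		delay *= 2;
	return (delay < 60 ? delay : 60);
}
```

ypbind would then sleep(backoff_delay(nfail)) between retries instead
of hammering the network in a tight loop; the first retry is still
fast, but a dead server no longer causes a flood.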
Patch for problem 2):
--- /usr/src/sys/net/if_tap.c.ORIG	Thu Jul 27 15:57:05 2000
+++ /usr/src/sys/net/if_tap.c	Mon Feb 11 15:20:30 2002
@@ -467,10 +467,10 @@
 	/*
 	 * do not junk pending output if we are in VMnet mode.
 	 * XXX: can this do any harm because of queue overflow?
+	 *      Ummm, yes...
 	 */
-	if (((tp->tap_flags & TAP_VMNET) == 0) &&
-	    ((tp->tap_flags & TAP_READY) != TAP_READY)) {
+	if ((tp->tap_flags & TAP_READY) != TAP_READY) {
 		struct mbuf	*m = NULL;
 
 		TAPDEBUG("%s%d not ready. minor = %#x, tap_flags = 0x%x\n",
Patch for problem 3):
--- /usr/src/lib/libc/rpc/pmap_rmt.c.ORIG	Fri Mar 3 14:04:58 2000
+++ /usr/src/lib/libc/rpc/pmap_rmt.c	Mon Feb 11 12:54:33 2002
@@ -330,15 +330,19 @@
 	 * minute or so
 	 */
 	for (t.tv_sec = 4; t.tv_sec <= 14; t.tv_sec += 2) {
+		int success = 0;
 		for (i = 0; i < nets; i++) {
 			baddr.sin_addr = addrs[i];
 			if (sendto(sock, outbuf, outlen, 0,
 				(struct sockaddr *)&baddr,
-				sizeof (struct sockaddr)) != outlen) {
-				perror("Cannot send broadcast packet");
-				stat = RPC_CANTSEND;
-				goto done_broad;
+				sizeof (struct sockaddr)) == outlen) {
+				success++;
 			}
+		}
+		if (!success) {
+			perror("Cannot send broadcast packet");
+			stat = RPC_CANTSEND;
+			goto done_broad;
 		}
 		if (eachresult == NULL) {
 			stat = RPC_SUCCESS;
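The intent of that patch, modelled in isolation (this is not from the
PR's code - send_one() is a made-up stand-in for the sendto() call on
one broadcast address): a pass over the interfaces now fails hard only
when *every* send failed, not on the first error.

```c
#include <stddef.h>

/* Simplified model of the patched send loop in clnt_broadcast().
 * send_one() returns nonzero if the broadcast to interface i was
 * accepted.  The return value is the number of successful sends;
 * 0 is the only case where the caller would return RPC_CANTSEND. */
static int
broadcast_pass(int (*send_one)(size_t), size_t nets)
{
	size_t i;
	int success = 0;

	for (i = 0; i < nets; i++)
		if (send_one(i))
			success++;
	return success;
}

/* stub: interface 1 (a vmnet1-like one) always fails, others succeed */
static int
send_stub(size_t i)
{
	return i != 1;
}

/* stub: every interface fails, e.g. no network at all */
static int
send_none(size_t i)
{
	(void)i;
	return 0;
}
```

With the old logic, send_stub's single failure would have aborted the
whole broadcast; here the pass still counts as successful because the
physical interface got its packet out.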
>Release-Note:
>Audit-Trail:
>Unformatted:
