From owner-freebsd-hackers Sun Jun 2 13:30:56 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id NAA08796 for hackers-outgoing; Sun, 2 Jun 1996 13:30:56 -0700 (PDT)
Received: from central.picker.com (central.picker.com [144.54.31.2]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id NAA08777 for ; Sun, 2 Jun 1996 13:30:53 -0700 (PDT)
Received: from ct.picker.com by central.picker.com with smtp (Smail3.1.28.1 #3) id m0uQJj5-0004rmC; Sun, 2 Jun 96 16:26 EDT
Received: from elmer.picker.com ([144.54.52.5]) by ct.picker.com (4.1/SMI-4.1) id AA28598; Sun, 2 Jun 96 16:26:02 EDT
Received: by elmer.picker.com (SMI-8.6/SMI-SVR4) id QAA26731; Sun, 2 Jun 1996 16:26:52 -0400
From: rhh@ct.picker.com (Randall Hopper)
Message-Id: <199606022026.QAA26731@elmer.picker.com>
Subject: automounter hangs on boot (possible bug found)
To: freebsd-hackers@freebsd.org
Date: Sun, 2 Jun 1996 16:26:52 -0400 (EDT)
Reply-To: rhh@ct.picker.com
Organization: Picker International, CT Division
X-Mailer: ELM [version 2.4 PL24 PGP3 *ALPHA*]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

PROBLEM:

I believe I've found a possible bug in the way netmasks are computed in amd, but I'd appreciate it if someone could confirm this (I'm not a networking expert).

The bug in question causes a few spurious DNS lookups. On my dial-up subnet these hang the machine for a while during boot, until the DNS requests issued by amd time out. I went looking for the cause; it and the workarounds I've found are detailed below.
The specific network setup I'm working with is:

    Subnet Mask   : 255.255.255.240             (4-bit hostids)
    Router
      Host elmer  : 144.54.61.1, 144.54.61.17   (3 interfaces)
        Host stealth : 144.54.61.10             (interface 1)
        Host voyager : 144.54.61.18             (interface 2)
        Dial-up host                            (interface 3)

I have the following entries in /etc/networks on all machines:

    net1    144.54.61.0
    net2    144.54.61.16

elmer is a router to several subnets. stealth is on "net1" and voyager is on "net2".

When I start amd on voyager, it does a getnetbyaddr (usr.sbin/amd/amd/wire.c:getwire()) on the network 144.54.61.16, as one would expect. It finds this in /etc/networks, so it doesn't need to query the DNS server for this information.

However, when I start amd on stealth, it does a getnetbyaddr on the network "0.144.54.61", which it does not find in the file, so it falls back and does a gethostbyaddr on 144.54.61.0. This results in two PTR? queries, which also fail or time out (depending on whether the dial-up link is up or not).

CAUSE:

The underlying problem seems to be that wire.c:getwire() doesn't determine "mask" correctly when the number of bits in the hostid isn't 8. For Class B addresses, it starts with the 0xFFFF0000 netmask and widens it 8 bits at a time (why?). It computes this mask from the subnet address (?), and then applies it TO the subnet address. In net1's case above it ends up with a 0xFFFFFF00 mask, and in net2's case it ends up with a 0xFFFFFFFF mask. That is why net2 is looked up as the full 144.54.61.16, while net1 is looked up as the 24-bit network number 0.144.54.61.

I don't know whether this is a bug, or correct (albeit strange) behavior documented in an RFC somewhere. To compute mask, why not start with the raw subnet "mask" (as opposed to the subnet address), and shift it right 8 bits so long as the low 8 bits are 0?

WORKAROUNDS:

I'd be interested in the right thing to do if anyone can tell me. One workaround for now is to just put a (seemingly) bogus net1 entry in /etc/networks:

    net1    144.54.61
    net2    144.54.61.16

Another is to just "ifconfig down" the route to the DNS server on subnet machines while they're bringing up amd :).
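To make the CAUSE discussion concrete, here is a minimal Python sketch of the two ways of deriving the getnetbyaddr lookup key. It is not copied from wire.c. The function names are hypothetical, and the final right shift in `key_as_described` is an assumption, chosen because it reproduces the "0.144.54.61" lookup reported above. Only the 8-bit widening step mentioned in the text is modelled.

```python
import ipaddress

ALL_ONES = 0xFFFFFFFF


def to_int(dotted):
    """Dotted-quad string to a 32-bit integer in host order."""
    return int(ipaddress.IPv4Address(dotted))


def to_dotted(value):
    """32-bit integer to a dotted-quad string."""
    return str(ipaddress.IPv4Address(value))


def classful_mask(addr):
    """Default netmask implied by the address class (A, B, else C)."""
    if (addr & 0x80000000) == 0:
        return 0xFF000000
    if (addr & 0xC0000000) == 0x80000000:
        return 0xFFFF0000
    return 0xFFFFFF00


def key_as_described(address, netmask):
    """Guess the mask from the subnet ADDRESS, as the text describes."""
    subnet = to_int(address) & to_int(netmask)
    mask = classful_mask(subnet)
    # Widen the guessed mask 8 bits at a time until it covers every
    # set bit of the subnet address.
    while subnet & ~mask & ALL_ONES:
        mask = (mask >> 8) | 0xFF000000
    net = subnet & mask
    # Assumption: trailing zero bits of the mask are shifted out of
    # both values before the lookup.
    while (mask & 1) == 0:
        mask >>= 1
        net >>= 1
    return to_dotted(net)


def key_from_netmask(address, netmask):
    """Proposed alternative: start from the interface's real netmask."""
    mask = to_int(netmask)
    net = to_int(address) & mask
    # Drop whole zero bytes only, so a 4-bit hostid is left untouched.
    while mask and (mask & 0xFF) == 0:
        mask >>= 8
        net >>= 8
    return to_dotted(net)


for host, addr in (("stealth", "144.54.61.10"), ("voyager", "144.54.61.18")):
    print(host,
          key_as_described(addr, "255.255.255.240"),
          key_from_netmask(addr, "255.255.255.240"))
```

Under these assumptions stealth yields 0.144.54.61 with the first function and 144.54.61.0 with the second, while voyager yields 144.54.61.16 with both. With the proposed computation, both keys match the original /etc/networks entries.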
Any advice, pointers, or corrections regarding this would be appreciated.

Randall Hopper
rhh@ct.picker.com