From owner-freebsd-questions Sun Sep 22 08:38:43 1996
Return-Path: owner-questions
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id IAA14122 for questions-outgoing; Sun, 22 Sep 1996 08:38:43 -0700 (PDT)
Received: from lariat.lariat.org ([129.72.251.2]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id IAA14060; Sun, 22 Sep 1996 08:38:35 -0700 (PDT)
Received: (from brett@localhost) by lariat.lariat.org (8.8.Alpha.4/8.8.Alpha.4) id JAA09370; Sun, 22 Sep 1996 09:38:25 -0600 (MDT)
Date: Sun, 22 Sep 1996 09:38:25 -0600 (MDT)
From: Brett Glass
Message-Id: <199609221538.JAA09370@lariat.lariat.org>
To: dwhite@resnet.uoregon.edu, gpalmer@freebsd.org
Subject: Re: systat/netstat utilities buggy?
Cc: brett@lariat.org, questions@freebsd.org
Sender: owner-questions@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> If the kernel's tables change while netstat (for example) is
> traversing it (because it gets stuck on a nameserver lookup
> somewhere), then it could get to a part of the table which doesn't
> exist anymore, and hence fall over with that error. It's quite common
> to see it on WWW servers, for example, where the network table changes
> very rapidly, and you typically have lots of connections from remote
> sites and hence nameserver lookups take a long time.

The machine is a SLiRP-based PPP dialup, WWW server, and a mail server.
I am sure that the tables change CONSTANTLY, since every socket opened
by any of the dialup users has a corresponding socket on the host. Many
nameserver lookups fail, as many of the machines that hit our Web
server do not have names.

I would be VERY surprised, though, if a programmer were careless enough
to traverse a linked list whose links could potentially change without
locking resources -- or deferring insertions and deletions until after
a traversal.
Doing nameserver lookups in the MIDDLE of the traversal is EXTREMELY
poor practice, since it maximizes the danger of inconsistencies even if
the list is traversed safely. Do you have the code to FreeBSD's
netstat? (I don't have room for the source tree here.) Is it really
coded this badly?

> As for `odd system behaviour', more details would be needed.

We're talking core dumps, occasional crashes (when systat -netstat is
running), and hung processes. As if kernel memory were being corrupted.

--Brett
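
P.S. To make the point about traversal concrete, here is a minimal
sketch of the pattern I would expect: hold a lock only long enough to
copy the list, then do the slow work (such as nameserver lookups) on
the private copy. This is NOT FreeBSD's netstat code, which I have not
seen. The names struct conn, table_add, table_remove and table_snapshot
are hypothetical, and a userland pthread mutex stands in here for
whatever locking the real tables would need.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical connection-table entry; not a real FreeBSD structure. */
struct conn {
    char addr[16];            /* dotted-quad remote address */
    struct conn *next;
};

static struct conn *table_head = NULL;
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insert a new entry at the head of the list. Returns 0, or -1 on failure. */
int table_add(const char *addr)
{
    struct conn *c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    strncpy(c->addr, addr, sizeof c->addr - 1);
    c->addr[sizeof c->addr - 1] = '\0';

    pthread_mutex_lock(&table_lock);
    c->next = table_head;
    table_head = c;
    pthread_mutex_unlock(&table_lock);
    return 0;
}

/* Unlink and free the first entry matching addr. Returns 1 if one was found. */
int table_remove(const char *addr)
{
    pthread_mutex_lock(&table_lock);
    struct conn **pp = &table_head;
    while (*pp != NULL && strcmp((*pp)->addr, addr) != 0)
        pp = &(*pp)->next;
    struct conn *victim = *pp;
    if (victim != NULL)
        *pp = victim->next;
    pthread_mutex_unlock(&table_lock);

    int found = (victim != NULL);
    free(victim);
    return found;
}

/*
 * Copy every entry into a caller-owned array while holding the lock.
 * The copies have next == NULL, so nothing in the result points back
 * into the live list. Returns NULL (and *count == 0) if malloc fails.
 */
struct conn *table_snapshot(size_t *count)
{
    assert(count != NULL);

    pthread_mutex_lock(&table_lock);
    size_t n = 0;
    for (const struct conn *c = table_head; c != NULL; c = c->next)
        n++;

    struct conn *copy = malloc((n ? n : 1) * sizeof *copy);
    if (copy != NULL) {
        size_t i = 0;
        for (const struct conn *c = table_head; c != NULL; c = c->next) {
            copy[i] = *c;
            copy[i].next = NULL;
            i++;
        }
    }
    pthread_mutex_unlock(&table_lock);

    *count = (copy != NULL) ? n : 0;
    return copy;
}
```

A caller would take the snapshot, walk the array at its leisure (name
lookups included), and free() it when done. Entries added or removed in
the meantime make the report slightly stale, but they cannot send the
reader off into memory that no longer exists.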