Date:      Fri, 19 Dec 1997 20:02:16 -0500
From:      Gary Palmer <gjp@erols.com>
To:        current@freebsd.org
Subject:   crash (in networking code?)
Message-ID:  <349B1918.794BDF32@erols.com>

Hi,

We have a weird proxying system here running 100% custom code.
As a test we put a new version on a FreeBSD snap release just
to see how well it lasts compared to the Sun and Linux boxes we
had been testing (and using previously). I used the latest
``stable'' snap:

FreeBSD pproxy6.erols.com 3.0-971208-SNAP FreeBSD 3.0-971208-SNAP #0:
Fri Dec 19 01:16:45 EST 1997    
root@install.noc.erols.net:/usr/src/sys/compile/PPROXY  i386

The problem is when we come to kill and restart the proxy daemon
(it, for obvious reasons, is its own daemon/listener). The first
time I went to console and sent it a SIGINT, most of the processes
exited cleanly, except one, which was stuck in `D' wait (I forgot
to look at the wait channel, sorry). Interestingly, the one in D
wait was *not* the parent listener, but a child, presumably one
that was handling a client connection at the time. kill -9 didn't
work, and when I typed `reboot' the machine didn't shut down
cleanly but instead panicked. I was working on another console
and the panic message flashed past, sorry.

Then later, a co-worker restarted it again. This time the machine
did panic, not immediately, but when he restarted the daemon. Again,
no panic message (for some reason the kernel message buffer didn't
get preserved and dmesg didn't pick it up after it rebooted).

This is pretty simple proxy code (although I'm not responsible
for it :-) ), and I'm *very* surprised that it's having such
a detrimental effect on the system. It is running as root, although
it could just as easily bind to the privileged port and then run as
an ordinary user. It
does fork off children to handle the incoming client connections,
and open a TCP connection to a backend server depending on the
client.
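
For anyone trying to reproduce this, the general shape described above
(a parent listener that forks one child per client) can be sketched
roughly as below. This is not the actual proxy code, which I can't
post. The names make_listener, accept_and_fork and handle_client are
made up for illustration, the listener binds to loopback only, and the
backend connection is left as a stub.

```c
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* SIGCHLD handler: reap exited children so they do not become zombies.
 * Install with signal(SIGCHLD, reap_children) in a long-running parent. */
void reap_children(int signo)
{
    int saved = errno;

    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
    errno = saved;
}

/* Create a TCP listener on the loopback address (assumption: a real
 * proxy would bind to a public address). Port 0 lets the kernel pick a
 * free port. Returns the listening descriptor, or -1 on error. */
int make_listener(unsigned short port)
{
    struct sockaddr_in sa;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical per-client work. The real daemon would pick a backend
 * based on the client, connect() to it and relay data both ways; this
 * stub only closes the client connection. */
void handle_client(int client_fd)
{
    close(client_fd);
}

/* Accept one client and hand it to a forked child. Returns the child's
 * pid in the parent, or -1 if accept() or fork() failed. */
pid_t accept_and_fork(int listen_fd)
{
    pid_t pid;
    int client_fd = accept(listen_fd, NULL, NULL);

    if (client_fd < 0)
        return -1;
    pid = fork();
    if (pid == 0) {
        /* Child: it does not need the listening socket. */
        close(listen_fd);
        handle_client(client_fd);
        _exit(0);
    }
    /* Parent (or fork failure): drop our copy of the client socket. */
    close(client_fd);
    return pid;
}
```

The parent would call accept_and_fork() in a loop. The process that got
stuck in D wait corresponds to one of the children here, not to the
parent sitting in accept().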

Does anyone have any idea what is going on? I know it's pretty vague
still, and I hope to debug it further on a sacrificial machine
tonight, but I was hoping that someone would know if this was
already fixed, or if this needs me to start looking deep into
kernel internals.

Thanks,

Gary


