Date: Mon, 22 Oct 2001 19:41:13 +1000
From: Stanley Hopcroft <Stanley.Hopcroft@IPAustralia.Gov.AU>
To: Questions@FreeBSD.ORG
Subject: 4.3-RELEASE kaput after running Perl applications that forks (and pings).
Message-ID: <20011022194111.B375@IPAustralia.Gov.AU>
Dear Ladies and Gentlemen,
I am writing to invite your comment on something I feel is more likely to
occur on an MS operating system.
An unloaded FreeBSD 4.3-RELEASE reboots part way through a Perl script
that forks a few hundred processes that exec 'ping'.
There is nothing in dmesg (apart from boot messages) or in
/var/log/messages.
Should I enable a debugging kernel?
Why wasn't the user (unprivileged) process simply terminated?
When the application runs, the system
. shows a load average of no more than 30
. total CPU utilisation bottoming at 0% with <= 40% system utilisation
. is running no more than 500 user processes.
. has no exceptional swap activity (can't remember what it was)
If there was a panic, I don't know what the message was; the only
indication I had of the machine failing was the ssh connection closing.
Here is the application; it failed when it tried to have 255 processes
running at one time (replace 100 by 255 in the Parallel::ForkManager
constructor; this is a CPAN class that, while new, seems OK).
#!/usr/bin/perl -w

use strict ;
use Parallel::ForkManager ;

use constant TIMEOUT => 5 ;
use constant COUNT => 5 ;
use constant PING_CMD => 'ping -q -n -c ' . COUNT . ' -t '. TIMEOUT ;
use constant DEBUG => 0 ;

close STDOUT unless DEBUG ;
close STDERR unless DEBUG ;

my ($pm, $start, $host, $address) ;

$pm = new Parallel::ForkManager(100) ;

foreach $start (3..7, 10, 96, 98, 100) {
    foreach $host (0..255) {
        $address = "10.0.$start.$host" ;
        print STDERR "${ \PING_CMD } $address\n" ;
        $pm->start and next ;
        exec "${ \PING_CMD } $address" ;
    }
}

$pm->wait_all_children ;
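
To give a sense of scale, the loops in the script above can be exercised without forking. What follows is only a minimal sketch, assuming the same constants and address ranges; the @subnets and @cmds names are introduced here for illustration and do not appear in the script. It builds the command lines and counts them rather than executing anything.

```perl
#!/usr/bin/perl -w
# Dry run: build the same command lines as the script above, but neither
# fork nor exec anything; only count what would have been run.
use strict ;

use constant TIMEOUT  => 5 ;
use constant COUNT    => 5 ;
use constant PING_CMD => 'ping -q -n -c ' . COUNT . ' -t ' . TIMEOUT ;

# Third octet of each 10.0.x.0 range, as in the script above.
# (@subnets and @cmds are illustrative names, not part of that script.)
my @subnets = (3..7, 10, 96, 98, 100) ;
my @cmds ;

foreach my $start (@subnets) {
    foreach my $host (0..255) {
        push @cmds, PING_CMD . " 10.0.$start.$host" ;
    }
}

print "subnets:  ", scalar(@subnets), "\n" ;
print "commands: ", scalar(@cmds), "\n" ;
print "first:    $cmds[0]\n" ;
print "last:     $cmds[-1]\n" ;
```

With 9 ranges of 256 addresses each, this reports 2304 commands in total. The argument to the Parallel::ForkManager constructor (100, or 255 in the failing run) caps how many of them are running at any one time.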
The reboot is repeatable.
There is nothing extraordinary running when the application that causes
the reboot runs.
Thank you,
Yours sincerely.
--
------------------------------------------------------------------------
Stanley Hopcroft IP Australia
Network Specialist
+61 2 6283 3189 +61 2 6281 1353 (FAX) Stanley.Hopcroft@IPAustralia.Gov.AU
------------------------------------------------------------------------
cursor address, n:
"Hello, cursor!"
-- Stan Kelly-Bootle, "The Devil's DP Dictionary"
Notes
1 The culprit
wins> uname -a
FreeBSD wins.aipo.gov.au 4.3-RELEASE FreeBSD 4.3-RELEASE #2: Wed Jul 4 19:09:37 EST 2001 root@wins.aipo.gov.au:/usr/src/sys/compile/WINS i386
wins>
2 Its usual slothful activity
last pid: 94364; load averages: 0.01, 0.03, 0.00 up 3+08:04:31 19:38:32
29 processes: 1 running, 28 sleeping
CPU states: % user, % nice, % system, % interrupt, % idle
Mem: 12M Active, 38M Inact, 23M Wired, 12K Cache, 35M Buf, 176M Free
Swap: 256M Total, 256M Free
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
132 bind 2 0 4248K 3904K select 2:52 0.00% 0.00% named
248 root 2 0 2760K 2224K select 0:56 0.00% 0.00% nmbd
5009 root 2 -12 1256K 920K select 0:10 0.00% 0.00% ntpd
259 root 2 0 1500K 1112K select 0:06 0.00% 0.00% httpd
169 root 10 0 980K 740K nanslp 0:03 0.00% 0.00% cron
128 root 2 0 928K 636K select 0:03 0.00% 0.00% syslogd
175 root 2 0 2148K 1516K select 0:02 0.00% 0.00% sshd
78845 root 2 0 3336K 3056K select 0:00 0.00% 0.00% dhcpd
250 root -6 0 1916K 1276K piperd 0:00 0.00% 0.00% nmbd
151 root 2 0 1120K 812K select 0:00 0.00% 0.00% amd
94220 root 2 0 2232K 1816K select 0:00 0.00% 0.00% sshd
94221 anwsmh 18 0 1272K 928K pause 0:00 0.00% 0.00% csh
94364 anwsmh 28 0 1884K 1096K RUN 0:00 0.00% 0.00% top
265 nobody 2 0 1548K 1196K accept 0:00 0.00% 0.00% httpd
266 nobody 2 0 1548K 1196K accept 0:00 0.00% 0.00% httpd
136 daemon 2 0 940K 636K select 0:00 0.00% 0.00% portmap
272 root 3 0 936K 652K ttyin 0:00 0.00% 0.00% getty
271 root 3 0 936K 652K ttyin 0:00 0.00% 0.00% getty
273 root 3 0 936K 652K ttyin 0:00 0.00% 0.00% getty
172 root 2 0 928K 636K select 0:00 0.00% 0.00% lpd
263 nobody 2 0 1500K 1124K accept 0:00 0.00% 0.00% httpd
264 nobody 2 0 1500K 1124K accept 0:00 0.00% 0.00% httpd
5612 nobody 2 0 1508K 1144K accept 0:00 0.00% 0.00% httpd
267 nobody 2 0 1500K 1124K accept 0:00 0.00% 0.00% httpd
28 root 18 0 208K 92K pause 0:00 0.00% 0.00% adjkerntz
143 root 10 0 208K 80K nfsidl 0:00 0.00% 0.00% nfsiod
145 root 10 0 208K 80K nfsidl 0:00 0.00% 0.00% nfsiod
146 root 10 0 208K 80K nfsidl 0:00 0.00% 0.00% nfsiod
144 root 10 0 208K 80K nfsidl 0:00 0.00% 0.00% nfsiod
