Date: Thu, 14 Oct 2004 11:46:47 +0200
From: Herve Boulouis <amon@sockar.homeip.net>
To: freebsd-sparc64@freebsd.org
Subject: Strange timing problems with BETA7
Message-ID: <20041014114647.A69222@ra.aabs>
Hi,

I'm having very strange stability problems with BETA7 which seem related to timing/clock.

Hardware is a Netra t 1125 with 2 CPUs.

Symptoms:

After a fresh reboot, when I do a standard ping to any IP address, the interval between the pings is not constant and is generally lower than the 1 second it should be by default. I sometimes also get negative latencies with ping or traceroute:

# ping 62.4.16.70
PING 62.4.16.70 (62.4.16.70): 56 data bytes
64 bytes from 62.4.16.70: icmp_seq=0 ttl=60 time=-432.827 ms
64 bytes from 62.4.16.70: icmp_seq=1 ttl=60 time=1.955 ms

# traceroute 62.4.16.70
traceroute to 62.4.16.70 (62.4.16.70), 64 hops max, 52 byte packets
 1  gi0-12-swr102-mix-courbevoie (213.215.63.1)  436.046 ms  0.733 ms  0.611 ms
 2  gi0-2-3-edou.nerim.net (194.79.130.114)  0.619 ms  -434.763 ms  435.882 ms
 3  gi0-3-32-svenny.nerim.net (194.79.130.1)  1.737 ms  1.435 ms  1.715 ms

After a few hours of activity (this box is an FTP server), the kernel gives this kind of message:

calcru: negative runtime of -893918 usec for pid 1344 (pure-ftpd)
calcru: negative runtime of -761379 usec for pid 1339 (pure-ftpd)
calcru: negative runtime of -1687109 usec for pid 1337 (pure-ftpd)
calcru: negative runtime of -295856 usec for pid 7 (pagedaemon)
calcru: runtime went backwards from 162673274 usec to 159978646 usec for pid 29 (intr2017: hme0)
calcru: runtime went backwards from 33673531 usec to 30674086 usec for pid 4 (g_down)
calcru: runtime went backwards from 102734677682 usec to 102731983847 usec for pid 12 (idle: cpu0)
calcru: runtime went backwards from 102678868452 usec to 102678764016 usec for pid 11 (idle: cpu1)

At this point, doing a netstat -Iw 1 gives nothing but the field headers. In the same fashion, pinging any IP address gives a single reply and the ping command is then stuck.
(Both processes are in select() state when they are stuck and are interruptible with ^C.)

When doing a reboot after a few hours of uptime, the reboot process seems to get stuck after killing all the running processes. I never see the kernel shutdown messages and have to power cycle the box.

Some apps seem to have problems with timing too. wget randomly gives:

Assertion failed: (msecs >= 0), function calc_rate, file retr.c, line 262.
Abort trap (core dumped)

This started when I upgraded from 5.2.1 to BETA3, and the problem is still present in BETA7 (last cvsup from Oct 5). I reset the date according to the heads up about the mk48txx commit. I tried mpsafenet=0 with the same result.

My kernel config is pretty much like GENERIC except that I'm using SCHED_4BSD, maxusers 512 and ZERO_COPY_SOCKETS (no WITNESS, no INVARIANTS).

Any ideas on this? Can this be a hardware problem?

-- 
Herve Boulouis
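
The negative ping times and the failed (msecs >= 0) assertion in wget are both consistent with a wall-clock timestamp that is sometimes smaller than the one read just before it. A minimal sketch of a userland check for that follows, assuming Python 3 is available on the box. The function name count_backward_steps and the sample count are hypothetical choices for illustration, not part of any tool mentioned above. It reads the clock in a tight loop and counts how often a reading is lower than the previous one. It can only show whether the symptom is reproducible from userland; it says nothing about the cause.

```python
import time


def count_backward_steps(read_clock, samples):
    """Read the clock samples+1 times and report backward steps.

    read_clock is any zero-argument callable returning seconds as a float.
    Returns (backward, worst): how many consecutive reading pairs went
    backwards, and the most negative delta seen (0.0 if none did).
    """
    backward = 0
    worst = 0.0
    prev = read_clock()
    for _ in range(samples):
        now = read_clock()
        delta = now - prev
        if delta < 0:
            # A later reading is smaller than an earlier one.
            backward += 1
            worst = min(worst, delta)
        prev = now
    return backward, worst


if __name__ == "__main__":
    # time.time() reads the wall clock; the sample count is arbitrary.
    count, worst = count_backward_steps(time.time, 1000000)
    print("backward steps: %d, worst delta: %.6f s" % (count, worst))
```

On a healthy system the count is normally zero over a short run, although an NTP or manual clock adjustment during the run can also produce a backward step, so a nonzero count should be checked against such adjustments.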