Date:      Fri, 13 Jul 2007 19:08:29 +0900
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        "Li-Lun Wang (Leland Wang)" <llwang@infor.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: threadlock and msk watchdog timeout
Message-ID:  <20070713100829.GC17801@cdnetworks.co.kr>
In-Reply-To: <20070713084325.GA47351@Athena.infor.org>
References:  <20070713084325.GA47351@Athena.infor.org>

On Fri, Jul 13, 2007 at 04:43:25PM +0800, Li-Lun Wang (Leland Wang) wrote:
 > -----BEGIN PGP SIGNED MESSAGE-----
 > Hash: SHA1
 > 
 > Hi,
 > 
 > After making world a couple of days ago, my msk(4) became very
 > unstable.  Under moderate network load, the interface hung and I
 > received
 > 
 > 	kernel: msk0: watchdog timeout (missed Tx interrupts) -- recovering
 > 
 > at least once every several minutes and 
 > 
 > 	kernel: msk0: Rx FIFO overrun!
 > 
 > occasionally.
 > 
 > It was so annoying that I took the trouble of binary searching the
 > kernel version to find the one that destabilized my msk(4).
 > 
 > The outcome of the search turned out to be strange.  Instead of
 > finding a date after which msk(4) became so very unstable, it *seemed*
 > that the older the kernel version the stabler msk(4) I got, and the
 > newer the kernel version the easier and more often msk(4) hung.
 > 
 > I managed to pin down that with the kernel as of 2007.06.04.12.00.00,
 > it seemed not to give me any msk watchdog timeout at all, and that
 > with the kernel as of 2007.06.05.12.00.00, msk(4) began to hang and
 > the watchdog began to time out once in a while.  There may be a later
 > commit that made my msk(4) even more unstable, but I am not sure about
 > this part as it is not easy to measure the level of "unstableness" of
 > the network.
 > 
 > It seems that the most significant commit between 2007.06.04.12.00.00
 > and 2007.06.05.12.00.00 was threadlock by jeff@.  I don't know why or
 > how it would affect msk(4), though.  I was using SCHED_SMP on a C2D,
 > but switched back to SCHED_ULE when I did the search.
 > 
 > I discovered a couple other funny phenomena during the search that may
 > also suggest this be related to threadlock.  One is that msk(4) seemed
 > to hang less frequently when the system was busy building world or
 > kernel.  The other thing is that I seemed to be able to help unhang
 > the interface by switching the input focus in X Window by moving my
 > mouse cursor to another window.
 > 
 > My result might not be accurate, though, as I only rebuilt the kernel,
 > not the whole world, when I did the search.
 > 

Does msk(4) use a shared interrupt?
Show me the output of "vmstat -i".
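
For reference, a shared interrupt normally shows up in "vmstat -i" as a
single irq line that names more than one device.  The following is only a
minimal Python sketch of that check, assuming the usual
"irqN: dev0 dev1 ... total rate" layout.  The function name shared_irqs and
the SAMPLE text are hypothetical illustrations, not output from the
reporter's machine.

```python
import re


def shared_irqs(vmstat_output):
    """Return {irq: [devices]} for irq lines that name more than one device."""
    shared = {}
    for line in vmstat_output.splitlines():
        # Assumed layout: "irq16: mskc0 uhci0    <total>    <rate>"
        m = re.match(r"^(irq\d+):\s+(.*?)\s+\d+\s+\d+\s*$", line)
        if not m:
            continue  # header, "cpu0: timer", "Total", or unknown format
        devices = m.group(2).split()
        if len(devices) > 1:
            shared[m.group(1)] = devices
    return shared


# Invented sample for illustration only; real device names and counts differ.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                         532          0
irq16: mskc0 uhci0                 91234         45
irq23: ehci0                           3          0
cpu0: timer                      4012345       2000
Total                            4104114       2045
"""

if __name__ == "__main__":
    print(shared_irqs(SAMPLE))
```

On a live system the text could be taken from
subprocess.run(["vmstat", "-i"], capture_output=True, text=True).stdout
and passed to the same function.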

-- 
Regards,
Pyun YongHyeon


