Date: Wed, 4 Oct 2006 12:31:54 +0200
From: Guy Brand <gb@isis.u-strasbg.fr>
To: freebsd-stable@freebsd.org
Subject: Re: CALL FOR TESTERS! [Re: 6.2 SHOWSTOPPER - em completely unusable on 6.2]
Message-ID: <20061004103154.GK1276@isis.u-strasbg.fr>
In-Reply-To: <20060930011904.GA62626@nowhere>
References: <451AA7B1.5080202@samsco.org>
 <20060927191402.GB932@turion.vk2pj.dyndns.org>
 <20060927210349.GG14975@tnn.dglawrence.com>
 <451AEB02.2090806@samsco.org>
 <002201c6e290$45ece980$b3db87d4@multiplay.co.uk>
 <451BD89F.8080203@samsco.org>
 <451C1F6D.2020302@mail.uni-mainz.de>
 <7.0.1.0.0.20060928152807.17bbe448@sentex.net>
 <451C271A.9040904@samsco.org>
 <20060930011904.GA62626@nowhere>

Craig Boston (craig@feniz.gank.org) on 29/09/2006 at 20:19 wrote:

> One thing this patch definitely did do though, is break the nvidia
> driver pretty badly. Couldn't keep the X server running for more than a
> minute before it froze solid. Lots of Xid: blah blah blah messages.
> Yes I remembered to rebuild the kernel module ;)

Hi,

Since rebuilding to 6.2-PRERELEASE

  FreeBSD 6.2-PRERELEASE #1: Mon Oct 2 15:24:04 CEST 2006 DEBUG i386

on a box having em sharing IRQ with nvidia (NVIDIA-FreeBSD-x86-1.0-8756):

  interrupt                          total       rate
  irq1: atkbd0                           5          0
  irq14: ata0                           47          0
  irq16: nvidia0 em+                 86545        185
  irq17: fwohci0                         7          0
  irq21: twe0                         6426         13
  cpu0: timer                       927735       1986
  Total                            1020765       2185

I freeze the box by starting firefox, which reloads a few tabs I keep
open in my session when under X. This is perfectly reproducible. From
the logs, first I see:

  Oct 2 16:47:39 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 00010597
  Oct 2 16:47:43 mojito kernel: NVRM: Xid (0001:00): 8, Channel 00000000
  Oct 2 16:47:47 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 00010598
  Oct 2 16:47:55 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 00010599
  Oct 2 16:48:03 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0001059a
  Oct 2 16:48:11 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0001059b
  Oct 2 16:48:19 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0001059c
  Oct 2 16:48:27 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0001059d
  Oct 2 16:48:35 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0001059e
  Oct 2 16:48:43 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0001059f
  Oct 2 16:48:52 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 000105a0

then come the watchdogs:

  Oct 2 16:48:56 mojito kernel: em0: watchdog timeout -- resetting
  Oct 2 16:48:56 mojito kernel: em0: link state changed to DOWN
  Oct 2 16:48:58 mojito kernel: em0: link state changed to UP
  Oct 2 16:49:00 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 000105a1
  Oct 2 16:49:06 mojito kernel: em0: watchdog timeout -- resetting
  Oct 2 16:49:06 mojito kernel: em0: link state changed to DOWN
  Oct 2 16:49:08 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 000105a2
  Oct 2 16:49:08 mojito kernel: em0: link state changed to UP
  Oct 2 16:49:16 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 000105a3
  Oct 2 16:49:16 mojito kernel: em0: watchdog timeout -- resetting
  Oct 2 16:49:16 mojito kernel: em0: link state changed to DOWN
  Oct 2 16:49:18 mojito kernel: em0: link state changed to UP
  Oct 2 16:49:24 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 000105a4
  Oct 2 16:49:26 mojito kernel: em0: watchdog timeout -- resetting
  Oct 2 16:49:26 mojito kernel: em0: link state changed to DOWN
  Oct 2 16:49:29 mojito kernel: em0: link state changed to UP
  Oct 2 16:49:32 mojito kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 000105a5
  Oct 2 16:49:36 mojito kernel: em0: watchdog timeout -- resetting
  Oct 2 16:49:36 mojito kernel: em0: link state changed to DOWN
  Oct 2 16:49:39 mojito kernel: em0: link state changed to UP
  Oct 2 16:49:47 mojito kernel: em0: watchdog timeout -- resetting
  Oct 2 16:49:47 mojito kernel: em0: link state changed to DOWN
  Oct 2 16:49:49 mojito kernel: em0: link state changed to UP

and the box ends up frozen less than a minute later. The traffic on the
Intel card can be low (pinging a host for a few dozen seconds), medium
(reloading a few pages in the tabs of Firefox) or high (downloading
several iso images from our local FTP mirror): whatever I do, if both
nvidia and em0 are used, the box freezes.

Note that I can't freeze the box when doing several simultaneous big
downloads or tarring up a lot of files but NOT running X. So I guess it
is a shared nvidia/em IRQ issue.

FreeBSD 6.1-STABLE #0: Fri Jun 23 17:00:43 CEST 2006 had no such problem.
The "DEBUG" kernconf is GENERIC + witness options enabled (but they do not
help in this case).
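
As a rough illustration of how the shared interrupt line can be spotted mechanically, here is a minimal Python sketch that scans the interrupt table quoted above (the layout printed by `vmstat -i`) for lines naming more than one device. The function name `shared_irqs` and the regular expression are assumptions made for this example only; the names are taken exactly as printed, including the trailing `+` on `em+`.

```python
import re

# Interrupt table copied from the message above.
VMSTAT_OUTPUT = """\
interrupt                          total       rate
irq1: atkbd0                           5          0
irq14: ata0                           47          0
irq16: nvidia0 em+                 86545        185
irq17: fwohci0                         7          0
irq21: twe0                         6426         13
cpu0: timer                       927735       1986
Total                            1020765       2185
"""

# "irqN: <names...> <total> <rate>"; the names field may hold several devices.
LINE_RE = re.compile(r"^(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)\s*$")


def shared_irqs(text):
    """Return {irq: [names]} for every IRQ line that lists more than one name."""
    shared = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # header, cpu timer and Total lines are skipped
        irq, names = match.group(1), match.group(2).split()
        if len(names) > 1:
            shared[irq] = names
    return shared


if __name__ == "__main__":
    for irq, names in shared_irqs(VMSTAT_OUTPUT).items():
        print(irq, "is shared by", ", ".join(names))
```

This only reports what the table already shows; it says nothing about which driver misbehaves on the shared line.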

I traced back to find which changeset introduced the trouble. The
results are:

  #*default release=cvs tag=RELENG_6 date=2006.06.23.17.00.00 # OK
  ...
  #*default release=cvs tag=RELENG_6 date=2006.08.08.09.12.56 # OK
  #
  #*default release=cvs tag=RELENG_6 date=2006.08.08.09.21.00 # BROKEN
  ...
  #*default release=cvs tag=RELENG_6 # BROKEN

From sys commitlogs the culprit commits are:

  glebius 2006-08-08 09:19:25 utc

    freebsd src repository

    modified files: (branch: releng_6)
      sys/dev/em if_em.c
    log:
    sync with head. this includes the following changes in
    chronological order:

    o a significant performance improvements. the interrupt handler
      schedules work to a private taskqueue. the em_rxeof() function
      runs lockless.
      rev. 1.98 - 1.101 by scottl.
      rev. 1.103 by mux
      rev. 1.106 by glebius, from andrey v. elsukov <bu7cher yandex.ru>
      rev. 1.116 by glebius
    o style cleanups:
      - rev. 1.102, 1.108, 1.109 by glebius
      - rev. 1.124 by pdeuskar
    o vendor merges:
      - merged with vendor driver version 5.1.5 by jack vogel.
        rev. 1.115 by glebius
      - merged with vendor driver version 6.0.5 by jack vogel.
        rev. 1.123 by glebius
    o various fixes:
      - invalid use of bus_dma_allocnow
        rev. 1.104 by scott, 1.121 by yongari
      - link state handling cleanup.
        rev. 1.110 by glebius
      - fix if_baudrate handling.
        rev. 1.111 by glebius
      - honor iff_drv_oactive in em_start_locked().
        rev. 1.117 by yongari
      - protect eeprom access with the driver lock.
        rev. 1.118 by yongari
      - fix link flap on siocgifaddr.
        rev. 1.119 by yongari
      - fix dma map handling in em_encap().
        rev. 1.120,1.122 by yongari

    revision   changes       path
    1.65.2.17  +1587 -1443   src/sys/dev/em/if_em.c

  glebius 2006-08-08 09:20:26 utc

    freebsd src repository

    modified files: (branch: releng_6)
      sys/dev/em license readme if_em.h if_em_hw.c if_em_hw.h
                 if_em_osdep.h
    log:
    sync with head, merging vendor drivers updates 5.1.5, 6.0.5
    by jack vogel.

    revision   changes       path
    1.3.2.1    +1 -1         src/sys/dev/em/license
    1.10.2.1   +71 -30       src/sys/dev/em/readme
    1.32.2.3   +133 -157     src/sys/dev/em/if_em.h
    1.16.2.2   +3186 -906    src/sys/dev/em/if_em_hw.c
    1.15.2.3   +712 -48      src/sys/dev/em/if_em_hw.h
    1.14.2.2   +46 -15       src/sys/dev/em/if_em_osdep.h

I confirmed that by building a kernel from 2006.08.08.09.21.00, which
shows the problem, and a kernel from 2006.08.08.09.18.00, which works
like a charm.

Dunno if this could be linked to the em* watchdogs reported in this
thread. Let me know if I can do something useful to help fix this
issue.

-- 
bug
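
The date-by-date narrowing described in the message amounts to a binary search over checkout timestamps. The Python sketch below shows the idea only; it is not the procedure actually used. `bisect_dates` and the `is_broken` callback are hypothetical names: in practice the callback would mean "check out RELENG_6 as of that date, build and boot the kernel, and see whether the box freezes". The starting "bad" date and the stand-in predicate are assumptions chosen so the example runs on its own.

```python
from datetime import datetime, timedelta

FMT = "%Y.%m.%d.%H.%M.%S"  # same layout as the date= fields in the supfile lines


def bisect_dates(good, bad, is_broken, resolution=timedelta(minutes=5)):
    """Narrow a (good, bad) pair of timestamps until they are close together.

    Assumes is_broken(good) is False and is_broken(bad) is True, and that
    the tree stays broken once it becomes broken.
    """
    while bad - good > resolution:
        mid = good + (bad - good) / 2
        if is_broken(mid):
            bad = mid   # the breakage is at or before mid
        else:
            good = mid  # the breakage is after mid
    return good, bad


if __name__ == "__main__":
    good = datetime.strptime("2006.06.23.17.00.00", FMT)  # known-good date from the message
    bad = datetime.strptime("2006.10.02.00.00.00", FMT)   # assumed "bad" starting point
    # Stand-in for real kernel testing: pretend every tree from the first
    # em commit onwards shows the freeze.
    first_commit = datetime.strptime("2006.08.08.09.19.25", FMT)
    lo, hi = bisect_dates(good, bad, lambda ts: ts >= first_commit)
    print("last OK:", lo.strftime(FMT), "first BROKEN:", hi.strftime(FMT))
```

Each probe costs a full kernel build and reboot, so the halving matters: a window of a few months shrinks to a few minutes in roughly fifteen probes.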