Date: Fri, 12 Dec 2014 19:34:47 +0000
From: Roger Leigh <rleigh@codelibre.net>
To: stable@FreeBSD.org
Subject: Hard system lockups with 10.1, probably drm/newcons/radeonkms-related
Message-ID: <20141212193447.GA1657@codelibre.net>
[-- Attachment #1 --]
Hi folks,

With 10.1-RELEASE, I've enabled newcons at boot with kern.vty="vt" in
loader.conf. With the latest Xorg/drm installed with pkg, I'm seeing
intermittent hangs and hard lockups of the system. I've included the
logs for one which recovered earlier today, but later on it just locked
up completely and I don't have logs for that since I had to do a hard
reset.

I had to install and enable hal+dbus to get a working keyboard and mouse
when running X, despite both working fine on the console!

Not sure what the trigger is. Possibly also related to input. The
first hard hang was after logging in with "mwm" via kdm4. It didn't
start mwm, so I ran "mwm&" in the xterm; it locked up when I clicked and
dragged the window title, i.e. when initiating the drag event. The
second hang was while typing into a tmux session inside a konsole
window. Nothing particularly special happening at the moment it locked
up.

I'm happy to do further debugging, but given that it locks up the whole
system, I'm not sure how to go about getting any useful information at
that point.

The graphics card is an AMD Radeon HD 6800 Series using /dev/dri/card0.
Starting X11 automatically loads the needed modules:

# kldstat
Id Refs Address            Size     Name
 1   59 0xffffffff80200000 1755658  kernel
 2    1 0xffffffff81956000 267f48   zfs.ko
 3    2 0xffffffff81bbe000 6780     opensolaris.ko
 4    1 0xffffffff81c11000 2b58     uhid.ko
 5    1 0xffffffff81c14000 357f     ums.ko
 6    2 0xffffffff81c18000 28c0     vboxnetflt.ko
 7    2 0xffffffff81c1b000 b998     netgraph.ko
 8    2 0xffffffff81c27000 434c0    vboxdrv.ko
 9    1 0xffffffff81c6b000 40a7     ng_ether.ko
10    1 0xffffffff81c70000 3ec0     vboxnetadp.ko
11    1 0xffffffff81c74000 11a57a   radeonkms.ko
12    1 0xffffffff81d8f000 47f80    drm2.ko
13    4 0xffffffff81dd7000 1ff2     iicbus.ko
14    1 0xffffffff81dd9000 1a46     iic.ko
15    1 0xffffffff81ddb000 1e48     iicbb.ko
16    1 0xffffffff81ddd000 18f3     radeonkmsfw_BARTS_pfp.ko
17    1 0xffffffff81ddf000 1ce8     radeonkmsfw_BARTS_me.ko
18    1 0xffffffff81de1000 136f     radeonkmsfw_BTC_rlc.ko
19    1 0xffffffff81de3000 6585     radeonkmsfw_BARTS_mc.ko

Kernel log for the recoverable hang:

Dec 12 13:23:23 sorilea kernel: drmn0: error: GPU lockup CP stall for more than 10000msec
Dec 12 13:23:23 sorilea kernel: drmn0: warning: GPU lockup (waiting for 0x0000000000087184 last fence id 0x0000000000087177)
Dec 12 13:23:23 sorilea kernel: drmn0: info: Saved 407 dwords of commands on ring 0.
Dec 12 13:23:23 sorilea kernel: drmn0: info: GPU softreset: 0x00000003
Dec 12 13:23:23 sorilea kernel: drmn0: info: GRBM_STATUS = 0xA0003828
Dec 12 13:23:23 sorilea kernel: drmn0: info: GRBM_STATUS_SE0 = 0x00000007
Dec 12 13:23:23 sorilea kernel: drmn0: info: GRBM_STATUS_SE1 = 0x00000007
Dec 12 13:23:23 sorilea kernel: drmn0: info: SRBM_STATUS = 0x200000C0
Dec 12 13:23:23 sorilea kernel: drmn0: info: R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 12 13:23:23 sorilea kernel: drmn0: info: R_008678_CP_STALLED_STAT2 = 0x00010100
Dec 12 13:23:23 sorilea kernel: drmn0: info: R_00867C_CP_BUSY_STAT = 0x00020182
Dec 12 13:23:23 sorilea kernel: drmn0: info: R_008680_CP_STAT = 0x80038243
Dec 12 13:23:23 sorilea kernel: drmn0: info: GRBM_SOFT_RESET=0x00007F6B
Dec 12 13:23:23 sorilea kernel: drmn0: info: GRBM_STATUS = 0x00003828
Dec 12 13:23:23 sorilea kernel: drmn0: info: GRBM_STATUS_SE0 = 0x00000007
Dec 12 13:23:23 sorilea kernel: drmn0: info: GRBM_STATUS_SE1 = 0x00000007
Dec 12 13:23:23 sorilea kernel: drmn0: info: SRBM_STATUS = 0x200000C0
Dec 12 13:23:23 sorilea kernel: drmn0: info: R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 12 13:23:23 sorilea kernel: drmn0: info: R_008678_CP_STALLED_STAT2 = 0x00000000
Dec 12 13:23:23 sorilea kernel: drmn0: info: R_00867C_CP_BUSY_STAT = 0x00000000
Dec 12 13:23:23 sorilea kernel: drmn0: info: R_008680_CP_STAT = 0x00000000
Dec 12 13:23:23 sorilea kernel: drmn0: info: GPU reset succeeded, trying to resume
Dec 12 13:23:23 sorilea kernel: info: [drm] probing gen 2 caps for device 1002:5a16 = 2/0
Dec 12 13:23:23 sorilea kernel: info: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
Dec 12 13:23:23 sorilea kernel: info: [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
Dec 12 13:23:23 sorilea kernel: drmn0: info: WB enabled
Dec 12 13:23:23 sorilea kernel: drmn0: info: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0x0xfffff8007e940c00
Dec 12 13:23:23 sorilea kernel: drmn0: info: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0x0xfffff8007e940c0c
Dec 12 13:23:23 sorilea kernel: info: [drm] ring test on 0 succeeded in 4 usecs
Dec 12 13:23:23 sorilea kernel: info: [drm] ring test on 3 succeeded in 2 usecs
Dec 12 13:23:33 sorilea kernel: drmn0: error: GPU lockup CP stall for more than 10000msec
Dec 12 13:23:33 sorilea kernel: drmn0: warning: GPU lockup (waiting for 0x0000000000087185 last fence id 0x0000000000087177)
Dec 12 13:23:33 sorilea kernel: error: [drm:pid939:r600_ib_test] *ERROR* radeon: fence wait failed (-11).
Dec 12 13:23:33 sorilea kernel: error: [drm:pid939:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-11).
Dec 12 13:23:33 sorilea kernel: drmn0: error: ib ring test failed (-11).
Dec 12 13:23:33 sorilea kernel: drmn0: info: GPU softreset: 0x00000003
Dec 12 13:23:33 sorilea kernel: drmn0: info: GRBM_STATUS = 0xA0003828
Dec 12 13:23:33 sorilea kernel: drmn0: info: GRBM_STATUS_SE0 = 0x00000007
Dec 12 13:23:33 sorilea kernel: drmn0: info: GRBM_STATUS_SE1 = 0x00000007
Dec 12 13:23:33 sorilea kernel: drmn0: info: SRBM_STATUS = 0x200000C0
Dec 12 13:23:33 sorilea kernel: drmn0: info: R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 12 13:23:33 sorilea kernel: drmn0: info: R_008678_CP_STALLED_STAT2 = 0x00004100
Dec 12 13:23:33 sorilea kernel: drmn0: info: R_00867C_CP_BUSY_STAT = 0x00020182
Dec 12 13:23:33 sorilea kernel: drmn0: info: R_008680_CP_STAT = 0x80028243
Dec 12 13:23:33 sorilea kernel: drmn0: info: GRBM_SOFT_RESET=0x00007F6B
Dec 12 13:23:33 sorilea kernel: drmn0: info: GRBM_STATUS = 0x00003828
Dec 12 13:23:33 sorilea kernel: drmn0: info: GRBM_STATUS_SE0 = 0x00000007
Dec 12 13:23:33 sorilea kernel: drmn0: info: GRBM_STATUS_SE1 = 0x00000007
Dec 12 13:23:33 sorilea kernel: drmn0: info: SRBM_STATUS = 0x200000C0
Dec 12 13:23:33 sorilea kernel: drmn0: info: R_008674_CP_STALLED_STAT1 = 0x00000000
Dec 12 13:23:33 sorilea kernel: drmn0: info: R_008678_CP_STALLED_STAT2 = 0x00000000
Dec 12 13:23:33 sorilea kernel: drmn0: info: R_00867C_CP_BUSY_STAT = 0x00000000
Dec 12 13:23:33 sorilea kernel: drmn0: info: R_008680_CP_STAT = 0x00000000
Dec 12 13:23:33 sorilea kernel: drmn0: info: GPU reset succeeded, trying to resume
Dec 12 13:23:33 sorilea kernel: info: [drm] probing gen 2 caps for device 1002:5a16 = 2/0
Dec 12 13:23:33 sorilea kernel: info: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
Dec 12 13:23:33 sorilea kernel: info: [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
Dec 12 13:23:33 sorilea kernel: drmn0: info: WB enabled
Dec 12 13:23:33 sorilea kernel: drmn0: info: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0x0xfffff8007e940c00
Dec 12 13:23:33 sorilea kernel: drmn0: info: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0x0xfffff8007e940c0c
Dec 12 13:23:33 sorilea kernel: info: [drm] ring test on 0 succeeded in 4 usecs
Dec 12 13:23:33 sorilea kernel: info: [drm] ring test on 3 succeeded in 2 usecs
Dec 12 13:23:33 sorilea kernel: info: [drm] ib test on ring 0 succeeded in 0 usecs
Dec 12 13:23:33 sorilea kernel: info: [drm] ib test on ring 3 succeeded in 1 usecs

It worked perfectly for 5 hours after this recovery.

Thanks all,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux    http://people.debian.org/~rleigh/
 `. `'   schroot and sbuild  http://alioth.debian.org/projects/buildd-tools
   `-    GPG Public Key      F33D 281D 470A B443 6756 147C 07B3 C8BC 4083 E800

[-- Attachment #2 --]
[xz-compressed binary attachment; contents not shown]
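For anyone comparing the lockup events in a kernel log like the one quoted in this message, here is a minimal Python sketch. It is not from the original post. It assumes the syslog layout seen in the excerpt, with each message on a single line. The function name summarize_gpu_lockups, the regular expressions, and the SAMPLE_LOG excerpt are illustrative choices and do not come from any FreeBSD tool.

```python
import re

# Syslog prefix as seen in the excerpt, e.g.
# "Dec 12 13:23:23 sorilea kernel: drmn0: error: GPU lockup CP stall ..."
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)\s+(\S+)\s+kernel:\s+(.*)$")
# Fence IDs reported in the "GPU lockup (waiting for ...)" warning.
FENCE_RE = re.compile(
    r"waiting for (0x[0-9a-fA-F]+) last fence id (0x[0-9a-fA-F]+)"
)

# A few lines copied from the log in the message (hypothetical selection).
SAMPLE_LOG = """\
Dec 12 13:23:23 sorilea kernel: drmn0: error: GPU lockup CP stall for more than 10000msec
Dec 12 13:23:23 sorilea kernel: drmn0: warning: GPU lockup (waiting for 0x0000000000087184 last fence id 0x0000000000087177)
Dec 12 13:23:23 sorilea kernel: drmn0: info: GPU reset succeeded, trying to resume
Dec 12 13:23:33 sorilea kernel: drmn0: error: GPU lockup CP stall for more than 10000msec
Dec 12 13:23:33 sorilea kernel: drmn0: warning: GPU lockup (waiting for 0x0000000000087185 last fence id 0x0000000000087177)
Dec 12 13:23:33 sorilea kernel: drmn0: info: GPU reset succeeded, trying to resume
"""


def summarize_gpu_lockups(lines):
    """Return one dict per 'GPU lockup CP stall' event found in lines."""
    events = []
    current = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a kernel syslog line in the assumed layout
        stamp, _host, msg = match.groups()
        if "GPU lockup CP stall" in msg:
            # A new stall report starts a new event.
            current = {"time": stamp, "waiting": None, "last": None,
                       "reset_ok": False}
            events.append(current)
        elif current is not None:
            fence = FENCE_RE.search(msg)
            if fence:
                current["waiting"] = int(fence.group(1), 16)
                current["last"] = int(fence.group(2), 16)
            elif "GPU reset succeeded" in msg:
                current["reset_ok"] = True
    return events


if __name__ == "__main__":
    for event in summarize_gpu_lockups(SAMPLE_LOG.splitlines()):
        print(event)
```

The sketch only groups lines by stall report. It does not interpret the GRBM/SRBM register values, which would require the driver's register definitions.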
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20141212193447.GA1657>
