From owner-freebsd-current@FreeBSD.ORG Wed Jul 22 11:39:46 2009
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 52AB21065672
    for ; Wed, 22 Jul 2009 11:39:46 +0000 (UTC)
    (envelope-from adamk@voicenet.com)
Received: from QMTA06.westchester.pa.mail.comcast.net
    (qmta06.westchester.pa.mail.comcast.net [76.96.62.56])
    by mx1.freebsd.org (Postfix) with ESMTP id E1ED18FC20
    for ; Wed, 22 Jul 2009 11:39:45 +0000 (UTC)
    (envelope-from adamk@voicenet.com)
Received: from OMTA09.westchester.pa.mail.comcast.net ([76.96.62.20])
    by QMTA06.westchester.pa.mail.comcast.net with comcast
    id JymQ1c0060SCNGk56zfm3s; Wed, 22 Jul 2009 11:39:46 +0000
Received: from [192.168.5.101] ([68.45.151.98])
    by OMTA09.westchester.pa.mail.comcast.net with comcast
    id Jzfl1c00327dlBY3VzflLd; Wed, 22 Jul 2009 11:39:46 +0000
From: Adam K Kirchhoff
To: "Paul B. Mahol"
In-Reply-To: <3a142e750907220413o1afff523s83c03d5f7ca0c044@mail.gmail.com>
References: <4A5D27F2.50208@voicenet.com>
    <200907201803.32053.gnemmi@gmail.com>
    <3a142e750907210146u2ce72cadhbdaa71a89be54607@mail.gmail.com>
    <200907212034.04853.gnemmi@gmail.com>
    <3a142e750907220413o1afff523s83c03d5f7ca0c044@mail.gmail.com>
Content-Type: text/plain
Date: Wed, 22 Jul 2009 07:39:44 -0400
Message-Id: <1248262784.1724.1.camel@sorrow.ashke.com>
Mime-Version: 1.0
X-Mailer: Evolution 2.26.3 FreeBSD GNOME Team Port
Content-Transfer-Encoding: 7bit
Cc: freebsd-current@freebsd.org, Gonzalo Nemmi
Subject: Re: bge problems when resuming
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 22 Jul 2009 11:39:46 -0000

On Wed, 2009-07-22 at 13:13 +0200, Paul B. Mahol wrote:
> On 7/22/09, Gonzalo Nemmi wrote:
> > On Tuesday 21 July 2009 5:46:10 am Paul B. Mahol wrote:
> >> On 7/20/09, Gonzalo Nemmi wrote:
> >> > On Sunday 19 July 2009 7:53:52 pm Paul B. Mahol wrote:
> >> >> On 7/20/09, Gonzalo Nemmi wrote:
> >> >> > On Sat, Jul 18, 2009 at 12:09 AM, Paul B. Mahol wrote:
> >> >> >> On 7/17/09, Gonzalo Nemmi wrote:
> >> >> >> > On Wednesday 15 July 2009 8:13:47 am Adam K Kirchhoff wrote:
> >> >> >> >> On Wednesday 15 July 2009 03:20:45 Paul B. Mahol wrote:
> >> >> >> >> > On 7/15/09, Adam K Kirchhoff wrote:
> >> >> >> >> > > Hello all,
> >> >> >> >> > >
> >> >> >> >> > > I have a Dell Latitude D610 laptop with 8.0-BETA1
> >> >> >> >> > > installed. I hadn't tried suspend/resume for a while
> >> >> >> >> > > and decided to give it a shot. I was pleasantly
> >> >> >> >> > > surprised to see that I could suspend to ram, resume,
> >> >> >> >> > > and have a (relatively) working system (previously the
> >> >> >> >> > > display would never come back up and the serial console
> >> >> >> >> > > I had hooked up remained dead). Great job to everyone
> >> >> >> >> > > who helped make that possible.
> >> >> >> >> > >
> >> >> >> >> > > The only real issue that I seem to have now is that bge
> >> >> >> >> > > is completely unusable after resume. Another individual
> >> >> >> >> > > seems to have reported similar problems with bge and
> >> >> >> >> > > resume, but he also had other issues that apparently
> >> >> >> >> > > trumped his networking issues:
> >> >> >> >> > >
> >> >> >> >> > > http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009023.html
> >> >> >> >> > >
> >> >> >> >> > > Like him, resuming from suspend gives me:
> >> >> >> >> > >
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 0, val 32768)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY read timed out
> >> >> >> >> > > (phy 1, reg 0, val 0xffffffff)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 24, val 3072)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 23, val 10)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 21, val 12555)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 23, val 8223)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 21, val 38150)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 23, val 16415)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 21, val 5346)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 24, val 1024)
> >> >> >> >> > > Jul 14 12:35:53 scroll kernel: bge0: PHY write timed out
> >> >> >> >> > > (phy 1, reg 24, val 7)
> >> >> >> >> > >
> >> >> >> >> > > And so on and so forth.
> >> >> >> >> > >
> >> >> >> >> > > I thought that compiling if_bge as a module, unloading
> >> >> >> >> > > it before suspend, and reloading it after resume, might
> >> >> >> >> > > get this working. However, doing a "kldload if_bge"
> >> >> >> >> > > after the resume does nothing. Well, the module gets
> >> >> >> >> > > loaded, but the device doesn't show up. No errors from
> >> >> >> >> > > kldload, and there is nothing new in dmesg.
> >> >> >> >> > >
> >> >> >> >> > > Before the suspend, the device shows up as:
> >> >> >> >> > >
> >> >> >> >> > > bge0@pci0:2:0:0: class=0x020000 card=0x01821028
> >> >> >> >> > > chip=0x167714e4 rev=0x01 hdr=0x00
> >> >> >> >> > > vendor = 'Broadcom Corporation'
> >> >> >> >> > > device = 'NetXtreme Gigabit Ethernet PCI Express
> >> >> >> >> > > (BCM5750A1)' class = network
> >> >> >> >> > > subclass = ethernet
> >> >> >> >> > >
> >> >> >> >> > > After resuming, and reloading the module, it's:
> >> >> >> >> > >
> >> >> >> >> > > none1@pci0:2:0:0: class=0x020000 card=0x01821028
> >> >> >> >> > > chip=0x167714e4 rev=0x01 hdr=0x00
> >> >> >> >> > > vendor = 'Broadcom Corporation'
> >> >> >> >> > > device = 'NetXtreme Gigabit Ethernet PCI Express
> >> >> >> >> > > (BCM5750A1)' class = network
> >> >> >> >> > > subclass = ethernet
> >> >> >> >> > >
> >> >> >> >> > > If there are no ideas, I'll go ahead and open up a PR.
> >> >> >> >> > > I assume this is just one bug, since both problems (the
> >> >> >> >> > > PHY issues and the inability to reload the driver) are
> >> >> >> >> > > related to the network device.
> >> >> >> >> >
> >> >> >> >> > Put these lines into loader.conf and reboot.
> >> >> >> >> >
> >> >> >> >> > hw.pci.do_power_nodriver="3"
> >> >> >> >> > hw.pci.do_power_resume="1"
> >> >> >> >> >
> >> >> >> >> > Now, before suspend, unload if_bge and some other driver
> >> >> >> >> > (sound drivers are the best candidates) and load the sound
> >> >> >> >> > driver again, suspend and resume.
> >> >> >> >> > Now loading if_bge should make it successfully attach.
> >> >> >> >>
> >> >> >> >> Unfortunately, after doing this, reloading the if_bge driver
> >> >> >> >> causes the laptop to completely lock up... It gets as far
> >> >> >> >> as:
> >> >> >> >>
> >> >> >> >> bge0:
> >> >> >> >> unknown ASIC rev. 0xffff>
> >> >> >> >> mem 0xdfdf0000-0xdfdfffff irq 16 at device 0.0 on pci2
> >> >> >> >>
> >> >> >> >> And then the entire machine hangs. I'm on ttyv0, so I'd see
> >> >> >> >> any kernel panic, but nothing like that happens. The screen
> >> >> >> >> stays on, but nothing else happens till I force a reboot.
> >> >> >> >>
> >> >> >> >> Adam
> >> >> >> >
> >> >> >> > Hi Adam, Paul ...
> >> >> >> > I'm the "another individual" from your OP.
> >> >> >> > I have the same problems you have regarding bge, but they
> >> >> >> > weren't trumped .. I just had an order of priorities ;)
> >> >> >> >
> >> >> >> > Anyways, I tried the solution Paul posted and, just as in
> >> >> >> > your case, I got a hard lock too ...
> >> >> >> >
> >> >> >> > I tried loading if_bge through /boot/loader.conf
> >> >> >> > Then issued a:
> >> >> >> >
> >> >> >> > kldunload if_bge coretemp
> >> >> >>
> >> >> >> coretemp is the wrong module; it must be one of the modules
> >> >> >> that attach to pci.
> >> >> >
> >> >> > Sorry Paul!
> >> >> > I gave it a go with snd_hda and I got the same result except
> >> >> > that this time I also got the following message:
> >> >>
> >> >> After unloading snd_hda you loaded it again before suspending?
> >> >
> >> > Doing so yielded a Fatal trap 12 on BETA2.
> >> > Yesterday I installed
> >> > BETA2 and here are the results:
> >> >
> >> >
> >> > kldstat
> >> >
> >> > Id Refs Address    Size     Name
> >> >  1   28 0xc0400000 cf6c70   kernel
> >> >  2    1 0xc10f7000 11bc0    if_bge.ko
> >> >  3    1 0xc1109000 1ac4c    snd_hda.ko
> >> >  4    2 0xc1124000 61f78    sound.ko
> >> >  5    1 0xc1186000 2af4     coretemp.ko
> >> >  6    1 0xc1189000 a6d8     i915.ko
> >> >  7    2 0xc1194000 177d4    drm.ko
> >> >
> >> >
> >> > kldunload if_bge snd_hda
> >> >
> >> > Jul 20 17:50:49 gargoyle login: ROOT LOGIN (root) ON ttyv0
> >> > Jul 20 17:51:06 gargoyle kernel: brgphy0: detached
> >> > Jul 20 17:51:06 gargoyle kernel: lock order reversal:
> >> > Jul 20 17:51:06 gargoyle kernel: 1st 0xc0dba45c kernel linker
> >> > (kernel linker) @ /usr/src/sys/kern/kern_linker.c:1079
> >> > Jul 20 17:51:06 gargoyle kernel: 2nd 0xc0dbbc64 sysctl lock (sysctl
> >> > lock) @ /usr/src/sys/kern/kern_sysctl.c:257
> >> > Jul 20 17:51:06 gargoyle kernel: KDB: stack backtrace:
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > db_trace_self_wrapper(c0c6baf4,e6daba34,c08bc995,c08ad6db,c0c6e989,
> >> >...) at db_trace_self_wrapper+0x26
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > kdb_backtrace(c08ad6db,c0c6e989,c452bc88,c4529e10,e6daba90,...) at
> >> > kdb_backtrace+0x29
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > _witness_debugger(c0c6e989,c0dbbc64,c0c69667,c4529e10,c0c6956e,...)
> >> > at _witness_debugger+0x25
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > witness_checkorder(c0dbbc64,9,c0c6956e,101,0,...) at
> >> > witness_checkorder+0x839
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > _sx_xlock(c0dbbc64,0,c0c6956e,101,c4722c00,...) at _sx_xlock+0x85
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > sysctl_ctx_free(c4722c4c,c4722c00,e6dabb18,c08a3c85,c4722c00,...)
> >> > at sysctl_ctx_free+0x30
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > device_sysctl_fini(c4722c00,0,c0d4c848,c472a810,c4ab3400,...) at
> >> > device_sysctl_fini+0x1a
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > device_detach(c4722c00,c4722b80,e6dabb38,c06bc622,c4722b80,...) at
> >> > device_detach+0x1f5
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > bus_generic_detach(c4722b80,c4722b80,e6dabb64,c08a3b1c,c4722b80,...
> >> >) at bus_generic_detach+0x29
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > miibus_detach(c4722b80,c45d6060,c0d4ca68,a3c,c0c76f47,...) at
> >> > miibus_detach+0x12
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > device_detach(c4722b80,c472b008,e6dabb98,c10ff7ff,c4722300,...) at
> >> > device_detach+0x8c
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > bus_generic_detach(c4722300,1,c1104b66,aec,c4722300,...) at
> >> > bus_generic_detach+0x29
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > bge_detach(c4722300,c4677060,c0d4ca68,a3c,c4526300,...) at
> >> > bge_detach+0xbf
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > device_detach(c4722300,c086c843,c0dbb570,c1106c20,c456fb80,...) at
> >> > device_detach+0x8c
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > driver_module_handler(c4526300,1,c1106c20,109,0,...) at
> >> > driver_module_handler+0x29c
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > module_unload(c4526300,c0c652ef,273,270,c08604b6,...) at
> >> > module_unload+0x43
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > linker_file_unload(c4544200,0,c0c652ef,437,c10f7000,...) at
> >> > linker_file_unload+0x15e
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > kern_kldunload(c4b346c0,2,0,e6dabd2c,c0ba8dd3,...) at
> >> > kern_kldunload+0xd5
> >> > Jul 20 17:51:06 gargoyle kernel:
> >> > kldunloadf(c4b346c0,e6dabcf8,8,c0c6fa4b,c0d50450,...) at
> >> > kldunloadf+0x2b
> >> > Jul 20 17:51:06 gargoyle kernel: syscall(e6dabd38) at syscall+0x2a3
> >> > Jul 20 17:51:06 gargoyle kernel: Xint0x80_syscall() at
> >> > Xint0x80_syscall+0x20
> >> > Jul 20 17:51:06 gargoyle kernel: --- syscall (444, FreeBSD ELF32,
> >> > kldunloadf), eip = 0x280d516b, esp = 0xbfbfe47c, ebp = 0xbfbfecc8
> >> > --- Jul 20 17:51:06 gargoyle kernel: miibus0: detached
> >> > Jul 20 17:51:06 gargoyle kernel: bge0: detached
> >> > Jul 20 17:51:06 gargoyle kernel: sysctl_unregister_oid: failed to
> >> > unregister sysctl
> >>
> >> The if_bge driver looks very problematic to me. Probably it cannot
> >> detach at all.
> >>
> >> > Jul 20 17:51:06 gargoyle kernel: pcm0: detached
> >> > Jul 20 17:51:06 gargoyle kernel: hdac0: detached
> >> >
> >> >
> >> > kld snd_hda
> >>
> >> ^^^
> >> You mean kldload.
> >>
> >> > Jul 20 17:52:16 gargoyle kernel: hdac0:
> >> > Definition Audio Controller> mem 0xf6dfc000-0xf6dfffff irq 21 at
> >> > device 27.0 on pci0
> >> > Jul 20 17:52:16 gargoyle kernel: hdac0: HDA Driver Revision:
> >> > 20090624_0136
> >> > Jul 20 17:52:16 gargoyle kernel: hdac0: [ITHREAD]
> >> > Jul 20 17:52:16 gargoyle kernel: hdac0: HDA Codec #0: Sigmatel
> >> > STAC9228X Jul 20 17:52:16 gargoyle kernel: bge0:
> >> > A2, ASIC rev. 0xc002> mem 0xf69f0000-0xf69fffff irq 17 at device
> >> > 0.0 on pci9 Jul 20 17:52:16 gargoyle kernel: miibus0: on
> >> > bge0 Jul 20 17:52:16 gargoyle kernel: brgphy0:
> >> > 10/100baseTX PHY> PHY 1 on miibus0
> >> > Jul 20 17:52:16 gargoyle kernel: brgphy0: 10baseT, 10baseT-FDX,
> >> > 100baseTX, 100baseTX-FDX, auto
> >> > Jul 20 17:52:16 gargoyle kernel: bge0: Ethernet address:
> >> > 00:23:ae:04:ba:ca
> >> > Jul 20 17:52:16 gargoyle kernel: bge0: [ITHREAD]
> >> > Jul 20 17:52:16 gargoyle kernel: pcm0:
> >> > #0 Analog> at cad 0 nid 1 on hdac0
> >> > Jul 20 17:52:16 gargoyle kernel: bge0: link state changed to DOWN
> >> > Jul 20 17:52:18 gargoyle kernel: bge0: link state changed to UP
> >>
> >> Why did bge0 appear again?
> >>
> >> > acpiconf -s 3
> >>
> >> After this command bge0 should not appear at all because it should
> >> not be attached to the device.
> >>
> >> > Jul 20 17:53:51 gargoyle acpi: suspend at 20090720 17:53:51
> >> > Jul 20 17:53:56 gargoyle kernel: fwohci0: fwohci_pci_suspend
> >> > Jul 20 17:54:25 gargoyle kernel: bge0: PHY write timed out (phy 1,
> >> > reg 0, val 32768)
> >> > Jul 20 17:54:25 gargoyle kernel: bge0: PHY read timed out (phy 1,
> >> > reg 0, val 0xffffffff)
> >> > Jul 20 17:54:25 gargoyle kernel: bge0: PHY read timed out (phy 1,
> >> > reg 24, val 0xffffffff)
> >> > Jul 20 17:54:25 gargoyle kernel: bge0: PHY read timed out (phy 1,
> >> > reg 16, val 0xffffffff)
> >> > Jul 20 17:54:25 gargoyle kernel: bge0: PHY write timed out (phy 1,
> >> > reg 16, val 0)
> >> > Jul 20 17:54:25 gargoyle kernel: bge0: PHY read timed out (phy 1,
> >> > reg 16, val 0xffffffff)
> >> > Jul 20 17:54:25 gargoyle kernel: bge0: PHY write timed out (phy 1,
> >> > reg 16, val 0)
> >> > Jul 20 17:54:25 gargoyle kernel: bge0: PHY write timed out (phy 1,
> >> > reg 23, val 18)
> >> > Jul 20 17:54:25 gargoyle kernel: bge0: flow-through queue init
> >> > failed Jul 20 17:54:25 gargoyle kernel: bge0: initialization
> >> > failure Jul 20 17:54:25 gargoyle kernel: fwohci0: Phy 1394a
> >> > available S400, 1 ports.
> >> > Jul 20 17:54:25 gargoyle kernel: fwohci0: Link S400, max_rec 2048
> >> > bytes. Jul 20 17:54:25 gargoyle kernel: fwohci0: Initiate bus reset
> >> > Jul 20 17:54:25 gargoyle kernel: fwohci0: fwohci_intr_core: BUS
> >> > reset Jul 20 17:54:25 gargoyle kernel: fwohci0: fwohci_intr_core:
> >> > node_id=0x00000000, SelfID Count=1, CYCLEMASTER mode
> >> > Jul 20 17:54:25 gargoyle kernel: firewire0: 1 nodes, maxhop <= 0
> >> > cable IRM irm(0) (me)
> >> > Jul 20 17:54:25 gargoyle kernel: firewire0: bus manager 0
> >> > Jul 20 17:54:25 gargoyle kernel: fwohci0: unrecoverable error
> >> > Jul 20 17:54:25 gargoyle kernel: wakeup from sleeping state (slept
> >> > 00:00:29)
> >> > Jul 20 17:54:25 gargoyle acpi: resumed at 20090720 17:54:25
> >> >
> >> > Should a PR on fwohci and firewire also be filed??
> >>
> >> Try a custom kernel with as few drivers as possible.
> >> (use modules instead)
> >> From your mail I don't see where the problem with firewire is.
> >
> > Done.
> >
> > Commented if_bge out of GENERIC, recompiled, loaded if_bge via
> > loader.conf, kldunloaded if_bge snd_hda, kldloaded snd_hda (if_bge did
> > not show up on dmesg this time), went to sleep (acpiconf -s 3),
> > resumed, no bge timeouts (only fwohci and firewire messages), then
> > kldloaded if_bge and got a solid freeze :(
>
> Does kldload of if_bge work after boot? (remove if_bge_load="YES"
> from /boot/loader.conf
> and load it after boot)
> Does kldload and kldunload and kldload again of if_bge work (without
> suspending the machine this time)?

I can boot up without if_bge loaded and then kldload and kldunload
if_bge repeatedly at least 6 times each without any lockups if
suspending is not involved (I stopped after the 6th time).

Adam
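
For anyone who wants to repeat the no-suspend test, the load/unload cycle can be sketched roughly as below. This is only an illustration, not something taken from the thread: the MODULE, CYCLES and RUN variables and the cycle_module function are hypothetical names, and RUN defaults to "echo" so that the script just prints the kldload/kldunload commands instead of running them.

```shell
#!/bin/sh
# Sketch: load and unload a kernel module several times, no suspend involved.
# MODULE, CYCLES and RUN are illustrative names, not from the thread.
MODULE="${MODULE:-if_bge}"   # module to cycle
CYCLES="${CYCLES:-6}"        # number of load/unload rounds
RUN="${RUN-echo}"            # dry run by default; set RUN= to really execute

cycle_module() {
    i=1
    while [ "$i" -le "$CYCLES" ]; do
        # Stop at the first failure so the failing round is obvious.
        $RUN kldload "$MODULE" || return 1
        $RUN kldunload "$MODULE" || return 1
        i=$((i + 1))
    done
}

cycle_module
```

Running it as root with an empty RUN (for example "RUN= sh cycle.sh", where cycle.sh is just a placeholder file name) would issue the real commands; checking dmesg between rounds is left to the reader.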