Message-Id: <201401251935.s0PJZAwH048013@maildrop2.v6ds.occnc.com>
To: Vitaly Magerya
From: Curtis Villamizar
Subject: Re: Any news about "msk0 watchdog timeout" regression in 10-RELEASE?
In-reply-to: Your message of "Sat, 25 Jan 2014 14:23:40 +0200." <52E3ACCC.1080707@gmail.com>
Date: Sat, 25 Jan 2014 14:35:10 -0500
Cc: Yonghyeon PYUN, freebsd-stable@freebsd.org, curtis@ipv6.occnc.com
Reply-To: curtis@ipv6.occnc.com

In message <52E3ACCC.1080707@gmail.com>
Vitaly Magerya writes:
>
> On 01/21/14 21:56, Curtis Villamizar wrote:
> > I have mine working but I haven't done a lot of reboots to see if it
> > is a "fix" or luck.
> >
> > There is a lot of junk that you won't need in the code that is running
> > well for me. But here it is, as-is warts and all.
> >
> > I've been swamped lately and haven't had time to look at this further.
>
> I've tried the patch, and the testing went like this:
> 1) Reboot into fixed kernel => msk0 shows watchdog timeouts.
> 2) Reboot again => no timeouts, but the interrupt storm is still there.
> 3) Disable the machine completely for 15 minutes (take out the battery
>    too; it's a laptop), boot fixed kernel => msk works fine.
> 4) Reboot one more time => msk still works fine.
> 5) Reboot into 10-RELEASE kernel => watchdog timeouts.
> 6) Disable the machine completely for 15 minutes, boot fixed kernel =>
>    still watchdog timeouts.
> 7) Disable the machine for 30 minutes, boot fixed kernel => nope, still
>    doesn't work.
>
> So, there was a success once (step 3), but I was not able to reproduce
> it after that. Seems to be random.

In my case I didn't have a problem as long as I didn't reboot into the
original kernel, but I only tried a few reboots. I can't see how a chip
could retain any state after 30 minutes without power, so you are right
that we don't have a fix.

I haven't had time to look at this further and don't generally reboot
this machine (uptime is 16 days since I last looked at this). When I'm
no longer quite so swamped I'll look at this again.

It seems we are the only two reporting this problem. Please send the
lines of this form from your dmesg output:

  mskc0: port 0xe800-0xe8ff mem 0xfebfc000-0xfebfffff irq 19 at device 0.0 on pci2
  msk0: on mskc0

Matching lines would suggest we have very similar chips. If they don't
match, this msk problem may be more widespread.

Curtis