Date: Wed, 22 Apr 2015 14:11:07 +0900
From: Yonghyeon PYUN <pyunyh@gmail.com>
To: Chris Ross <cross+freebsd@distal.com>
Cc: Gareth Wyn Roberts <g.w.roberts@glyndwr.ac.uk>, Alnis Morics <alnis.morics@gmail.com>, freebsd-stable@freebsd.org
Subject: Re: 10.1-STABLE bce: Watchdog timeout occurred
Message-ID: <20150422051107.GA975@michelle.fasterthan.com>
In-Reply-To: <186A4B92-CA84-45DD-8710-307204BD8B7F@distal.com>
References: <A1E984E1-B551-4CB6-A343-4E73FB58C35E@distal.com> <55361DF6.2080606@gmail.com> <55365A57.60509@glyndwr.ac.uk> <186A4B92-CA84-45DD-8710-307204BD8B7F@distal.com>

On Wed, Apr 22, 2015 at 12:39:16AM -0400, Chris Ross wrote:
>
> On Apr 21, 2015, at 10:10 , Gareth Wyn Roberts <g.w.roberts@glyndwr.ac.uk> wrote:
> > This may be caused by DMA alignment problems.
> > See https://docs.freebsd.org/cgi/getmsg.cgi?fetch=145859+0+archive/2015/freebsd-stable/20150419.freebsd-stable for a recent thread about the msk driver. The msk maintainer Yonghyeon Pyun has opted for super safe options of 32K alignment!
> >
> > It's a long shot, but you could try increasing BCE_DMA_ALIGN and/or BCE_RX_BUF_ALIGN in the include file if_bcereg.h, say up to 4096, to see whether it makes any difference.
>
> Well, after making that change, I was able to confirm that the problem doesn't seem to occur. However, in trying to verify the problem on an unmodified kernel, I've rebooted a GENERIC from r281672 without that change, and am also not seeing the problem. :-/ I'm not sure whether the gremlins have "fixed" something, or if I was just too critical in my initial analysis.
>
> For now I'll take that change out of my tree and run without it. If I see the flapping again, I'll confirm that it's repeatable, then change the alignments as suggested and see if I see a change.
>

I guess the alignment issue of msk(4) has nothing to do with bce(4) watchdog timeouts. It would be more helpful to know the details of your controller (bce(4)/brgphy(4) related dmesg output, pciconf output, etc.) and network setup. If you know a reliable way to trigger the watchdog timeouts, please share that info too.

As a first step to narrow down the issue, I would try disabling all hardware offloading features (TSO, checksum, VLAN H/W tagging, etc.) and see whether that makes any difference.

Thanks.
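
For reference, the offload test suggested above might look roughly like the following sketch. It is not taken from the message itself: the interface name bce0 and the helper function name are assumptions for illustration, and the flags are the standard ifconfig(8) capability switches. The script only prints the command, so nothing changes until you run the printed line as root.

```sh
#!/bin/sh
# Sketch only: "bce0" is an assumed interface name; substitute your own.
IF="${1:-bce0}"

# Hypothetical helper: print the ifconfig(8) command that turns off
# TSO, RX/TX checksum offload and VLAN hardware tagging on an interface.
# Printing instead of executing keeps this a dry run.
offload_off_cmd() {
    echo "ifconfig $1 -tso -rxcsum -txcsum -vlanhwtag"
}

offload_off_cmd "$IF"

# Controller details worth collecting on the affected host:
#   dmesg | grep -E 'bce|brgphy'
#   pciconf -lv
```

If the watchdog timeouts stop with the offloads disabled, re-enabling them one at a time (the same flags without the leading minus) may help show which feature is involved.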