Date:      Wed, 22 Apr 2015 14:11:07 +0900
From:      Yonghyeon PYUN <pyunyh@gmail.com>
To:        Chris Ross <cross+freebsd@distal.com>
Cc:        Gareth Wyn Roberts <g.w.roberts@glyndwr.ac.uk>, Alnis Morics <alnis.morics@gmail.com>, freebsd-stable@freebsd.org
Subject:   Re: 10.1-STABLE bce: Watchdog timeout occurred
Message-ID:  <20150422051107.GA975@michelle.fasterthan.com>
In-Reply-To: <186A4B92-CA84-45DD-8710-307204BD8B7F@distal.com>
References:  <A1E984E1-B551-4CB6-A343-4E73FB58C35E@distal.com> <55361DF6.2080606@gmail.com> <55365A57.60509@glyndwr.ac.uk> <186A4B92-CA84-45DD-8710-307204BD8B7F@distal.com>

On Wed, Apr 22, 2015 at 12:39:16AM -0400, Chris Ross wrote:
> 
> On Apr 21, 2015, at 10:10 , Gareth Wyn Roberts <g.w.roberts@glyndwr.ac.uk> wrote:
> > This may be caused by DMA alignment problems.
> > See https://docs.freebsd.org/cgi/getmsg.cgi?fetch=145859+0+archive/2015/freebsd-stable/20150419.freebsd-stable for a recent thread about the msk driver.  The msk maintainer Yonghyeon Pyun has opted for super safe options of 32K alignment!
> > 
> > It's a long shot, but you could try increasing BCE_DMA_ALIGN and/or BCE_RX_BUF_ALIGN in the include file if_bcereg.h, say up to 4096, to see whether it makes any difference.
> 
>   Well, after making that change, I was able to confirm that the problem doesn't seem to occur.  However, in trying to verify the problem on an unmodified kernel, I've rebooted a GENERIC from r281672 without that change, and am also not seeing the problem.  :-/  I'm not sure whether the gremlins have "fixed" something, or if I was just too critical in my initial analysis.
> 
>   For now I'll take that change out of my tree and run without it.  If I see the flapping again, I'll confirm that it's repeatable, then change the alignments as suggested and see if I see a change.
> 

I suspect the msk(4) alignment issue has nothing to do with the bce(4)
watchdog timeouts.  It would be more helpful to know the details of
your controller (bce(4)/brgphy(4) related dmesg output, pciconf
output, etc.) and your network setup.
If you know a reliable way to trigger the watchdog timeouts,
please share that info too.  As a first step to narrow down the
issue, I would try disabling all hardware offloading features (TSO,
checksum offloading, VLAN H/W tagging, etc.) and see whether that
makes any difference.
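
For illustration only, here is a minimal sketch of what turning those
offloads off could look like.  It assumes the interface is named bce0
and that your ifconfig(8) accepts the -tso, -txcsum, -rxcsum and
-vlanhwtag flags; the helper names are hypothetical.  It only prints
the command unless dry_run is set to False, and a real run needs root.

```python
import subprocess

# ifconfig(8) flags that switch off the offloads mentioned above:
# TSO, TX/RX checksum offloading, and VLAN hardware tagging.
OFFLOAD_OFF_FLAGS = ["-tso", "-txcsum", "-rxcsum", "-vlanhwtag"]


def build_disable_offload_cmd(ifname="bce0"):
    """Return the ifconfig argv for ifname (bce0 is an assumed name)."""
    return ["ifconfig", ifname] + OFFLOAD_OFF_FLAGS


def disable_offloads(ifname="bce0", dry_run=True):
    """Print the command, or execute it when dry_run is False."""
    cmd = build_disable_offload_cmd(ifname)
    if dry_run:
        print(" ".join(cmd))
    else:
        # Raises CalledProcessError if ifconfig rejects a flag.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    disable_offloads()
```

If the timeouts stop with everything disabled, re-enabling one feature
at a time should show which one is involved.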

Thanks.


