From owner-freebsd-stable@FreeBSD.ORG Wed Apr 22 05:11:20 2015
From: Yonghyeon PYUN
Date: Wed, 22 Apr 2015 14:11:07 +0900
To: Chris Ross
Cc: Gareth Wyn Roberts, Alnis Morics,
    freebsd-stable@freebsd.org
Subject: Re: 10.1-STABLE bce: Watchdog timeout occurred
Message-ID: <20150422051107.GA975@michelle.fasterthan.com>
Reply-To: pyunyh@gmail.com
References: <55361DF6.2080606@gmail.com> <55365A57.60509@glyndwr.ac.uk>
    <186A4B92-CA84-45DD-8710-307204BD8B7F@distal.com>
In-Reply-To: <186A4B92-CA84-45DD-8710-307204BD8B7F@distal.com>
List-Id: Production branch of FreeBSD source code

On Wed, Apr 22, 2015 at 12:39:16AM -0400, Chris Ross wrote:
>
> On Apr 21, 2015, at 10:10 , Gareth Wyn Roberts wrote:
> > This may be caused by DMA alignment problems.
> > See
> > https://docs.freebsd.org/cgi/getmsg.cgi?fetch=145859+0+archive/2015/freebsd-stable/20150419.freebsd-stable
> > for a recent thread about the msk driver. The msk maintainer
> > Yonghyeon Pyun has opted for super safe options of 32K alignment!
> >
> > It's a long shot, but you could try increasing BCE_DMA_ALIGN and/or
> > BCE_RX_BUF_ALIGN in the include file if_bcereg.h, say up to 4096,
> > to see whether it makes any difference.
>
> Well, after making that change, I was able to confirm that the
> problem doesn't seem to occur. However, in trying to verify the
> problem on an unmodified kernel, I've rebooted a GENERIC from r281672
> without that change, and am also not seeing the problem. :-/ I'm not
> sure whether the gremlins have "fixed" something, or if I was just
> too critical in my initial analysis.
>
> For now I'll take that change out of my tree and run without it. If
> I see the flapping again, I'll confirm that it's repeatable, then
> change the alignments as suggested and see if I see a change.
>

I guess the alignment issue of msk(4) has nothing to do with bce(4)
watchdog timeouts.
It would be more helpful to know the details of your controller
(bce(4)/brgphy(4)-related dmesg output, pciconf output, etc.) and your
network setup. If you know a reliable way to trigger the watchdog
timeouts, please share that too.

As a first step toward narrowing down the issue, I would disable all
hardware offloading features (TSO, checksum offloading, VLAN hardware
tagging, etc.) and see whether that makes any difference.

Thanks.
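
The diagnostic steps above could be sketched roughly as follows. This is only an illustration, not a tested procedure: the interface name bce0 and the helper names are assumptions, and the script only prints the commands so they can be reviewed before being run as root on the affected machine. The ifconfig(8) flags used are limited to the offload features named above.

```python
import shlex

# ifconfig(8) arguments that turn off the offload features named in the
# message: TSO, RX/TX checksum offloading, and VLAN hardware tagging.
OFFLOAD_FLAGS = ["-tso", "-rxcsum", "-txcsum", "-vlanhwtag"]


def diagnostic_commands(ifname="bce0"):
    """Return the commands to run, each as an argv list.

    The interface name is an assumption; substitute the real one.
    """
    return [
        # Disable the hardware offload features on the interface.
        ["ifconfig", ifname, *OFFLOAD_FLAGS],
        # Show the resulting interface options for confirmation.
        ["ifconfig", ifname],
        # List PCI devices with capabilities and vendor/device strings.
        ["pciconf", "-lcv"],
        # Extract the bce(4)/brgphy(4) lines from the kernel message buffer.
        ["sh", "-c", "dmesg | grep -E 'bce|brgphy'"],
    ]


def render(commands):
    """Turn argv lists into shell-quoted command lines for copy and paste."""
    return [shlex.join(argv) for argv in commands]


if __name__ == "__main__":
    for line in render(diagnostic_commands("bce0")):
        print(line)
```

Printing rather than executing is deliberate here: changing interface capabilities on a remote machine can interrupt connectivity, so it seems safer to review the commands first.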