From: Jack Vogel
To: Harald Schmalzbauer
Cc: FreeBSD Stable
Date: Sat, 10 Jan 2015 10:32:00 -0800
Subject: Re: igb(4) watchdog timeout, lagg(4) fails

Did you say this system is a VM under ESX?

Jack

On Sat, Jan 10, 2015 at 2:51 AM, Harald Schmalzbauer <h.schmalzbauer@omnilan.de> wrote:

> Regarding Jack Vogel's message of 09.01.2015 18:46 (localtime):
> > The tunable interrupt rate code is not mine, and looking at it I'm not
> > entirely sure it works. Why are you focused on the interrupt rate anyway,
> > do you have some reason to tie it to the watchdog?
> >
> > You could turn AIM off (enable_aim) and see if that changed anything?
> >
> > It seems most of the time problems show up they involve the use of lagg;
> > if you take it out of the mix does the problem go away?
>
> Thanks for your attention!
>
> Unfortunately I can't test anything without lagg(4), this machine is in
> production (with lagg(4) being parent of lots of vlan-interfaces).
> I guess the watchdog timeout is more often reported by people with
> lagg(4) in use for the reason that that's where igb(4) really gets some
> (peak-)load ;-) Seriously, I can't see how lagg(4) should be the culprit
> for watchdog timeouts, but stuck interrupts was my first guess.
> Especially since I'm doing the kld-reload-trick to get msi-x working
> inside ESXi (reported 2 years ago that booting FreeBSD initializes the
> passthrough device with some kind of wrong device-type-identifier;
> warmbooting the guest or simply kld-reloading solves this problem, the
> hypervisor then gets the correct device-type-indicator (for using msi-x)).
> As mentioned, this has been working without any issue for more than one
> year with FreeBSD 9.1.
> I have another machine with Kawela cards and a similar setup, but without
> load at all. I'll see if I can reproduce the problem there and narrow it
> down by removing lagg(4).
>
> Is there a way to reset the interface without rebooting the machine? The
> watchdog doesn't really reset the device, it's in non-operating state
> afterwards. I need to 'ifconfig down' it for bringing lagg(4) back into
> operational state.
> Some kind of D3D0-state switch for a single address? kldunloading would
> destroy the remaining interface too...
>
> Thanks,
>
> -Harry