From: YongHyeon PYUN
Date: Sun, 20 Nov 2016 16:43:52 +0900
To: "O. Hartmann"
Cc: FreeBSD CURRENT, gnn@FreeBSD.org, "Andrey V. Elsukov", glebius@FreeBSD.org, melifaro@FreeBSD.org
Subject: Re: CURRENT: re(4) crashing system
Message-ID: <20161120074351.GA1090@michelle.fasterthan.co.kr>
Reply-To: pyunyh@gmail.com
In-Reply-To: <20161119194424.6335338a@thor.walstatt.dynvpn.de>
List-Id: Discussions about the use of FreeBSD-current

On Sat, Nov 19, 2016 at 07:44:35PM +0100, O.
Hartmann wrote:
> On Mon, 7 Nov 2016 11:16:23 +0900
> YongHyeon PYUN wrote:
>
> > On Sun, Nov 06, 2016 at 01:20:36PM +0100, Hartmann, O. wrote:
> > > On Mon, 31 Oct 2016 11:12:22 +0900
> > > YongHyeon PYUN wrote:
> > >
> > > > On Fri, Oct 28, 2016 at 09:21:13PM +0200, Hartmann, O. wrote:
> > > > > On Thu, 27 Oct 2016 10:00:04 +0900
> > > > > YongHyeon PYUN wrote:
> > > > >
> > > > > > On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote:
> > > > > > > On Tue, 25 Oct 2016 11:05:38 +0900
> > > > > > > YongHyeon PYUN wrote:
> > > > > > >
> > > > > > [...]
> > > > > >
> > > > > > > > I'm not sure, but the issue is likely related to
> > > > > > > > EEE/Green Ethernet handling. EEE is a feature negotiated
> > > > > > > > with the link partner. If you connect your laptop directly
> > > > > > > > to a non-EEE capable link partner, such as another re(4)
> > > > > > > > box, without switches in between, you may be able to tell
> > > > > > > > whether the issue is related to EEE/Green Ethernet or not.
> > > > > > > >
> > > > > > > Me neither. When I first discovered a problem with
> > > > > > > CURRENT, the Friday before last week's Friday, there
> > > > > > > was an unlucky coincidence: I got the new switch, FreeBSD
> > > > > > > introduced a serious bug, and I changed the NICs.
> > > > > > >
> > > > > > > The laptop, the last in the row of re(4)-equipped systems on
> > > > > > > which I use the Realtek NIC, does well now with Green IT
> > > > > > > technology, but crashes on plugging/unplugging - not on each
> > > > > > > event, but at least in one of ten.
> > > > > >
> > > > > > Hmm, it seems you know how to trigger the issue. When you
> > > > > > unplugged the UTP cable, was there active network traffic on
> > > > > > the re(4) device? It would be helpful to know which event
> > > > > > triggers the crash (e.g. unplugging or plugging). And would
> > > > > > you show me the backtrace of the panic?
> > > > > > > I guess the Green IT issue is more an unlucky guess of mine and
> > > > > > > went hand in hand with the problem I face with CURRENT right
> > > > > > > now on some older, non-UEFI machines.
> > > > > > >
> > > > > >
> > > > > > Ok.
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > > As requested, the information about re0 and rgephy0 on the
> > > > > > > laptop (Lenovo E540):
> > > > > > >
> > > > > > > [...]
> > > > > > >
> > > > > > > rgephy0: PHY 1 on miibus0
> > > > > > > rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow,
> > > > > > > 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX,
> > > > > > > 1000baseT-FDX-master, 1000baseT-FDX-flow,
> > > > > > > 1000baseT-FDX-flow-master, auto, auto-flow
> > > > > > >
> > > > > > > re0:
> > > > > > > port 0x3000-0x30ff mem
> > > > > > > 0xf0d04000-0xf0d04fff,0xf0d00000-0xf0d03fff at device 0.0 on pci2
> > > > > > > re0: Using 1 MSI-X message
> > > > > > > re0: ASPM disabled
> > > > > > > re0: Chip rev. 0x50800000
> > > > > > > re0: MAC rev. 0x00100000
> > > > > >
> > > > > > This looks like an 8168GU controller.
> > > > > >
> > > > > > [...]
> > > > > >
> > > > > > > I use options netmap in the kernel config, but the problem is
> > > > > > > also present without this option - just for the record.
> > > > > > >
> > > > > >
> > > > > > Yup, netmap(4) has nothing to do with the crash.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > Attached, you'll find the backtrace of the crash. This time it was
> > > > > really easy - just one pull of the LAN cable - and we are
> > > > > happy :-/
> > > > >
> > > > > Please let me know if you need something else. I will return to
> > > > > normal operations (disabling debugging) because CURRENT is very
> > > > > unstable at the moment on other hosts beyond r307157.
> > > > >
> > > >
> > > > It seems the attachment was stripped.
> > >
> > > This time I hope I got it right!
> > >
> > > Attached you'll find the latest CURRENT's backtrace of the provoked
> > > crash (plug and unplug).
> > >
> > > I also saved the kernel and coredump, so if you need me to do further
> > > investigations, please let me know.
> > >
> >
> > Thanks a lot for the backtrace. This backtrace is not the one I
> > expected, and I guess the issue is related to cached route removal
> > on interface down. A quick look over the code didn't reveal the
> > cause of the crash (I'm not familiar with that part of the code).
> > Probably gnn@ has a better idea of what's going on here (CCed).
> >
> > Thanks.
>
> In another thread I complained about permanent crashes on several "older" Intel
> architectures (IvyBridge and down). It turned out that
>
> option FLOWTABLE
>
> in the kernel, which has been part of my custom kernels for a long time now, is
> the culprit on those systems. Commenting out that option solved the
> problem!
>
> Interestingly, after also commenting out this option in the kernel config of the laptop
> in question in this thread, I wasn't able - as of this writing - to reproduce the crashes,
> so it might be that the same issue with FLOWTABLE has been triggered by plugging and/or
> unplugging the LAN cord.
>

I'm not yet sure whether it's triggered by FLOWTABLE, since that option
has been there for a long time. I suspected r297225 and r301217, which
re-added route caching for TCP. The panic you encountered indicates
an invalid access to a destroyed lock, which in turn suggests a
reference counting problem in lltable. I've CCed glebius@ and
melifaro@, who are more familiar with the routing code than me.

> Usually I was able to trigger the coredump after two or three rounds; this time I tried
> it over ten times with no effect.
>
> On the other hand, the NIC of the laptop doesn't negotiate 1 GBit/s with my switch;
> it remains at 100 MBit/s. The switch is a Netgear GS110TP V2.
>

This would be a re(4) driver problem. When you see it negotiated
100Mbps after unplugging/plugging, would you try to negotiate with
the link partner again like below and let me know the result?

# ifconfig re0 media auto

Does the behavior change if you physically unplug/plug the UTP cable
on the laptop rather than forcing the port down/up on the switch?

Thanks.
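
P.S. In case it helps when reporting the result: a minimal sketch, not
a tested procedure, of pulling the negotiated media out of the
"media:" line that ifconfig prints. The helper name active_media is
hypothetical, and it only handles the autoselect form of the line,
where the active media appears in parentheses.

```shell
#!/bin/sh
# Sketch only. active_media is a hypothetical helper, not part of ifconfig.
# It takes one "media:" line as printed by ifconfig and prints the active
# media, i.e. the first word inside the parentheses.
active_media() {
    # e.g. "media: Ethernet autoselect (100baseTX <full-duplex>)" -> 100baseTX
    printf '%s\n' "$1" | sed -n 's/.*media:.*(\([^ )<]*\).*/\1/p'
}

# Intended use on the laptop (as root; re0 is the interface in this thread):
#   ifconfig re0 media auto
#   sleep 5
#   active_media "$(ifconfig re0 | grep 'media:')"
```

If the helper prints 100baseTX again after the renegotiation, the link
is still at 100Mbps. It prints nothing when the line has no
parenthesized part, for example when the media was set manually.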