From: Pyun YongHyeon
To: Dimitry Andric
Cc: freebsd-current@FreeBSD.org
Date: Mon, 14 Jul 2008 10:35:19 +0900
Subject: Re: Call for testers: re(4) and RTL8168C/RTL8168CP/RTL8111C/RTL8111CP
Message-ID: <20080714013519.GE36245@cdnetworks.co.kr>
In-Reply-To: <20080630043156.GB79537@cdnetworks.co.kr>
Reply-To: pyunyh@gmail.com

On Mon, Jun 30, 2008 at 01:31:56PM +0900, To Dimitry Andric wrote:
> On Sat, Jun 28, 2008 at 06:54:47PM +0200, Dimitry Andric wrote:
> > On 2008-06-11 02:58, Pyun YongHyeon wrote:
> > > > This seems to work better, although it still takes quite some time
> > > > (~10s) for the interfaces to go up at boot time. I haven't yet been
> > > > able to get them "stuck", however, so that's good. :)
> > > Hmm, that's interesting. Can you spot where re(4) spends its time?
> > > Did RELENG_7 also have this issue?
> >
> > Apparently it's experiencing timeouts, I usually get these:
> >
> > re0: link state changed to DOWN
> > re0: watchdog timeout
>        ^^^^^^^^^^^^^^^^
> Because the link state changed to DOWN, re(4) should not queue
> packets for transmission until it gets a valid link again. Trying to
> send further packets would cause watchdog timeouts as above.
> This indicates re(4) failed to detect the link loss event.
> What makes me wonder is why the link state changed to DOWN.
> Do you have a clue (e.g. switching hub down, etc.)?
>
> > re0: 3 link states coalesced
>        ^^^^^^^^^^^^^^^^^^^^^^^
>
> Hmm, I guess you've encountered another bug. The link state
> coalescing message indicates a bug in the PHY driver and in the link
> state handling of re(4). ATM the link state handling of re(4) is in
> a very bad state and it doesn't correctly drive MII_TICK. re(4) relies
> solely on the link status change interrupt of the controller, but it
> fails to determine what the current link event is for (the event
> could be link up, link down, auto-negotiation complete, etc.). In
> addition, all RealTek controllers lack a proper programming interface
> to tell the MAC the negotiated speed/duplex/flow-control, which in
> turn makes taking proper action on the event very hard.
>
> I guess re(4) should not rely on the link status change interrupt but
> should fall back to the traditional polling mechanism, which will
> enable correct tracking of link establishment. Also, the link up/
> down handling should be changed to process mii(4) posted events.
> All of these changes require a lot of code changes and need more
> testing. I think I may have to commit the accumulated patches for the
> newer RTL8168 family before going in that direction. The patch is not
> perfect and does not address all issues for the RTL8168 family, but it
> allows recognition of the new hardware and makes it usable in most
> cases.
>
> > re0: link state changed to UP
> > re1: link state changed to DOWN
> >
> > I've been running all tests under RELENG_7, btw.
> > Note also, these delays don't always happen; in some cases the
> > interfaces react very quickly. In rare cases, they don't work at
> > all, until you manually ifconfig down and up them a few times.
> >
> > What's funny though, is that the interfaces seem to start in DOWN mode:
> >
> > [...booting...]
> > Mounting local file systems:.
> > Setting hostname: tensor.andric.com.
> > re0: link state changed to DOWN
> > re1: link state changed to DOWN
> > lo0: flags=8049 metric 0 mtu 16384
> >         inet6 ::1 prefixlen 128
> >         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
> >         inet 127.0.0.1 netmask 0xff000000
> > re0: flags=8843 metric 0 mtu 1500
> >         options=399b
> >         ether 00:30:18:a6:f1:a8
> >         inet6 fe80::230:18ff:fea6:f1a8%re0 prefixlen 64 tentative scopeid 0x1
> >         inet 87.251.56.140 netmask 0xffffffc0 broadcast 87.251.56.191
> >         media: Ethernet autoselect (none)
> >         status: no carrier
> > re1: flags=8843 metric 0 mtu 1500
> >         options=399b
> >         ether 00:30:18:a6:f1:a9
> >         inet6 fe80::230:18ff:fea6:f1a9%re1 prefixlen 64 tentative scopeid 0x2
> >         inet 192.168.0.1 netmask 0xffffff00 broadcast 192.168.0.255
> >         media: Ethernet autoselect (none)
> >         status: no carrier
> > [...more initialization...]
> > net.inet6.ip6.forwarding: 0 -> 1
> > net.inet6.ip6.accept_rtadv: 0 -> 0
> > re0: link state changed to UP
> > re1: link state changed to UP
> >
> > and only then do they "really" go up... :)
> >
>
> I can't be sure, due to bugs in the driver's link state handling, but
> generally that's normal. Establishing a link with the link partner
> takes time, and sometimes it can even take 10 seconds or more.
>
> > Do you have any good suggestions on where I could put some debug
> > printfs in re to find out what it's timing out on?
> >
>
> Before doing that it would be more appropriate to fix the link state
> handling in the driver. I'll let you know when I have a patch for the
> link handling clean-up.
>

Here is the patch for re(4) link handling.
Copy if_re.c and if_rlreg.h from HEAD to RELENG_7 and apply the
attached patch. If you still see watchdog timeouts, please turn off
TSO and let me know how it goes. One user reported TSO issues on 8169
family controllers, but I can't reproduce them on my 8169 hardware, so
they could be related to a silicon bug in a specific revision of the
hardware.

> >
> > > Does plugging/unplugging the UTP cable to the ethernet controller
> > > during boot change the long delay? How about disabling WOL before
> > > system shutdown? (e.g. ifconfig re0 -wol)
> >
> > Plugging/unplugging the cable doesn't seem to make much difference, and
> > neither does disabling WOL before shutdown (or altogether)...
> >
>
> Ok.
>
> Thanks for reporting.

-- 
Regards,
Pyun YongHyeon
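
For readers following the thread, the idea discussed in the quoted text (track the link by periodic polling, and do not queue frames while the link is down) can be sketched roughly as below. This is only a simplified, self-contained model and not the attached patch: every name in it (sketch_softc, sketch_start, sketch_tick, sketch_watchdog) and the 5-tick watchdog value are assumptions made for illustration, and none of them come from if_re.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified driver state; not the real re(4) softc. */
struct sketch_softc {
	bool link_up;     /* link state as seen by the last poll */
	int  tx_pending;  /* frames handed to the (simulated) hardware */
	int  wd_timer;    /* watchdog countdown in ticks; 0 means disarmed */
	int  wd_timeouts; /* number of watchdog timeouts seen so far */
};

/* Transmit path: refuse to queue while the link is down. */
static bool
sketch_start(struct sketch_softc *sc)
{
	assert(sc != NULL);
	if (!sc->link_up)
		return (false);  /* frame stays in the upper-layer queue */
	sc->tx_pending++;
	sc->wd_timer = 5;        /* assumed timeout value: arm the watchdog */
	return (true);
}

/* Periodic poll: take the link state from the PHY, not from an interrupt. */
static void
sketch_tick(struct sketch_softc *sc, bool phy_link)
{
	assert(sc != NULL);
	sc->link_up = phy_link;
	/*
	 * Link down: nothing can complete, so drop in-flight work.
	 * Link up: the simulated hardware completes all pending frames.
	 * Either way nothing is pending and the watchdog is disarmed.
	 */
	sc->tx_pending = 0;
	sc->wd_timer = 0;
}

/* Called once per tick; returns true when a transmission timed out. */
static bool
sketch_watchdog(struct sketch_softc *sc)
{
	assert(sc != NULL);
	if (sc->wd_timer == 0 || --sc->wd_timer > 0)
		return (false);
	sc->wd_timeouts++;
	return (true);
}
```

In this model, a link loss noticed by sketch_tick() disarms the watchdog and makes sketch_start() refuse further frames, so no timeout is reported. If the link loss is never noticed (sketch_tick() is not called after a frame was queued), sketch_watchdog() reports a timeout on the fifth tick, which mirrors the "watchdog timeout" symptom described in the thread.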