From: Yonghyeon PYUN
Date: Wed, 26 Feb 2014 13:55:45 +0900
To: David Naylor
Cc: jfv@FreeBSD.org, stable@freebsd.org
Reply-To: pyunyh@gmail.com
Subject: Re: [SOLVED] MPCP Opcode Pause and unresponsive computer
Message-ID: <20140226045545.GB1350@michelle.cdnetworks.com>
In-Reply-To: <7109858.LYNIHJIJOi@dragon.dg>
References: <1403963.5sDsKbxfoF@dragon.dg> <20140217022329.GA3675@michelle.cdnetworks.com> <7109858.LYNIHJIJOi@dragon.dg>
List-Id: Production branch of FreeBSD source code

On Sun, Feb 23, 2014 at 07:51:10PM +0300, David Naylor wrote:
> Hi,
>
> The issue was hardware error (corrupt memory module). Once removed all
> symptoms disappeared.
>
> Please see below for specific follow up messages.
>
> Regards
>
> On Monday, 17 February 2014 11:23:29 Yonghyeon PYUN wrote:
> > On Thu, Feb 13, 2014 at 10:01:56PM +0300, David Naylor wrote:
> > > Hi,
> > >
> > > I recently installed FreeBSD 10.0-RELEASE on an headless Intense-PC. I am
> > > experiencing two network related issues with the computer.
> > >
> > > First issue
> > > -----------
> > > When compiling lang/ruby19 the network freezes. The build was done
> > > directly from the command line using ssh. After a while ssh reports
> > > "Write failed: Broken pipe". I attached the monitor and no messages were
> > > displayed on the output (and the machine was still running).
> > >
> > > The Intense-PC does not respond to pings at this point either. Of note, I
> > > was capable of transferring multiple GB of data and successfully compiled
> > > other ports but compiling lang/ruby19 messes up everything.
> > >
> > > Second issue
> > > ------------
> > > After a period of uptime (after the freeze from building lang/ruby19) the
> > > entire network stops working, nothing is capable of connecting or
> > > communicating on the network. When I do a tcpdump (from a different,
> > > affected computer) I find the following:
> > >
> > > 20:57:58.254626 MPCP, Opcode Pause, length 46
> > >
> > > These messages get repeated a few times a second. The moment I disconnect
> > > the Intense-PC from the network functionality is restored (and is clearly
> > > illustrated by the tcpdump).
> > >
> > > Information
> > > -----------
> > > # uname -a
> > > FreeBSD dragonbsd 10.0-RELEASE FreeBSD 10.0-RELEASE #0
> > > d44ce30(releng/10.0): Sun Feb 9 20:11:55 SAST 2014
> > > root@dragon.dg:/tmp/home/freebsd/10.0/src/sys/MODULAR amd64
> > >
> > > # ifconfig
> > > lo0: flags=8049 metric 0 mtu 16384
> > >
> > > options=600003
> > > inet6 ::1 prefixlen 128
> > > inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
> > > inet 127.0.0.1 netmask 0xff000000
> > > nd6 options=21
> > >
> > > em0: flags=8843 metric 0 mtu 1500
> > >
> > > options=4219b
> > O4,WOL_MAGIC,VLAN_HWTSO> ether XX:XX:XX:XX:XX:XX
> > > inet 192.168.0.160 netmask 0xffffff00 broadcast 192.168.0.255
> > > nd6 options=29
> > > media: Ethernet autoselect (100baseTX )
> > > status: active
> > >
> > > re0: flags=8843 metric 0 mtu 1500
> > >
> > > options=8209b
> > L_MAGIC,LINKSTATE> ether XX:XX:XX:XX:XX:XX
> > > nd6 options=29
> > > media: Ethernet autoselect (none)
> > > status: no carrier
> > >
> > > Any assistance to resolve this issue will be greatly appreciated.
> >
> > It's not normal to see pause frames with tcpdump.
> > If my memory serves me right, MAC control frames which include pause
> > frames should not be passed to host. Which network driver do you see
> > above pause frames? Some drivers like fxp(4) allow passing pause
> > frames to host but I think that's a bug in driver. I didn't change
> > that behavior of the driver just because it used to enable that
> > feature in the past.
>
> This is what a web search also indicated. In this case the machine receiving
> pause frames has:
> # dmesg | grep 'em0\|re0'
> em0: port 0xf040-0xf05f mem
> 0xf7300000-0xf731ffff,0xf7328000-0xf7328fff irq 20 at device 25.0 on pci0
> em0: Using an MSI interrupt
> DragonSA@dragon:/tmp> dmesg | grep re0
> re0: port 0xd000-0xd0ff
> mem 0xf7220000-0xf72200ff irq 16 at device 0.0 on pci3
> re0: Chip rev. 0x18000000
> re0: MAC rev. 0x00000000
> miibus0: on re0
>
> # ifconfig bridge0
> bridge0: flags=8843 metric 0 mtu 1500
> inet 192.168.0.2 netmask 0xffffff00 broadcast 192.168.0.255
> nd6 options=9
> id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
> maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
> root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
> member: re0 flags=143
> ifmaxaddr 0 port 3 priority 128 path cost 55
> member: em0 flags=143
> ifmaxaddr 0 port 2 priority 128 path cost 2000000
>
> Could it be bridge0 is causing the pause frames to be visible?
>

bridge(4) puts its members into promiscuous mode. Either em(4) or
re(4) seems to pass received pause frames to the host. My old re(4)
data sheet says nothing about passing pause frames in promiscuous
mode. jfv@ (CCed) may be able to answer for em(4) controllers.

> > I'm not sure what's happening there but receiving pause frames will
> > inhibit sending frames until the pause time expires such that you'll
> > not get any response from the host. Probably you have to know
> > which host is sending these lots of pause frames.
> > Once you identify the guilty host, you have to narrow down what condition
> > makes it send pause frames.
>
> It turns out that the guilty host had a faulty memory module (that didn't show
> up in memtest86+ when run with another module in). I've removed the offending
> memory module and no repeat of the incidences.

I have no idea how a faulty memory module can generate pause frames,
but it's good to know the issue was resolved.
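
For anyone who needs to work out which host is sending pause frames,
the check can be sketched roughly as below. This is only an
illustration, not something from the drivers discussed above. It
assumes the standard 802.3x layout (EtherType 0x8808, opcode 0x0001,
a 16-bit pause time counted in quanta of 512 bit times) and untagged
frames. The function names are hypothetical, and the 100 Mbit/s figure
is taken from the em0 media line quoted earlier.

```python
import struct

MAC_CONTROL_ETHERTYPE = 0x8808  # IEEE 802.3 MAC Control frames
PAUSE_OPCODE = 0x0001           # 802.3x PAUSE
PAUSE_QUANTUM_BITS = 512        # one pause quantum is 512 bit times


def parse_pause_frame(frame: bytes):
    """Return (source_mac, pause_quanta) for an untagged PAUSE frame, else None.

    Hypothetical helper: expects a raw Ethernet frame without the FCS.
    """
    if len(frame) < 18:
        return None
    # Bytes 12-13: EtherType, 14-15: opcode, 16-17: pause time in quanta.
    ethertype, opcode, quanta = struct.unpack("!HHH", frame[12:18])
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PAUSE_OPCODE:
        return None
    # Bytes 6-11 hold the sender's MAC address, i.e. the host to look for.
    source_mac = ":".join(f"{b:02x}" for b in frame[6:12])
    return source_mac, quanta


def pause_seconds(quanta: int, link_bits_per_second: float) -> float:
    """How long one PAUSE frame asks the receiver to stop transmitting."""
    return quanta * PAUSE_QUANTUM_BITS / link_bits_per_second


def build_example_frame(source_mac: bytes, quanta: int) -> bytes:
    """Build a synthetic PAUSE frame for testing (made-up sender address)."""
    destination = bytes.fromhex("0180c2000001")  # reserved MAC Control address
    header = destination + source_mac + struct.pack(
        "!HHH", MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, quanta
    )
    # Pad so the payload after the 14-byte Ethernet header is 46 bytes,
    # the same length tcpdump printed in the quoted capture.
    return header + bytes(42)


if __name__ == "__main__":
    frame = build_example_frame(bytes.fromhex("020000000001"), 0xFFFF)
    sender, quanta = parse_pause_frame(frame)
    print(sender, quanta, pause_seconds(quanta, 100e6))
```

At 100 Mbit/s the maximum pause time of 0xFFFF quanta works out to
about 0.34 seconds, so a few such frames per second would be enough to
keep a link partner from sending at all, which matches the symptoms
described. To see the sender's MAC address directly, running tcpdump
with -e and the filter "ether proto 0x8808" should print the link-level
header for each of these frames.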