Date: Sun, 23 Feb 2014 19:51:10 +0300
From: David Naylor <dbn@freebsd.org>
To: pyunyh@gmail.com
Cc: stable@freebsd.org
Subject: Re: [SOLVED] MPCP Opcode Pause and unresponsive computer
Message-ID: <7109858.LYNIHJIJOi@dragon.dg>
In-Reply-To: <20140217022329.GA3675@michelle.cdnetworks.com>
References: <1403963.5sDsKbxfoF@dragon.dg> <20140217022329.GA3675@michelle.cdnetworks.com>

Hi,
The issue was a hardware error (a faulty memory module). Once the module
was removed, all symptoms disappeared.
Please see below for specific follow-up comments.
Regards
On Monday, 17 February 2014 11:23:29 Yonghyeon PYUN wrote:
> On Thu, Feb 13, 2014 at 10:01:56PM +0300, David Naylor wrote:
> > Hi,
> >
> > I recently installed FreeBSD 10.0-RELEASE on a headless Intense-PC. I am
> > experiencing two network related issues with the computer.
> >
> > First issue
> > -----------
> > When compiling lang/ruby19 the network freezes. The build was done
> > directly from the command line using ssh. After a while ssh reports
> > "Write failed: Broken pipe". I attached the monitor and no messages were
> > displayed on the output (and the machine was still running).
> >
> > The Intense-PC does not respond to pings at this point either. Of note, I
> > was capable of transferring multiple GB of data and successfully compiled
> > other ports but compiling lang/ruby19 messes up everything.
> >
> > Second issue
> > ------------
> > After a period of uptime (after the freeze from building lang/ruby19) the
> > entire network stops working, nothing is capable of connecting or
> > communicating on the network. When I do a tcpdump (from a different,
> > affected computer) I find the following:
> >
> > 20:57:58.254626 MPCP, Opcode Pause, length 46
> >
> > These messages get repeated a few times a second. The moment I disconnect
> > the Intense-PC from the network functionality is restored (and is clearly
> > illustrated by the tcpdump).
> >
> > Information
> > -----------
> > # uname -a
> > FreeBSD dragonbsd 10.0-RELEASE FreeBSD 10.0-RELEASE #0
> > d44ce30(releng/10.0): Sun Feb 9 20:11:55 SAST 2014
> > root@dragon.dg:/tmp/home/freebsd/10.0/src/sys/MODULAR amd64
> >
> > # ifconfig
> > lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
> > options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
> > inet6 ::1 prefixlen 128
> > inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
> > inet 127.0.0.1 netmask 0xff000000
> > nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> >
> > em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> > options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
> > ether XX:XX:XX:XX:XX:XX
> > inet 192.168.0.160 netmask 0xffffff00 broadcast 192.168.0.255
> > nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> > media: Ethernet autoselect (100baseTX <full-duplex>)
> > status: active
> >
> > re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> > options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
> > ether XX:XX:XX:XX:XX:XX
> > nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
> > media: Ethernet autoselect (none)
> > status: no carrier
> >
> > Any assistance to resolve this issue will be greatly appreciated.
>
> It's not normal to see pause frames with tcpdump. If my memory
> serves me right, MAC control frames which include pause frames
> should not be passed to host. Which network driver do you see
> above pause frames? Some drivers like fxp(4) allow passing pause
> frames to host but I think that's a bug in driver. I didn't change
> that behavior of the driver just because it used to enable that
> feature in the past.
This is what a web search also indicated. In this case the machine receiving
pause frames has:
# dmesg | grep 'em0\|re0'
em0: <Intel(R) PRO/1000 Network Connection 7.3.8> port 0xf040-0xf05f mem
0xf7300000-0xf731ffff,0xf7328000-0xf7328fff irq 20 at device 25.0 on pci0
em0: Using an MSI interrupt
DragonSA@dragon:/tmp> dmesg | grep re0
re0: <RealTek 8169SC/8110SC Single-chip Gigabit Ethernet> port 0xd000-0xd0ff
mem 0xf7220000-0xf72200ff irq 16 at device 0.0 on pci3
re0: Chip rev. 0x18000000
re0: MAC rev. 0x00000000
miibus0: <MII bus> on re0
# ifconfig bridge0
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
inet 192.168.0.2 netmask 0xffffff00 broadcast 192.168.0.255
nd6 options=9<PERFORMNUD,IFDISABLED>
id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
member: re0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
ifmaxaddr 0 port 3 priority 128 path cost 55
member: em0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
ifmaxaddr 0 port 2 priority 128 path cost 2000000
Could it be that bridge0 is causing the pause frames to be visible?
> I'm not sure what's happening there but receiving pause frames will
> inhibit sending frames until the pause time expires such that you'll
> not get any response from the host. Probably you have to know
> which host is sending these lots of pause frames. Once you
> identify the guilty host, you have to narrow down what condition
> makes it send pause frames.
It turns out that the guilty host had a faulty memory module (the fault did
not show up in memtest86+ when it was run with another module installed). I
have removed the offending module and the incidents have not recurred.
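
For anyone who hits the same symptom, here is a minimal sketch of how the
guilty host could be identified from a raw captured frame, and of how long
one pause frame can silence a link. It is not what I ran at the time. It
assumes the standard IEEE 802.3x layout (EtherType 0x8808, opcode 0x0001,
a 16-bit pause time counted in quanta of 512 bit times); the function names
and the example source MAC are made up for illustration.

```python
import struct

PAUSE_DST = bytes.fromhex("0180c2000001")  # reserved MAC control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001
QUANTUM_BITS = 512  # one pause quantum is 512 bit times


def parse_pause_frame(frame: bytes):
    """Return (source_mac, pause_quanta) for a PAUSE frame, else None.

    `frame` is a raw Ethernet frame without the trailing FCS.
    """
    if len(frame) < 18:
        return None
    # dst MAC, src MAC, EtherType, MAC control opcode, pause time
    _dst, src, ethertype, opcode, quanta = struct.unpack("!6s6sHHH", frame[:18])
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PAUSE_OPCODE:
        return None
    return ":".join(f"{b:02x}" for b in src), quanta


def pause_seconds(quanta: int, link_bits_per_second: int) -> float:
    """Time for which the receiver must stop transmitting."""
    return quanta * QUANTUM_BITS / link_bits_per_second


# Hypothetical frame: made-up source MAC, maximum pause time, zero padding
# up to the 46-byte minimum payload (the "length 46" that tcpdump reports).
example = (
    PAUSE_DST
    + bytes.fromhex("020000000001")
    + struct.pack("!HHH", MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, 0xFFFF)
    + bytes(42)
)

if __name__ == "__main__":
    src, quanta = parse_pause_frame(example)
    print(src, quanta, pause_seconds(quanta, 100_000_000))
```

On a 100 Mbit/s link a single frame with the maximum pause time asks for
roughly a third of a second of silence, so a host that repeats such frames
a few times a second could plausibly keep its link partner from sending at
all. The source MAC field is what points at the sender.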
