Date:      Sat, 14 Mar 2015 15:04:09 +0100
From:      Bernd Walter <ticso@cicely7.cicely.de>
To:        Ian Lepore <ian@freebsd.org>
Cc:        Tim Kientzle <tim@kientzle.com>, freebsd-arm <freebsd-arm@freebsd.org>
Subject:   Re: BeagleBone slow inbound net I/O
Message-ID:  <20150314140408.GE40951@cicely7.cicely.de>
In-Reply-To: <20150314135954.GD40951@cicely7.cicely.de>
References:  <20150311165115.32327c5a@ivory.wynn.com> <89CEBFCA-6B94-4F48-8DFD-790E4667632D@kientzle.com> <20150314031542.439cdee3@ivory.wynn.com> <1426339400.52318.3.camel@freebsd.org> <20150314135954.GD40951@cicely7.cicely.de>

On Sat, Mar 14, 2015 at 02:59:54PM +0100, Bernd Walter wrote:
> On Sat, Mar 14, 2015 at 07:23:20AM -0600, Ian Lepore wrote:
> > On Sat, 2015-03-14 at 03:15 -0400, Brett Wynkoop wrote:
> > > On Fri, 13 Mar 2015 23:02:25 -0700
> > > Tim Kientzle <tim@kientzle.com> wrote:
> > > 
> > > > 
> > > > > On Mar 11, 2015, at 1:51 PM, Brett Wynkoop <freebsd-arm@wynn.com>
> > > > > wrote:
> > > > > 
> > > > > Have I managed to find a network driver issue?  Any ideas how to
> > > > > gather more information to help get to the bottom of things?
> > > > > 
> > > > 
> > > > $ sysctl dev.cpsw
> > > > 
> > > > This will dump detailed statistics from the Ethernet hardware and
> > > > driver.
> > > > 
> > > > Tim
> > > > 
> > > 
> > > After a short time while doing nfs i/o
> > > 
> > > 
> > > [wynkoop@beaglebone ~]$ sysctl dev.cpsw | grep -i error
> > > dev.cpsw.0.stats.RxCrcErrors: 40
> > > dev.cpsw.0.stats.RxAlignErrors: 32
> > > dev.cpsw.0.stats.CarrierSenseErrors: 0
> > [...]
> > > [wynkoop@beaglebone ~]$ sysctl dev.cpsw | grep -i error
> > > dev.cpsw.0.stats.RxCrcErrors: 262
> > > dev.cpsw.0.stats.RxAlignErrors: 231
> > > dev.cpsw.0.stats.CarrierSenseErrors: 0
> > > [wynkoop@beaglebone ~]$ 
> > > 
> > > So we can see climbing errors.  I am not sure how this compares to the
> > > results of others. The above was during the first few minutes of a
> > > buildworld from an nfs share.
> > > 
> > > At the same time on the console:
> > > 
> > > Mar 14 03:07:47 beaglebone amd[1163]: mountd rpc failed: RPC: Can't decode result
> > > Mar 14 03:11:48 beaglebone amd[1399]: mountd rpc failed: RPC: Can't decode result
> > > 
> > > which makes sense with the above errors I think.
> 
> It doesn't make sense with the Ethernet CRC errors alone, since frames
> that fail the Ethernet CRC are basically dropped packets.
> If the RPC answer can't be parsed, then the Ethernet CRC was OK when the
> MAC verified it, and the data was corrupted later.
> 
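
To illustrate the point about CRC failures being dropped packets, here is a
minimal Python sketch of what a receiving MAC does with the frame check
sequence. It assumes the standard Ethernet CRC-32 (the same polynomial that
zlib.crc32 uses), stored least-significant byte first. The function names are
made up for illustration and are not taken from the cpsw driver.

```python
import zlib

def add_fcs(payload: bytes) -> bytes:
    """Append a CRC-32 frame check sequence, least-significant byte first."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")

def mac_receive(frame: bytes):
    """Return the payload if the FCS matches, else None (frame dropped)."""
    if len(frame) < 4:
        return None
    payload, fcs = frame[:-4], frame[-4:]
    if zlib.crc32(payload).to_bytes(4, "little") != fcs:
        # A real MAC would bump an error counter here (hypothetically the
        # one reported as RxCrcErrors); upper layers never see the frame.
        return None
    return payload

good = add_fcs(b"rpc reply")
damaged = bytes([good[0] ^ 0x01]) + good[1:]  # one bit flipped on the wire

print(mac_receive(good))     # b'rpc reply'
print(mac_receive(damaged))  # None
```

A frame damaged on the wire therefore shows up as a missing packet, not as
undecodable RPC data. Only data corrupted after this check has passed can
reach the upper layers in damaged form.
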
> > On mine:
> > 
> > root@bb:/usr/ports/benchmarks/iperf # sysctl dev.cpsw | grep Err
> > dev.cpsw.0.stats.RxCrcErrors: 0
> > dev.cpsw.0.stats.RxAlignErrors: 0
> > dev.cpsw.0.stats.CarrierSenseErrors: 0
> > 
> > That's after 3 days of uptime including doing builds over nfs, and all
> > the iperf testing I was doing yesterday (no errors after megabytes of
> > transfers).
> > 
> > I wonder if your power supply is failing and injecting transient
> > glitches under heavy load or something?
> 
> Power seems likely.
> The RPC error and the filesystem corruption would make sense with broken
> RAM too, but the Ethernet CRC should be checked during transfer to RAM.
> The failing subsystems are very different and mostly isolated from each
> other, both in software and in hardware logic.
> 
> About ZFS:
> It is designed to handle data corruption to some degree and has an extremely
> different workload, so it is possible that it works by luck.

Just noticed in another of your mails that you have uncorrected errors
with ZFS too.
It is just that ZFS notices the corruption with its checksums and keeps
extra copies of metadata, so it doesn't need to panic as UFS does.
But as a user I would panic with such a system ;-)
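
A rough sketch of that idea, not of ZFS's actual on-disk format or code: each
block is stored with a checksum, metadata is kept in more than one copy, and a
read returns the first copy whose checksum still matches. SHA-256 and all
names below are assumptions chosen for illustration.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    """Stand-in block checksum (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).digest()

def read_verified(copies, expected: bytes) -> bytes:
    """Return the first copy matching the checksum; raise if none does."""
    for data in copies:
        if checksum(data) == expected:
            return data
    raise IOError("uncorrectable: every copy fails its checksum")

block = b"directory metadata"
expected = checksum(block)

# One of two copies is damaged: the read still succeeds from the good copy.
recovered = read_verified([b"directory metadatA", block], expected)

# The only copy is damaged: the error is detected and reported,
# instead of bad data being handed back to the caller.
try:
    read_verified([b"directory metadatA"], expected)
    detected = False
except IOError:
    detected = True
```

With redundant copies the damage is repaired silently. With a single copy it
is at least detected, which is what an uncorrected error report amounts to.
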

-- 
B.Walter <bernd@bwct.de> http://www.bwct.de
Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm.


