Date:      Wed, 13 Feb 2013 05:48:01 -0800
From:      Jeremy Chadwick <jdc@koitsu.org>
To:        CeDeROM <cederom@tlen.pl>
Cc:        freebsd-stable@freebsd.org, freebsd-emulation@freebsd.org, Christian Gusenbauer <c47g@gmx.at>
Subject:   Re: 9.1 AMD64 multitasking efficiency low
Message-ID:  <20130213134801.GA58535@icarus.home.lan>
In-Reply-To: <CAFYkXjm%2B%2BUPYBYVD_BFok5YmRtyKQAsnexRxOZLZCrA1GsMhtg@mail.gmail.com>
References:  <CAFYkXjkACs=2RaCb_BaoNT6PM%2B5gkQSGoKP5G4bR_284v_Eoig@mail.gmail.com> <201302130844.45388.c47g@gmx.at> <CAFYkXjk2PwB1CXihNKNAatc2J72SRc7M0YMayXFfoUExj1yJSQ@mail.gmail.com> <201302131321.49429.c47g@gmx.at> <CAFYkXjm%2B%2BUPYBYVD_BFok5YmRtyKQAsnexRxOZLZCrA1GsMhtg@mail.gmail.com>

On Wed, Feb 13, 2013 at 01:30:53PM +0100, CeDeROM wrote:
> On Wed, Feb 13, 2013 at 1:21 PM, Christian Gusenbauer <c47g@gmx.at> wrote:
> > It has something to do with the drive. I've just connected my external drive
> > to the Intel controller and copied some GB of data around without performance
> > impacts! So my new WDC drive works on both the JMicron and the Intel
> > controller.
> 
> On the other hand these drivers work very well on other operating
> systems like Windows and Linux, so I would rather suspect some
> SCSI/CAM/SATA issues on the FreeBSD side...?

I'm stepping in.  This thread is officially pissing me off.

I have read the thread.  Every post.  Repeatedly.  I went back and read
them all.  Again: every one.  I see all the nuances, all the stuff
you're screwing with, all the stuff you're changing between posts.

All I see are random, wild, insane claims of "SATA/AHCI/CAM issues" or
"WDC disks have problems" or "maybe it's the Intel controller".  Nobody
until halfway through the thread mentioned that USB was involved (heh
heh heh...), nor did anyone disclose that the "system" in question was a
laptop (it matters).  Others with different setups are posting "me too!"
yet their issue may be completely different.  I see no actual useful
data being posted either, just all sorts of vague statements; I find
this very bizarre given that this is a UNIX OS and a UNIX-related
mailing list, yet data is omitted.

And now we have "are you running hald?  Guys try shutting it off".

Stop this madness.  STOP IT.  Stop being (what appears to be)
hyperactive and sit down and actually FIGURE OUT when the issue begins
for you.  It will take you hours, if not an entire day, to do proper
analysis of this.

You will have to try numerous things -- and you will need to take very
precise, very meticulous notes during each thing you do.

You will have to reboot the machine numerous times, because filesystem
caching may be causing you complications.

Stop involving other hardware ("a more powerful machine").  Focus on ONE
MACHINE, do not bring other things into the mix.

Stop using ext2fs.  Use UFS2, and state whether or not you're using
SU, and/or SU+J.  It matters.
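
If you are not sure whether SU or SU+J is on, the filesystem will tell
you.  A minimal sketch, assuming a UFS2 filesystem; the function name
show_ufs_flags and the device name in the usage line are placeholders
of mine, not anything from this thread:

```shell
#!/bin/sh
# Print the soft-updates and SU+J state of a UFS filesystem.
# show_ufs_flags is a hypothetical helper name.
show_ufs_flags() {
    dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: show_ufs_flags <device-or-mountpoint>" >&2
        return 2
    fi
    if ! command -v tunefs >/dev/null 2>&1; then
        echo "tunefs not found (is this a FreeBSD system?)" >&2
        return 1
    fi
    # tunefs -p prints the current tunables; keep only the
    # "soft updates" and "soft update journaling" lines.
    # stderr is merged in because tunefs may print there.
    tunefs -p "$dev" 2>&1 | grep -i 'soft update'
}
```

Run it as, for example, "show_ufs_flags /dev/ada0p2" (substitute your
own partition) and paste the result into your post.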

Provide dmesg output from the machine in question (straight off a fresh
reboot), with the USB drive attached.

Provide "pciconf -lvbc" output from that machine.

Provide "gpart show" output for each drive.

Provide SMART statistics for the hard disks involved (the USB one, as
well as the internal one).  ports/sysutils/smartmontools.  smartctl -a
output is what I want.  If you can't get the data from the USB one (you
may have to use "smartctl --scan" and try its flag recommendations), get
a different USB enclosure that has a USB/SATA bridge that permits SMART
pass-through.
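
The commands above can be gathered into one file so nothing gets left
out of the post.  This is only a sketch: the helper names log_cmd and
collect_diag, the output path, and the disk names in the usage line are
my own placeholders.  Run it as root straight after a fresh reboot with
the USB drive attached.

```shell
#!/bin/sh
# Append one command's output to $DIAG_OUT under a header line.
# A missing or failing command is noted instead of aborting the run.
log_cmd() {
    printf '===== %s =====\n' "$*" >> "$DIAG_OUT"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >> "$DIAG_OUT" 2>&1 ||
            printf '(exit status %s)\n' "$?" >> "$DIAG_OUT"
    else
        printf '(command not found: %s)\n' "$1" >> "$DIAG_OUT"
    fi
    printf '\n' >> "$DIAG_OUT"
}

# collect_diag <output-file> <disk> [<disk> ...]
collect_diag() {
    DIAG_OUT="$1"
    shift
    : > "$DIAG_OUT" || return 1       # start with an empty file
    log_cmd dmesg
    log_cmd pciconf -lvbc
    for disk in "$@"; do
        log_cmd gpart show "$disk"
        log_cmd smartctl -a "/dev/$disk"
    done
}
```

Example: "collect_diag /tmp/diag.txt ada0 da0", where ada0 and da0 stand
in for your internal and USB disks.  If smartctl cannot talk to the USB
disk, the failure is recorded in the file, which is itself useful data.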

At some point (your choice -- come up with a plan!) take the USB drive
out of the picture.  Add a 2nd internal drive and try doing I/O to/from
that instead.  Figure it out.

And finally -- start running "gstat -I500ms" in another VTY while using
the system.  This will give you some idea of the I/O workload that's
going on, on a per-device level.  If you see a device that should be
getting, say, 150MBytes/second yet is only getting 7MBytes/second, then
that may be an indication of where to focus.
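
If watching the screen is not practical, a small filter can flag the
slow devices for you.  A rough sketch, assuming your gstat supports
batch mode (-b) and prints its default columns (read kBps in column 4,
write kBps in column 7, device name last); check that against your own
output first.  The name slow_devices and the threshold are placeholders.

```shell
#!/bin/sh
# Read gstat batch-mode text on stdin and print every device whose
# combined read+write rate is non-zero but below the given floor.
# ASSUMPTION: default gstat columns, i.e.
#   L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
slow_devices() {
    floor_kbps="$1"
    awk -v floor="$floor_kbps" '
        # Data rows start with a numeric queue length; header rows do not.
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            total = $4 + $7
            if (total > 0 && total < floor)
                printf "%s %d kBps\n", $NF, total
        }'
}
```

Example: "gstat -b -I500ms | slow_devices 50000" while a large copy is
in progress.  Idle devices are skipped on purpose, so only devices that
are actually doing I/O, and doing it slowly, are listed.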

If you want an example of a bug that took me an afternoon to track down,
and a good 30-40 reboots and having to take audio recordings (pocket
recorder) while performing physical tasks, just to figure out how to
reproduce the problem, here you go:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-January/016324.html

*That* is what is needed here.

I am more than happy to help you analyse problems relating to hard disk
performance -- I can assure you CAM/ahci(4) and related bits are in good
shape, barring weird/bizarre chipset revision oddities (common with the
mobile chipset versions, i.e. ICHxxM) -- but the information needs
to be provided coherently.

Take an afternoon to figure out what the commonality is.  I can expand
on all sorts of levels about hard disk performance, all the way down to
PCB cache going bad or excessive ECC impacting things, but there's no
point in speculating or going there until evidence shows that.

And please, no "me too" posts.  As said, each issue should be treated
separately.

Figure out where the commonality is through trial and error, then post
those results here.  Nobody can help when arms are flailing to this
degree.  Got it?

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
