FreeBSD Mail Archives

Date:      Tue, 15 Feb 2011 12:29:37 +1030
From:      "Daniel O'Connor" <doconnor@gsoft.com.au>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        freebsd-hackers@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject:   Re: Scheduler question
Message-ID:  <29157A8B-5B2C-4A08-9DE9-B8917F054DB1@gsoft.com.au>
In-Reply-To: <201102102028.p1AKSnAl067323@apollo.backplane.com>
References:  <53A394ED-7C2E-4E4B-A9A7-CB5F1B27DBE3@gsoft.com.au> <iignas$kbc$1@dough.gmane.org> <AA0DA14A-A8E5-48C5-AABE-ECCB02C59D19@gsoft.com.au> <iii67s$ngs$1@dough.gmane.org> <7108E013-77D1-47F8-892E-5027DB7D432B@gsoft.com.au> <990005CD-39BD-45F6-BD07-ACEE79DF5A03@gsoft.com.au> <AANLkTincOBWKhr9qWX_dFmukWeUCG61aYT0AXd-VYYTu@mail.gmail.com> <772B352C-7241-4326-8B49-3FB675896609@gsoft.com.au> <AANLkTik_kxCgtjTBshEc4zF5X%2BSKV6L1R4waYyLeM_r3@mail.gmail.com> <71549325-5FD1-4516-B49E-098AFBEFE8B7@gsoft.com.au> <AANLkTikbPsF9d3GTDj0JtrLL%2BML=3_f1ShU0e5H7hYEM@mail.gmail.com> <88A9D148-7E3C-4ADB-90FC-B95C4D3BBD2E@gsoft.com.au> <201102102028.p1AKSnAl067323@apollo.backplane.com>


On 11/02/2011, at 6:58, Matthew Dillon wrote:
>   It sounds like there are at least two issues involved.
>
>   The first could be a buffer cache starvation issue due to the load on
>   the filesystem from the tar.  If the usb program is doing any filesystem
>   operation at all, even at low bandwidths, it could be hitting blockages
>   due to the disk intensive tar eating up available buffer cache buffers
>   (e.g. causing an excessive ratio of dirty buffers vs clean buffers).
>   This would NOT be a scheduler problem per se, but instead a kernel
>   resource management problem.

OK..
Note that my program is split into 2 threads and queues up a large
number of buffers. One thread just calls the libusb event handler so if
the main thread is blocked for IO it should still run.. right? :)
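
For illustration only, a rough Python model of that two-thread layout, not the actual libusb program. The names `run_capture`, `sink` and `depth` are hypothetical, and the drop-on-full policy is an assumption standing in for a hardware FIFO overrun. Python threads are also scheduled differently from C threads, so this shows only the queue structure, not real timing behaviour.

```python
import queue
import threading


def run_capture(blocks, sink, depth=64):
    """Event thread enqueues blocks; the main thread drains them into sink.

    Returns (written, dropped). A block is dropped when the bounded queue
    is full, i.e. when the consumer has fallen too far behind.
    """
    pending = queue.Queue(maxsize=depth)
    dropped = 0

    def event_loop():
        nonlocal dropped
        for block in blocks:
            try:
                # Never block here: this thread stands in for the one that
                # only services the (hypothetical) USB event handler.
                pending.put_nowait(block)
            except queue.Full:
                dropped += 1
        pending.put(None)  # sentinel; waits for space if the queue is full

    producer = threading.Thread(target=event_loop)
    producer.start()

    written = 0
    while True:
        block = pending.get()
        if block is None:
            break
        sink(block)  # the potentially slow I/O happens on this thread only
        written += 1
    producer.join()
    return written, dropped
```

In this model a slow `sink` costs data only once the queue fills, so drops that persist with a deep queue would point at the event thread itself not getting to run.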

>   The way to test this is to double-buffer or triple-buffer the output
>   via shared memory.  A pipe might not do the job if it gets stuck doing
>   direct transfers (I eventually gave up trying to optimize pipes in DFly
>   due to a similar problem and just run everything through a kernel buffer
>   now).  Still, it may be possible to test against this particular problem
>   by having the program write to a pipe and another program or fork handle
>   the actual writing to the disk or filesystem.

Hmm.. in effect I have this as I write all data to disk via mbuffer and
this did help, but it still drops out which seems to indicate to me that
my libusb event loop thread is being stalled.

Note that the total CPU consumed by it is very low (<1%) and that thread
does no I/O.
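
A minimal sketch of the "write to a pipe and let a fork do the disk writes" test suggested above, in Python for brevity. The function name `write_via_child` and the 64 KiB read size are hypothetical choices; in the setup described here mbuffer plays the role of the child.

```python
import os


def write_via_child(chunks, path):
    """Send chunks through a pipe to a forked child that does the disk writes.

    Returns (bytes_sent, child_exit_code).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the only process that touches the filesystem.
        status = 1
        try:
            os.close(write_fd)
            with open(path, "wb") as out:
                while True:
                    block = os.read(read_fd, 65536)
                    if not block:  # EOF: the parent closed its end
                        break
                    out.write(block)
            status = 0
        finally:
            os._exit(status)

    # Parent: stands in for the acquisition side; it only writes to the pipe.
    os.close(read_fd)
    sent = 0
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while len(view) > 0:
                n = os.write(write_fd, view)  # may be a partial write
                view = view[n:]
                sent += n
    finally:
        os.close(write_fd)
    _, wait_status = os.waitpid(pid, 0)
    return sent, os.WEXITSTATUS(wait_status)
```

The kernel pipe buffer is small, so the parent still blocks in `os.write` once the child falls behind. That would be consistent with a buffering stage helping without removing the dropouts.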
>
>   Another way to test this is to comment out the writing in the usb program
>   entirely and see if things improve.

If I write to /dev/null it works fine.

>   The second issue sounds more scheduler-related.  Try running the
>   usb program at nice -20?  You could even run it at a pseudo-realtime
>   priority using rtprio but nice -20 had better work properly against
>   a md5 or there is something seriously broken in the scheduler.

Unfortunately neither of these improves things. I am pretty surprised a
nice -20 or rtprio'd thread doesn't beat a pure CPU user doing no IO :(
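
For reference, a small sketch of requesting the nice value from inside a process instead of via the `nice` command. The helper name `raise_priority` is hypothetical. Negative nice values normally require root, so the sketch reports the value actually in effect rather than assuming the request succeeded. It does not cover rtprio.

```python
import os


def raise_priority(target=-20):
    """Try to set this process's nice value; return the value now in effect."""
    try:
        os.setpriority(os.PRIO_PROCESS, 0, target)  # 0 means "this process"
    except OSError:
        # Typically a permission error when lowering the nice value
        # without sufficient privilege; keep the current priority.
        pass
    return os.getpriority(os.PRIO_PROCESS, 0)
```

Checking the returned value is a cheap way to rule out the priority change having been silently refused.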
>
>   Dynamic priority handling is supposed to deal with this sort of thing
>   automatically, particularly if the usb program is not using a lot of
>   cpu, but sometimes it can't tell whether a newly-exec'd program is
>   going to be interactive or batch until after it has run for a while.
>
>   Tuning initial conditions after an exec for the scheduler is not an
>   easy task.  Simply giving a program a more batch/bulk-run priority on
>   exec and letting the dynamic priority shift it more to interactive
>   operation tends to mess up interactive shells in the face of
>   cpu-intensive system operation, for example.  Theoretically dynamic
>   priority handling should bump up the priority for the usb program well
>   beyond any initial conditions for exec once it has been running a while,
>   assuming it doesn't use tons of cpu.

Hmm.. It is unfortunate the hinting mechanisms are very coarse :(

>   An md5, or any single-file reading operation, would not overload the
>   buffer cache.  File writing and/or multi-file operations (such as a
>   tar extraction or a tar-up) can create blockages in the buffer cache.

The md5 process is just reading /dev/null - I run it to soak up the CPU
because in production the system will be doing CPU intensive data
analysis.

>   It takes a considerable amount of VM/buffer-cache tuning to get those
>   subsystems to pipeline properly and sometimes things can go stale and
>   stop pipelining properly for months without anyone realizing it.

:(
I am waiting on a new buffer card with 8 times bigger FIFOs which should
help I hope..

Also I am writing a kernel driver in the hope it will be more robust :)

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?29157A8B-5B2C-4A08-9DE9-B8917F054DB1>
