Date: Tue, 15 Feb 2011 12:29:37 +1030
From: "Daniel O'Connor" <doconnor@gsoft.com.au>
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: freebsd-hackers@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Scheduler question
Message-ID: <29157A8B-5B2C-4A08-9DE9-B8917F054DB1@gsoft.com.au>
In-Reply-To: <201102102028.p1AKSnAl067323@apollo.backplane.com>
References: <53A394ED-7C2E-4E4B-A9A7-CB5F1B27DBE3@gsoft.com.au>
 <iignas$kbc$1@dough.gmane.org>
 <AA0DA14A-A8E5-48C5-AABE-ECCB02C59D19@gsoft.com.au>
 <iii67s$ngs$1@dough.gmane.org>
 <7108E013-77D1-47F8-892E-5027DB7D432B@gsoft.com.au>
 <990005CD-39BD-45F6-BD07-ACEE79DF5A03@gsoft.com.au>
 <AANLkTincOBWKhr9qWX_dFmukWeUCG61aYT0AXd-VYYTu@mail.gmail.com>
 <772B352C-7241-4326-8B49-3FB675896609@gsoft.com.au>
 <AANLkTik_kxCgtjTBshEc4zF5X%2BSKV6L1R4waYyLeM_r3@mail.gmail.com>
 <71549325-5FD1-4516-B49E-098AFBEFE8B7@gsoft.com.au>
 <AANLkTikbPsF9d3GTDj0JtrLL%2BML=3_f1ShU0e5H7hYEM@mail.gmail.com>
 <88A9D148-7E3C-4ADB-90FC-B95C4D3BBD2E@gsoft.com.au>
 <201102102028.p1AKSnAl067323@apollo.backplane.com>

On 11/02/2011, at 6:58, Matthew Dillon wrote:
> It sounds like there are at least two issues involved.
>
> The first could be a buffer cache starvation issue due to the load on
> the filesystem from the tar. If the usb program is doing any filesystem
> operation at all, even at low bandwidths, it could be hitting blockages
> due to the disk intensive tar eating up available buffer cache buffers
> (e.g. causing an excessive ratio of dirty buffers vs clean buffers).
> This would NOT be a scheduler problem per se, but instead a kernel
> resource management problem.

OK. Note that my program is split into 2 threads and queues up a large
number of buffers. One thread just calls the libusb event handler, so if
the main thread is blocked for IO it should still run.. right? :)

> The way to test this is to double-buffer or triple-buffer the output
> via shared memory. A pipe might not do the job if it gets stuck doing
> direct transfers (I eventually gave up trying to optimize pipes in DFly
> due to a similar problem and just run everything through a kernel buffer
> now). Still, it may be possible to test against this particular problem
> by having the program write to a pipe and another program or fork handle
> the actual writing to the disk or filesystem.

Hmm.. in effect I have this already, as I write all data to disk via
mbuffer. This did help, but it still drops out, which seems to indicate
to me that my libusb event loop thread is being stalled.

Note that the total CPU consumed by it is very low (<1%) and that thread
does no I/O.

> Another way to test this is to comment out the writing in the usb program
> entirely and see if things improve.

If I write to /dev/null it works fine.

> The second issue sounds more scheduler-related. Try running the
> usb program at nice -20? You could even run it at a pseudo-realtime
> priority using rtprio but nice -20 had better work properly against
> a md5 or there is something seriously broken in the scheduler.

Unfortunately neither of these improves things. I am pretty surprised
that a nice -20 or rtprio'd thread doesn't beat a pure CPU user doing no
IO :(

> Dynamic priority handling is supposed to deal with this sort of thing
> automatically, particularly if the usb program is not using a lot of
> cpu, but sometimes it can't tell whether a newly-exec'd program is
> going to be interactive or batch until after it has run for a while.
>
> Tuning initial conditions after an exec for the scheduler is not an
> easy task. Simply giving a program a more batch/bulk-run priority on
> exec and letting the dynamic priority shift it more to interactive
> operation tends to mess up interactive shells in the face of
> cpu-intensive system operation, for example. Theoretically dynamic
> priority handling should bump up the priority for the usb program well
> beyond any initial conditions for exec once it has been running a while,
> assuming it doesn't use tons of cpu.

Hmm.. It is unfortunate that the hinting mechanisms are very coarse :(

> An md5, or any single-file reading operation, would not overload the
> buffer cache. File writing and/or multi-file operations (such as a
> tar extraction or a tar-up) can create blockages in the buffer cache.

The md5 process is just reading /dev/null - I run it to soak up the CPU
because in production the system will be doing CPU intensive data
analysis.

> It takes a considerable amount of VM/buffer-cache tuning to get those
> subsystems to pipeline properly and sometimes things can go stale and
> stop pipelining properly for months without anyone realizing it.

:(

I am waiting on a new buffer card with FIFOs 8 times bigger, which
should help, I hope..

Also I am writing a kernel driver in the hope it will be more robust :)

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
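
For readers following the double/triple-buffering suggestion quoted above, here is a minimal sketch of one way to hand buffers from a time-critical thread (such as a USB event thread) to a separate disk-writing thread without the first one ever blocking. It is not code from this thread and not how mbuffer or libusb work internally; `struct ring`, `ring_push`, `ring_pop`, `SLOT_COUNT` and `SLOT_SIZE` are hypothetical names chosen for illustration. It assumes exactly one producer thread and one consumer thread, and uses only C11 atomics.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SLOT_COUNT 4     /* hypothetical: buffers in flight, must be a power of two */
#define SLOT_SIZE  4096  /* hypothetical: capacity of each buffer in bytes */

struct ring {
    unsigned char slots[SLOT_COUNT][SLOT_SIZE];
    size_t        lens[SLOT_COUNT];
    atomic_size_t head;  /* count of pushes, advanced only by the producer */
    atomic_size_t tail;  /* count of pops, advanced only by the consumer */
};

void ring_init(struct ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/*
 * Producer side (e.g. the thread servicing USB events).
 * Never blocks: returns false if the ring is full or len is too large,
 * so the caller can count an overrun instead of stalling.
 */
bool ring_push(struct ring *r, const void *data, size_t len)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (len > SLOT_SIZE || head - tail == SLOT_COUNT)
        return false;
    memcpy(r->slots[head % SLOT_COUNT], data, len);
    r->lens[head % SLOT_COUNT] = len;
    /* Release: the slot contents become visible before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/*
 * Consumer side (e.g. the thread doing the possibly slow disk writes).
 * 'out' must have room for SLOT_SIZE bytes. Returns false if empty.
 */
bool ring_pop(struct ring *r, void *out, size_t *len)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return false;
    *len = r->lens[tail % SLOT_COUNT];
    memcpy(out, r->slots[tail % SLOT_COUNT], *len);
    /* Release: the slot is handed back only after it has been copied out. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

In use, the event thread would call `ring_push()` from its completion callback and the writer thread would loop on `ring_pop()` followed by `write()`. A ring like this only removes blocking inside the program itself; it cannot help if the event thread is not being scheduled at all, which is the other possibility discussed in the message.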