Date: Tue, 17 Apr 2012 21:15:58 +0200
From: Gary Jennejohn <gljennjohn@googlemail.com>
To: Adrian Chadd <adrian@freebsd.org>
Cc: freebsd-hackers <freebsd-hackers@freebsd.org>, Jerry Toung <jrytoung@gmail.com>
Subject: Re: CAM disk I/O starvation
Message-ID: <20120417211558.4793b705@ernst.jennejohn.org>
In-Reply-To: <CAJ-VmokwR%2BVHmup6OLN%2BBGHvoAeLvJ9%2BBeZ9Fm6xM7Pio73pzQ@mail.gmail.com>
References: <CADC0LV=-e%2B7PshRQdc69e2-Vktf6XFpVrqiMpx=QL4m_%2B9hSnw@mail.gmail.com> <20120403193124.46ad9de9@ernst.jennejohn.org> <CADC0LVm1HY2Dz%2BVk_GK35szRS6ySviLhMiL1TSRBOnPwQnBwRg@mail.gmail.com> <20120411192153.5672b62c@ernst.jennejohn.org> <CAJ-VmokwR%2BVHmup6OLN%2BBGHvoAeLvJ9%2BBeZ9Fm6xM7Pio73pzQ@mail.gmail.com>

On Mon, 16 Apr 2012 14:39:12 -0700
Adrian Chadd <adrian@freebsd.org> wrote:
> On 11 April 2012 10:21, Gary Jennejohn <gljennjohn@googlemail.com> wrote:
>
> > Just for the archive: my bad disk performance seems to have been fixed in
> > HEAD by svn commit r234074. Seems that all interrupts were being handled
> > by a single CPU/core (I have 6), which resulted in abysmal interrupt
> > handling when multiple disks were busy.
> >
> > Since this commit my disk performance is back to normal and long lags
> > are a thing of the past.
>
> Hi,
>
> This is kind of worrying. You only have a few disks, a single core
> SHOULD be able to handle all the interrupts for those disks whilst
> leaving plenty of cycles to spare to drive the rest of your system.
> And you have 5 other cores.
>
> Would you be willing to help diagnose exactly why that particular
> behaviour is causing you so much trouble? It almost sounds like
> something in the IO path is blocking for far too long, not allowing
> the rest of the system to move forward. That's very worrying for an
> interrupt handler. :)
>
Yes, I agree completely. My first thought was that disk I/O
scheduling had somehow been pessimized. But then I thought -
wait a minute, I have disk caches enabled and command queuing is
enabled for all of them, so that shouldn't really have any
noticeable impact. So I was at a loss to explain why disk performance
had suddenly gotten so bad.

I'd be willing to spend some time on diagnosing it, but I have to come
up with a scenario which would reliably reproduce the problem. AFAICR
it generally happened when I was running csup/svn because my CVS
repository is on one disk and /usr/{ports,src} are on a different one.
I still have the old problem kernel around, but it's probably not
instrumented for any meaningful diagnosis.
--
Gary Jennejohn
