Date:      Mon, 2 Apr 2012 10:55:31 -0700
From:      Jerry Toung <jrytoung@gmail.com>
To:        freebsd-hackers <freebsd-hackers@freebsd.org>
Subject:   CAM disk I/O starvation
Message-ID:  <CADC0LV=-e+7PshRQdc69e2-Vktf6XFpVrqiMpx=QL4m_+9hSnw@mail.gmail.com>

Hello list,
I am convinced that there is a bug in the CAM code that leads to I/O starvation.
I have already discussed this privately with a few people and am now bringing
it to the list to get more feedback.

My setup is one RAID controller with two arrays attached, da0 and da1.
The controller supports 252 tags. After boot, "camcontrol tags" on da0
and da1 shows 252 openings for each device. A process P0 writing to da0
is dormant most of the time, but wakes up with bursts of I/O, 5000-6000
ops as reported by gstat.
A process P1 writing to da1 has a fixed data rate, as reported by gstat.

The issue: when P0 generates a burst of 5000-6000 ops, the write rate
of P1 on da1 drops to 0 MB/s for up to 8-9 seconds,
vfs.hirunningspace starts climbing and we end up in the waithirunning()
or getblk() sleep channel. BTW, raising hirunningspace has no effect on
the 0 MB/s behavior.

The first problem I see is that if the sim's devq has 252 openings on
its alloc_queue and send_queue, the struct cam_ed representing da0 and
da1 should each have 126 openings, not 252. The second problem is that
there is clearly no I/O fairness in CAM: as seen in the gstat output,
da0 takes exclusive hold of the sim/controller until it has processed
all its I/Os (8-9 seconds). The code that does this is at

cam/cam_xpt.c:3030
3030             && (devq->alloc_openings > 0)

and

cam/cam_xpt.c:3091
3091             && (devq->send_openings > 0)

Once the openings are split to 126 per device, the tests above will always
be true.
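
To make the arithmetic concrete, here is a small stand-alone sketch. It is
a toy model, not CAM code: split_openings() and sim_check_always_true() are
hypothetical helpers invented for illustration, and the model assumes
exactly two devices whose in-flight counts never exceed their per-device
share. It checks that, with a share of 126 each, a device that still has a
free opening of its own can never find the controller's 252 openings
exhausted, whereas with 252 each it can.

```c
#include <assert.h>

/* Hypothetical helper: divide a controller's tag count evenly among devices. */
int split_openings(int sim_openings, int ndevices)
{
    assert(ndevices > 0);
    return sim_openings / ndevices;
}

/*
 * Toy check for two devices. For every pair of in-flight counts (a, b)
 * within the per-device share, see whether a device with a free opening
 * of its own could still find the controller full. Returns 1 if the
 * controller-level "openings > 0" test would always pass, 0 otherwise.
 */
int sim_check_always_true(int sim_openings, int share)
{
    for (int a = 0; a <= share; a++) {
        for (int b = 0; b <= share; b++) {
            int sim_free = sim_openings - a - b;
            int some_dev_has_room = (a < share) || (b < share);
            if (sim_free < 0)
                continue;   /* more in flight than the controller has tags */
            if (some_dev_has_room && sim_free == 0)
                return 0;   /* controller-level test can fail */
        }
    }
    return 1;
}
```

With these definitions, split_openings(252, 2) gives 126,
sim_check_always_true(252, 126) returns 1, and
sim_check_always_true(252, 252) returns 0 (for example, da0 holding all
252 tags while da1 still has openings of its own).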

I have a patch that fixes both problems and can share it with the list
on request.
With it, da0 and da1 each automatically get 126 openings and, based on
that, extra logic in cam/cam_xpt.c implements fairness. No more 0 MB/s
on da1. This is on FreeBSD 8.1-RELEASE.

Any comments welcome.

Thanks,
Jerry


