Date: Fri, 6 Feb 2009 16:02:52 +0100 From: Luigi Rizzo <rizzo@iet.unipi.it> To: arch@freebsd.org, Luigi Rizzo <rizzo@iet.unipi.it>, Fabio Checconi <fabio@gandalf.sssup.it>, pjd@FreeBSD.org, kmacy@freebsd.org Subject: RFC: upcoming fixes to bioq_disksort() and friends. Message-ID: <20090206150252.GB11500@onelab2.iet.unipi.it>
next in thread | raw e-mail | index | archive | help
Hi, [people in Bcc were involved in a previous thread] we recently discovered that bioq_disksort() does not behave as intended, and the consensus on -developers was to revert the behaviour to the one in subr_disklabel.c v.1.1. Given that there are some corner cases that neither the original nor the current code address, we would like to describe how we are going to update the code. Please let us know if you have objections or there is anything unclear. The main methods for manipulating the bioq are bioq_disksort(), which does an ordered insertion, and bioq_first()/bioq_takefirst() which poll or extract the next request from the queue. This was the original API, and the behaviour of the bioq in this case is well specified. However, subr_disk.c also introduced and exported functions to directly manipulate the TAILQ embedded in the bioq. The behaviour of these functions in terms of keeping track of the disk head is not specified, and as a consequence, if you intermix bioq_disksort() and bioq_first()/bioq_takefirst() with the other calls, the ordering of the bioq could be completely disrupted [NOTE 1]. We are going to specify (and implement) the behaviour of the bioq as follows: 1. a bioq defines the current head position, last_offset, as the bio_offset of the first byte after [NOTE 2] the last entry extracted via bioq_takefirst(). 2. bioq_first() returns the next request with bio_offset >= last_offset, leaving the request in the queue; 3. bioq_takefirst() returns the next request with bio_offset >= last_offset, extracting it from the queue and setting last_offset according to #1 4. bioq_disksort() sorts requests using (uoff_t)(bio_offset - last_offset) as the sorting key (the offset space wraps); 5. bioq_insert_tail() appends to the end of the queue and DOES NOT change last_offset; 6. bioq_insert_head() inserts at the head of the queue and sets last_offset = bio_offset [NOTE 3]; 7. in terms of implementation, the field insert_point in struct bioq_head now becomes useless. However we are not going to remove it, so that parts of the kernel that #include bio.h do not need to be rebuilt. The change is only in the behaviour of the functions, and should be considered as a fix rather than a change of features. NOTES: 1. There are only two files that do this, sys/geom/mirror/g_mirror.c and sys/dev/xen/blkfront/blkfront.c. The g_vinum code also has some mix of code but it seems just a result of implementing two different queues with a single bioq, something that should be fixed anyways. 2. Historical behaviour is to just set last_offset = bio_offset. Accounting for the request size makes the code track the head position slightly better (assuming, of course, that we can make assumptions at all on the head position). But this makes a difference only in some corner cases. 3. The update of last_offset guarantees that a bioq_disksort() after a bioq_insert_head() puts the new entry after the one inserted bioq_insert_head(). We can leave last_offset untouched, in which case we lose that guarantee. Given that there is no precedent behaviour to preserve, both approaches are equally good, but we would prefer the one proposed (updating last_offset) as it makes it easier to reason on the behaviour of the queue. cheers luigi and fabio
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20090206150252.GB11500>