Date: Fri, 07 Sep 2007 15:03:22 +0200
From: Jean-Sébastien Pédron <dumbbell@freebsd.org>
To: freebsd-arch@freebsd.org
Subject: Pipe direct write and pipeselwakeup()
Message-ID: <46E14C1A.1060606@freebsd.org>
This is a multi-part message in MIME format.
--------------060407030105010800010608
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

I'm investigating a problem with select/poll/kevent not being triggered
when writing to a pipe. Here I explain what I understood and, at the end
of this mail, I propose a patch. I would like to have feedback about
this solution.

The problem comes from the way pipes are implemented. The kernel uses
two ways to write data on a pipe:
    o  buffered write. This is done when there are fewer than 8192 bytes
       (PIPE_MINDIRECT) in the _current_ iov. Data from _all_ iov are
       uiomove()'d to an internal buffer until there's no more data or
       the buffer is full.
    o  direct write. This is done when there are at least 8192 bytes in
       the current iov.

The two techniques can't be mixed. So during a single call to writev(2),
if there's a need to switch from one to the other, the kernel must wake
up both the reader processes and select/poll/kqueue before the write
continues.

But when switching from direct write to buffered write, the kernel only
wakes the reader processes up, not select/poll/kqueue.

Someone provided me with a testcase to reproduce the bug. I attached the
sources to this mail ("rd.c" and "wr.c"). Use it like this:
    ./rd ./wr

Here is what's going on with this testcase:
    1.  the first iov is smaller than 8192 bytes (1 or 2 bytes), so
        buffered write is selected.
    2.  the kernel internal buffer is 65536 bytes long, so uiomove()
        will fill it completely with the data (73727 or 73728 bytes in
        total). At the end, 8191 or 8192 bytes remain, depending on
        TRIGGER_WRITEV_BUG in "wr.c".
    3a. with 8191 bytes remaining, buffered write is still selected but
        the buffer is full: readers and selects are awakened.
        Everything's fine.
    3b. with 8192 bytes remaining, direct write is selected. It sees
        that the internal buffer is in use: readers are awakened (so
        the buffer can be flushed) but not selects.
        Here, the select/poll/kevent times out.

There are 3 cases where only readers are awakened. The attached patch
adds calls to pipeselwakeup(). This fixes the testcase, but I'd like to
know if there was a good reason not to call pipeselwakeup() in these 3
specific cases.

Also, in the third case, the PIPE_WANTW flag isn't set either. I think
it should be set too. What do you think?

Thanks for any feedback!

- -- 
Jean-Sébastien Pédron
http://www.dumbbell.fr/

PGP Key: http://www.dumbbell.fr/pgp/pubkey.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG4UwZa+xGJsFYOlMRAqKBAJwLx+9WoQmPs4pa8VEPOzT2b5r3VQCfarLY
giS8UUEYvNuUQGBqtJ4jhJU=
=Rato
-----END PGP SIGNATURE-----

--------------060407030105010800010608
Content-Type: text/plain; name="rd.c"
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename="rd.c"

Ci8qICJnY2MgLVdhbGwgLW8gcmQgcmQuYyIgKi8KLyogImdjYyAtRF9SRUVOVFJBTlQgLVdh bGwgLW8gcmQgcmQuYyAtbGNfciIgKi8KI2luY2x1ZGUgPGZjbnRsLmg+CiNpbmNsdWRlIDxz eXMvdHlwZXMuaD4KI2luY2x1ZGUgPHN5cy9ldmVudC5oPgojaW5jbHVkZSA8c3lzL3RpbWUu aD4KI2luY2x1ZGUgPHN5cy91aW8uaD4KI2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlIDxz dGRsaWIuaD4KI2luY2x1ZGUgPHVuaXN0ZC5oPgojaW5jbHVkZSA8c3RyaW5nLmg+CiNpbmNs dWRlIDxlcnJuby5oPgojaW5jbHVkZSA8cG9sbC5oPgojaW5jbHVkZSA8c3lzL3NlbGVjdC5o PgoKI2lmIDEKI2RlZmluZSBVU0VfS1EKI2VsaWYgMQojZGVmaW5lIFVTRV9QT0xMCiNlbmRp ZgoKI2RlZmluZSBFVl9USU1FT1VUIDEwCgppbnQKbWFpbihpbnQgYXJnYywgY2hhciAqYXJn dltdKQp7CiAgICBpbnQgZmRzWzJdOwogICAgaW50IHJlczsKCiAgICBpZiAoYXJnYyAhPSAy KSB7CglmcHJpbnRmKHN0ZGVyciwgIlVzYWdlOiByZCA8cGF0aCB0byB3cml0ZXI+XG4iKTsK CWV4aXQoMSk7CiAgICB9CgogICAgaWYgKHBpcGUoZmRzKSA8IDApIHsKCXBlcnJvcigicGlw ZSgpIGZhaWxlZCIpOwoJZXhpdCgxKTsKICAgIH0KCiAgICBmcHJpbnRmKHN0ZGVyciwgInBp cGUgZmRzPXslZCwgJWR9XG4iLCBmZHNbMF0sIGZkc1sxXSk7CgogICAgZnByaW50ZihzdGRl cnIsICJzZXR0aW5nICVkIGluIG5vbi1ibG9ja2luZyBtb2RlXG4iLCBmZHNbMF0pOwogICAg ZmNudGwoZmRzWzBdLCBGX1NFVEZMLCBmY250bChmZHNbMF0sIEZfR0VURkwsIDApIHwgT19O
T05CTE9DSyk7CgogICAgcmVzID0gZm9yaygpOwogICAgaWYgKHJlcyA8IDApIHsKCXBlcnJv cigiZm9yaygpIGZhaWxlZFxuIik7CglleGl0KDEpOwogICAgfQogICAgZWxzZSBpZiAocmVz ID09IDApIHsKCWNsb3NlKDEpOwoJZHVwKGZkc1sxXSk7CgljbG9zZShmZHNbMF0pOwoJY2xv c2UoZmRzWzFdKTsKCWV4ZWNsKGFyZ3ZbMV0sIGFyZ3ZbMV0sIE5VTEwpOwoJcGVycm9yKCJl eGVjbCgpIGZhaWxlZFxuIik7CglleGl0KDEpOwogICAgfQoKICAgIGNsb3NlKGZkc1sxXSk7 CgojaWZkZWYgVVNFX0tRCiAgICB7CglpbnQga3E7CglpbnQgaTsKCXN0cnVjdCBrZXZlbnQg ZXZbMTBdOwoJc3RydWN0IHRpbWVzcGVjIHR2ID0gezAsIDB9OwoJc3RydWN0IGtldmVudCBj aGdbMV07CgoJa3EgPSBrcXVldWUoKTsKCWlmIChrcSA8IDApIHsKCSAgICBwZXJyb3IoImtx dWV1ZSgpIGZhaWxlZCIpOwoJICAgIGV4aXQoMSk7Cgl9CgkKCWZwcmludGYoc3RkZXJyLCAi c2V0dGluZyBFVkZJTFRfUkVBRCBvbiAlZFxuIiwgZmRzWzBdKTsKCUVWX1NFVCgmY2hnWzBd LCBmZHNbMF0sIEVWRklMVF9SRUFELCBFVl9BREQsIDAsIDAsICh2b2lkICopIDIpOwoJcmVz ID0ga2V2ZW50KGtxLCAmY2hnWzBdLCAxLCBOVUxMLCAwLCAmdHYpOwoJaWYgKHJlcyA8IDAp IHsKCSAgICBwZXJyb3IoImtldmVudCgpIGZhaWxlZFxuIik7CgkgICAgZXhpdCgxKTsKCX0K CWVsc2UgewoJICAgIGZwcmludGYoc3RkZXJyLCAia2V2ZW50KCkgcmV0dXJuZWQgPSAlZFxu IiwgcmVzKTsKCX0KCXR2LnR2X3NlYyA9IEVWX1RJTUVPVVQ7Cgl0di50dl9uc2VjID0gMDsK CWZwcmludGYoc3RkZXJyLCAia2V2ZW50IHdhaXRpbmcgZm9yICVkIHNlY3MgZm9yIGV2ZW50 cy4uLlxuIiwKCQlFVl9USU1FT1VUKTsKCXJlcyA9IGtldmVudChrcSwgTlVMTCwgMCwgJmV2 WzBdLCAxMCwgJnR2KTsKCWlmIChyZXMgPCAwKSB7CgkgICAgcGVycm9yKCJrZXZlbnQgZmFp bGVkXG4iKTsKCSAgICBleGl0KDEpOwoJfQoJZWxzZSBpZiAocmVzID09IDApIHsKCSAgICBm cHJpbnRmKHN0ZGVyciwgImtldmVudCB0aW1lZCBvdXRcbiIpOwoJICAgIGV4aXQoMSk7Cgl9 CglmcHJpbnRmKHN0ZGVyciwgImtldmVudCByZXR1cm5lZCA9ICVkXG4iLCByZXMpOwoKCWZv ciAoaSA9IDA7IGkgPCAxMCAmJiBpIDwgcmVzOyBpKyspIHsKCSAgICBmcHJpbnRmKHN0ZGVy ciwgInJlc3VsdCBldmVudCAlZDogZmQ9JWQ6ICIsIGksIChpbnQpZXZbaV0uaWRlbnQpOwoJ ICAgIGlmIChldltpXS5mbGFncyAmIEVWX0VSUk9SKSB7CgkJZnByaW50ZihzdGRlcnIsICJF Vl9FUlJPUjogJXMgIiwgc3RyZXJyb3IoZXZbaV0uZGF0YSkpOwoJICAgIH0KCSAgICBlbHNl IHsKCQlpZiAoZXZbaV0uZmlsdGVyID09IEVWRklMVF9SRUFEKQoJCSAgICBmcHJpbnRmKHN0 ZGVyciwgIkVWRklMVF9SRUFEICIpOwoJCWlmIChldltpXS5maWx0ZXIgPT0gRVZGSUxUX1dS 
SVRFKQoJCSAgICBmcHJpbnRmKHN0ZGVyciwgIkVWRklMVF9XUklURSAiKTsKCSAgICB9Cgkg ICAgZnByaW50ZihzdGRlcnIsICJcbiIpOwoJfQogICAgfQoKI2VsaWYgZGVmaW5lZChVU0Vf UE9MTCkKICAgIHsKCXN0cnVjdCBwb2xsZmQgcGZkc1sxXTsKCQoJcGZkc1swXS5mZCA9IGZk c1swXTsKCXBmZHNbMF0uZXZlbnRzID0gKFBPTExJTnxQT0xMUkROT1JNKTsKCXBmZHNbMF0u cmV2ZW50cyA9IDA7CgoJZnByaW50ZihzdGRlcnIsICJwZmRzWzBdLmZkID0gJWQgcGZkc1sw XS5ldmVudHMgPSBQT0xMSU58UE9MTFJETk9STVxuIiwKCQlwZmRzWzBdLmZkKTsKCSAgICAK CglmcHJpbnRmKHN0ZGVyciwgInBvbGwgd2FpdGluZyBmb3IgJWQgc2VjcyBmb3IgZXZlbnRz Li4uXG4iLAoJCUVWX1RJTUVPVVQpOwoJcmVzID0gcG9sbCgmcGZkc1swXSwgMSwgRVZfVElN RU9VVCoxMDAwKTsKCWlmIChyZXMgPCAwKSB7CgkgICAgcGVycm9yKCJwb2xsIGZhaWxlZFxu Iik7CgkgICAgZXhpdCgxKTsKCX0KCWVsc2UgaWYgKHJlcyA9PSAwKSB7CgkgICAgZnByaW50 ZihzdGRlcnIsICJwb2xsIHRpbWVkIG91dFxuIik7CgkgICAgZXhpdCgxKTsKCX0KCWZwcmlu dGYoc3RkZXJyLCAicG9sbCByZXR1cm5lZCA9ICVkXG4iLCByZXMpOwoJZnByaW50ZihzdGRl cnIsICJmZD0lZCAiLCBwZmRzWzBdLmZkKTsKCWlmIChwZmRzWzBdLnJldmVudHMgJiBQT0xM SU4pCgkgICAgZnByaW50ZihzdGRlcnIsICJQT0xMSU4gIik7CglpZiAocGZkc1swXS5yZXZl bnRzICYgUE9MTFJETk9STSkKCSAgICBmcHJpbnRmKHN0ZGVyciwgIlBPTExSRE5PUk0gIik7 CglpZiAocGZkc1swXS5yZXZlbnRzICYgUE9MTE9VVCkKCSAgICBmcHJpbnRmKHN0ZGVyciwg IlBPTExPVVQgIik7CglmcHJpbnRmKHN0ZGVyciwgIlxuIik7CiAgICB9CiNlbHNlIC8qIHVz ZSBzZWxlY3QgKi8KICAgIHsKCWZkX3NldCByZWFkZmRzOwoJc3RydWN0IHRpbWV2YWwgc3R2 ID0ge0VWX1RJTUVPVVQsIDB9OwoJRkRfWkVSTygmcmVhZGZkcyk7CglGRF9TRVQoZmRzWzBd LCAmcmVhZGZkcyk7CgoJZnByaW50ZihzdGRlcnIsICJzZWxlY3RpbmcgZmQgPSAlZFxuIiwg ZmRzWzBdKTsKCSAgICAKCglmcHJpbnRmKHN0ZGVyciwgInNlbGVjdCB3YWl0aW5nIGZvciAl ZCBzZWNzIGZvciBldmVudHMuLi5cbiIsCgkJRVZfVElNRU9VVCk7CglyZXMgPSBzZWxlY3Qo ZmRzWzBdKzEsICZyZWFkZmRzLCBOVUxMLCBOVUxMLCAmc3R2KTsKCWlmIChyZXMgPCAwKSB7 CgkgICAgcGVycm9yKCJzZWxlY3QgZmFpbGVkXG4iKTsKCSAgICBleGl0KDEpOwoJfQoJZWxz ZSBpZiAocmVzID09IDApIHsKCSAgICBmcHJpbnRmKHN0ZGVyciwgInNlbGVjdCB0aW1lZCBv dXRcbiIpOwoJICAgIGV4aXQoMSk7Cgl9CglmcHJpbnRmKHN0ZGVyciwgInNlbGVjdCByZXR1 cm5lZCA9ICVkXG4iLCByZXMpOwoJZnByaW50ZihzdGRlcnIsICJmZD0lZCAiLCBmZHNbMF0p 
OwoJaWYgKEZEX0lTU0VUKGZkc1swXSwgJnJlYWRmZHMpKQoJICAgIGZwcmludGYoc3RkZXJy LCAiZmQgaXMgc2V0ICIpOwoJZnByaW50ZihzdGRlcnIsICJcbiIpOwogICAgfQojZW5kaWYK ICAgIGNsb3NlKGZkc1swXSk7CiAgICByZXR1cm4gMDsKfQoKCgoKCgo= --------------060407030105010800010608 Content-Type: text/plain; name="wr.c" Content-Transfer-Encoding: base64 Content-Disposition: inline; filename="wr.c" Ci8qICJnY2MgLVdhbGwgLW8gd3Igd3IuYyIgYnVnZ3kgKi8KLyogImdjYyAtRF9SRUVOVFJB TlQgLVdhbGwgLW8gd3Igd3IuYyAtbGNfciIgbm90IGJ1Z2d5ISAqLwoKCiNkZWZpbmUgVFJJ R0dFUl9XUklURVZfQlVHIDEKCgojaW5jbHVkZSA8c3lzL3R5cGVzLmg+CiNpbmNsdWRlIDxz eXMvdWlvLmg+CiNpbmNsdWRlIDx1bmlzdGQuaD4KI2luY2x1ZGUgPHN0ZGlvLmg+CgojaWZu ZGVmIFBBR0VfU0laRQojZGVmaW5lIFBBR0VfU0laRSA0MDk2CiNlbmRpZgoKLyogVGhlIGZv bGxvd2luZyAicGlwZSBkZWZpbmVzIiBjdXQgZnJvbSBzeXMvcGlwZS5oICovCgojaWZuZGVm IFBJUEVfU0laRQojZGVmaW5lIFBJUEVfU0laRSAgICAgICAxNjM4NAojZW5kaWYKCiNpZm5k ZWYgQklHX1BJUEVfU0laRQojZGVmaW5lIEJJR19QSVBFX1NJWkUgICAoNjQqMTAyNCkKI2Vu ZGlmCgojaWZuZGVmIFNNQUxMX1BJUEVfU0laRQojZGVmaW5lIFNNQUxMX1BJUEVfU0laRSBQ QUdFX1NJWkUKI2VuZGlmCgojaWZuZGVmIFBJUEVfTUlORElSRUNUCiNkZWZpbmUgUElQRV9N SU5ESVJFQ1QgIDgxOTIKI2VuZGlmCgojaWZuZGVmIFBJUEVOUEFHRVMKI2RlZmluZSBQSVBF TlBBR0VTICAgICAgKEJJR19QSVBFX1NJWkUgLyBQQUdFX1NJWkUgKyAxKQojZW5kaWYKCiNp ZiBUUklHR0VSX1dSSVRFVl9CVUcKI2RlZmluZSBCVUYwX1NaIDIKI2Vsc2UKI2RlZmluZSBC VUYwX1NaIDEKI2VuZGlmCiNkZWZpbmUgQlVGMV9TWiBQQUdFX1NJWkUKI2RlZmluZSBCVUYy X1NaIChQSVBFTlBBR0VTICogUEFHRV9TSVpFIC0gMikKCnN0YXRpYyBjaGFyIGJ1ZltCVUYw X1NaICsgQlVGMV9TWiArIEJVRjJfU1pdOwoKCmludCBtYWluKHZvaWQpCnsKICAgIGludCB3 cjsKICAgIHN0cnVjdCBpb3ZlYyBpb3ZbM107CgogICAgc2xlZXAoMSk7CgogICAgZnByaW50 ZihzdGRlcnIsICJQSVBFTlBBR0VTPSVkXG5QQUdFX1NJWkU9JWRcbiIsCgkgICAgUElQRU5Q QUdFUywgUEFHRV9TSVpFKTsKICAgIGZwcmludGYoc3RkZXJyLCAiQlVGMF9TWj0lZCwgQlVG MV9TWj0lZCwgQlVGMl9TWj0lZFxuIiwKCSAgICBCVUYwX1NaLCBCVUYxX1NaLCBCVUYyX1Na KTsKCiAgICBpb3ZbMF0uaW92X2Jhc2UgPQojaWYgQlVGMF9TWiA9PSAwCglOVUxMCiNlbHNl CgkmYnVmWzBdCiNlbmRpZgoJOwogICAgaW92WzBdLmlvdl9sZW4gPSBCVUYwX1NaOwoKCiAg 
ICBpb3ZbMV0uaW92X2Jhc2UgPQojaWYgQlVGMV9TWiA9PSAwCglOVUxMCiNlbHNlCgkmYnVm W0JVRjBfU1pdCiNlbmRpZgoJOwogICAgaW92WzFdLmlvdl9sZW4gPSBCVUYxX1NaOwoKCiAg ICBpb3ZbMl0uaW92X2Jhc2UgPQojaWYgQlVGMl9TWiA9PSAwCglOVUxMCiNlbHNlCgkmYnVm W0JVRjBfU1orQlVGMV9TWl0KI2VuZGlmCgk7CiAgICBpb3ZbMl0uaW92X2xlbiA9IEJVRjJf U1o7CgkJCiAgICB3ciA9IHdyaXRldigxLCAmaW92WzBdLCAzKTsKICAgIGZwcmludGYoc3Rk ZXJyLCAid3JpdGUgcmV0dXJuZWQ6ICVkXG4iLCB3cik7CiAgICByZXR1cm4gMDsKfQoKCgoK --------------060407030105010800010608 Content-Type: text/plain; name="sys-kern-sys_pipe.c-pipeselwakeup_with_directwrite-a.patch" Content-Transfer-Encoding: base64 Content-Disposition: inline; filename*0="sys-kern-sys_pipe.c-pipeselwakeup_with_directwrite-a.patch" SW5kZXg6IHN5cy9rZXJuL3N5c19waXBlLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PQpSQ1MgZmlsZTogL2hv bWUvZHVtYmJlbGwvcHJvamVjdHMvZnJlZWJzZC9jdnMtbWlycm9yL3NyYy9zeXMva2Vybi9z eXNfcGlwZS5jLHYKcmV0cmlldmluZyByZXZpc2lvbiAxLjE5MQpkaWZmIC11IC1yMS4xOTEg c3lzX3BpcGUuYwotLS0gc3lzL2tlcm4vc3lzX3BpcGUuYwkyNyBNYXkgMjAwNyAxNzozMzox MCAtMDAwMAkxLjE5MQorKysgc3lzL2tlcm4vc3lzX3BpcGUuYwk2IFNlcCAyMDA3IDEzOjI4 OjIxIC0wMDAwCkBAIC04ODEsNiArODgxLDcgQEAKIAkJCXdha2V1cCh3cGlwZSk7CiAJCX0K IAkJd3BpcGUtPnBpcGVfc3RhdGUgfD0gUElQRV9XQU5UVzsKKwkJcGlwZXNlbHdha2V1cCh3 cGlwZSk7CiAJCXBpcGV1bmxvY2sod3BpcGUpOwogCQllcnJvciA9IG1zbGVlcCh3cGlwZSwg UElQRV9NVFgod3BpcGUpLAogCQkgICAgUFJJQklPIHwgUENBVENILCAicGlwZHd3IiwgMCk7 CkBAIC04OTYsNiArODk3LDcgQEAKIAkJCXdha2V1cCh3cGlwZSk7CiAJCX0KIAkJd3BpcGUt PnBpcGVfc3RhdGUgfD0gUElQRV9XQU5UVzsKKwkJcGlwZXNlbHdha2V1cCh3cGlwZSk7CiAJ CXBpcGV1bmxvY2sod3BpcGUpOwogCQllcnJvciA9IG1zbGVlcCh3cGlwZSwgUElQRV9NVFgo d3BpcGUpLAogCQkgICAgUFJJQklPIHwgUENBVENILCAicGlwZHdjIiwgMCk7CkBAIC0xMDgw LDYgKzEwODIsNyBAQAogCQkJCXdwaXBlLT5waXBlX3N0YXRlICY9IH5QSVBFX1dBTlRSOwog CQkJCXdha2V1cCh3cGlwZSk7CiAJCQl9CisJCQlwaXBlc2Vsd2FrZXVwKHdwaXBlKTsKIAkJ CXBpcGV1bmxvY2sod3BpcGUpOwogCQkJZXJyb3IgPSBtc2xlZXAod3BpcGUsIFBJUEVfTVRY 
KHJwaXBlKSwgUFJJQklPIHwgUENBVENILAogCQkJICAgICJwaXBid3ciLCAwKTsK --------------060407030105010800010608--