From owner-freebsd-geom@FreeBSD.ORG Tue Dec 2 22:20:04 2008 Return-Path: Delivered-To: freebsd-geom@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 674501065670 for ; Tue, 2 Dec 2008 22:20:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 55C1B8FC16 for ; Tue, 2 Dec 2008 22:20:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id mB2MK39O085196 for ; Tue, 2 Dec 2008 22:20:03 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id mB2MK37K085195; Tue, 2 Dec 2008 22:20:03 GMT (envelope-from gnats) Date: Tue, 2 Dec 2008 22:20:03 GMT Message-Id: <200812022220.mB2MK37K085195@freefall.freebsd.org> To: freebsd-geom@FreeBSD.org From: "Will Andrews" Cc: Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Will Andrews List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Dec 2008 22:20:04 -0000 The following reply was made to PR kern/113885; it has been noted by GNATS. From: "Will Andrews" To: bug-followup@freebsd.org Cc: "Mykola Zubach" Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm Date: Tue, 2 Dec 2008 14:50:29 -0700 ------=_Part_29247_27764781.1228254629354 Content-Type: multipart/alternative; boundary="----=_Part_29248_27670565.1228254629354" ------=_Part_29248_27670565.1228254629354 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline I have attached what I believe is a better version of your patch. It: 1) Fixes the type ambiguity of the new use_delay/best_use_delay and dist/best_dist variables, to match the variables used in their calculations; they should be uint64_t and off_t, respectively. 2) Uses bit shifts instead of multiplication/division in the use delay and distance calculations. The precision loss should be acceptable in this situation. 3) Cleans up the style of the code; add more commenting, better comments. 4) Gets rid of the g_mirror_disk.d_delay variable since it is no longer used, along with the function the original patch short-circuited. In my testing, with 16 simultaneous processes performing the same test at the same time, by throughput, random reads/writes improved by about 35% (low variance), while sequential reads/writes improved by 100-400% (high variance). IOs also increased proportionally. Testing was done using "rawio -a -p 16 /dev/mirror/testa", where the test mirror composed of two 160GB Seagate SATA disks and the system is a dual Opteron 246 with 1.5GB of RAM with no other load, running 8.0-CURRENT as of 12/1/2008. CPU usage impact vs. old load algorithm appears negligible as well. Regards, --Will. ------=_Part_29248_27670565.1228254629354 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline I have attached what I believe is a better version of your patch.  It:

1) Fixes the type ambiguity of the new use_delay/best_use_delay and dist/best_dist variables, to match the variables used in their calculations; they should be uint64_t and off_t, respectively.
2) Uses bit shifts instead of multiplication/division in the use delay and distance calculations.  The precision loss should be acceptable in this situation.
3) Cleans up the style of the code; add more commenting, better comments.
4) Gets rid of the g_mirror_disk.d_delay variable since it is no longer used, along with the function the original patch short-circuited.

In my testing, with 16 simultaneous processes performing the same test at the same time, by throughput, random reads/writes improved by about 35% (low variance), while sequential reads/writes improved by 100-400% (high variance).  IOs also increased proportionally.  Testing was done using "rawio -a -p 16 /dev/mirror/testa", where the test mirro r composed of two 160GB Seagate SATA disks and the system is a dual Opteron 246 with 1.5GB of RAM with no other load, running 8.0-CURRENT as of 12/1/2008.  CPU usage impact vs. old load algorithm appears negligible as well.

Regards,
--Will.
------=_Part_29248_27670565.1228254629354-- ------=_Part_29247_27764781.1228254629354 Content-Type: application/octet-stream; name=g_mirror_113885.diff Content-Transfer-Encoding: base64 X-Attachment-Id: f_fo92vwa00 Content-Disposition: attachment; filename=g_mirror_113885.diff SW5kZXg6IGdfbWlycm9yLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gZ19taXJyb3IuYwkocmV2aXNpb24gMTg1 NTY3KQorKysgZ19taXJyb3IuYwkod29ya2luZyBjb3B5KQpAQCAtMjUsNyArMjUsNyBAQAogICov CiAKICNpbmNsdWRlIDxzeXMvY2RlZnMuaD4KLV9fRkJTRElEKCIkRnJlZUJTRCQiKTsKK19fRkJT RElEKCIkRnJlZUJTRDogc3JjL3N5cy9nZW9tL21pcnJvci9nX21pcnJvci5jLHYgMS45NCAyMDA3 LzEwLzIwIDIzOjIzOjE5IGp1bGlhbiBFeHAgJCIpOwogCiAjaW5jbHVkZSA8c3lzL3BhcmFtLmg+ CiAjaW5jbHVkZSA8c3lzL3N5c3RtLmg+CkBAIC00NSw3ICs0NSw2IEBACiAjaW5jbHVkZSA8c3lz L3NjaGVkLmg+CiAjaW5jbHVkZSA8Z2VvbS9taXJyb3IvZ19taXJyb3IuaD4KIAotCiBzdGF0aWMg TUFMTE9DX0RFRklORShNX01JUlJPUiwgIm1pcnJvcl9kYXRhIiwgIkdFT01fTUlSUk9SIERhdGEi KTsKIAogU1lTQ1RMX0RFQ0woX2tlcm5fZ2VvbSk7CkBAIC03MSw3ICs3MCwxMiBAQAogVFVOQUJM RV9JTlQoImtlcm4uZ2VvbS5taXJyb3Iuc3luY19yZXF1ZXN0cyIsICZnX21pcnJvcl9zeW5jcmVx cyk7CiBTWVNDVExfVUlOVChfa2Vybl9nZW9tX21pcnJvciwgT0lEX0FVVE8sIHN5bmNfcmVxdWVz dHMsIENUTEZMQUdfUkRUVU4sCiAgICAgJmdfbWlycm9yX3N5bmNyZXFzLCAwLCAiUGFyYWxsZWwg c3luY2hyb25pemF0aW9uIEkvTyByZXF1ZXN0cy4iKTsKK3N0YXRpYyB1X2ludCBnX21pcnJvcl9w bHVzZGVsYXkgPSA2MDAwMDsKK1RVTkFCTEVfSU5UKCJrZXJuLmdlb20ubWlycm9yLnBsdXNkZWxh eSIsICZnX21pcnJvcl9wbHVzZGVsYXkpOworU1lTQ1RMX1VJTlQoX2tlcm5fZ2VvbV9taXJyb3Is IE9JRF9BVVRPLCBwbHVzZGVsYXksIENUTEZMQUdfUlcsCisgICAgJmdfbWlycm9yX3BsdXNkZWxh eSwgMCwgIkFkZGl0aW9uYWwgbG9hZCBkZWxheSBpbiAxLzY1NTM2dGhzIG9mIGEgc2Vjb25kLiIp OwogCisKICNkZWZpbmUJTVNMRUVQKGlkZW50LCBtdHgsIHByaW9yaXR5LCB3bWVzZywgdGltZW91 dCkJZG8gewkJXAogCUdfTUlSUk9SX0RFQlVHKDQsICIlczogU2xlZXBpbmcgJXAuIiwgX19mdW5j X18sIChpZGVudCkpOwlcCiAJbXNsZWVwKChpZGVudCksIChtdHgpLCAocHJpb3JpdHkpLCAod21l c2cpLCAodGltZW91dCkpOwkJXApAQCAtNDUxLDggKzQ1NSw3IEBACiAJZGlzay0+ZF9pZCA9IG1k LT5tZF9kaWQ7CiAJZGlzay0+ZF9zdGF0ZSA9IEdfTUlSUk9SX0RJU0tfU1RBVEVfTk9ORTsKIAlk aXNrLT5kX3ByaW9yaXR5ID0gbWQtPm1kX3ByaW9yaXR5OwotCWRpc2stPmRfZGVsYXkuc2VjID0g MDsKLQlkaXNrLT5kX2RlbGF5LmZyYWMgPSAwOworCWRpc2stPmxhc3Rfb2Zmc2V0ID0gMDsKIAli aW51cHRpbWUoJmRpc2stPmRfbGFzdF91c2VkKTsKIAlkaXNrLT5kX2ZsYWdzID0gbWQtPm1kX2Rm bGFnczsKIAlpZiAobWQtPm1kX3Byb3ZpZGVyWzBdICE9ICdcMCcpCkBAIC04NjMsMTYgKzg2Niw2 IEBACiB9CiAKIHN0YXRpYyB2b2lkCi1nX21pcnJvcl91cGRhdGVfZGVsYXkoc3RydWN0IGdfbWly cm9yX2Rpc2sgKmRpc2ssIHN0cnVjdCBiaW8gKmJwKQotewotCi0JaWYgKGRpc2stPmRfc29mdGMt PnNjX2JhbGFuY2UgIT0gR19NSVJST1JfQkFMQU5DRV9MT0FEKQotCQlyZXR1cm47Ci0JYmludXB0 aW1lKCZkaXNrLT5kX2RlbGF5KTsKLQliaW50aW1lX3N1YigmZGlzay0+ZF9kZWxheSwgJmJwLT5i aW9fdDApOwotfQotCi1zdGF0aWMgdm9pZAogZ19taXJyb3JfZG9uZShzdHJ1Y3QgYmlvICpicCkK IHsKIAlzdHJ1Y3QgZ19taXJyb3Jfc29mdGMgKnNjOwpAQCAtOTA0LDggKzg5Nyw2IEBACiAJCWdf dG9wb2xvZ3lfbG9jaygpOwogCQlnX21pcnJvcl9raWxsX2NvbnN1bWVyKHNjLCBicC0+YmlvX2Zy b20pOwogCQlnX3RvcG9sb2d5X3VubG9jaygpOwotCX0gZWxzZSB7Ci0JCWdfbWlycm9yX3VwZGF0 ZV9kZWxheShkaXNrLCBicCk7CiAJfQogCiAJcGJwLT5iaW9faW5iZWQrKzsKQEAgLTE0NzIsMjUg KzE0NjMsNDUgQEAKIAlzdHJ1Y3QgZ19jb25zdW1lciAqY3A7CiAJc3RydWN0IGJpbyAqY2JwOwog CXN0cnVjdCBiaW50aW1lIGN1cnRpbWU7CisJb2ZmX3QgIGJpb19vZmZzZXQgPSBicC0+YmlvX29m ZnNldDsKKwlvZmZfdCAgYmVzdF9kaXN0ID0gLTEsIGRpc3Q7CisJdWludDY0X3QgYmVzdF91c2Vf ZGVsYXkgPSAwLCB1c2VfZGVsYXkgPSAwOwogCi0JYmludXB0aW1lKCZjdXJ0aW1lKTsKKwlnZXRi aW51cHRpbWUoJmN1cnRpbWUpOwogCS8qCi0JICogRmluZCBhIGRpc2sgd2hpY2ggdGhlIHNtYWxs ZXN0IGxvYWQuCisJICogRmluZCB0aGUgZGlzayB3aGljaCBoYXMgdGhlIHNtYWxsZXN0IHJhdGlv IG9mIGRpc3RhbmNlIHRvIHVzZQorCSAqIGRlbGF5LCBpLmUuIGl0cyBoZWFkIGxvb2tzIGNsb3Nl c3QgdG8gYmlvX29mZnNldCBhbmQgaXQgd2FzIHVzZWQKKwkgKiBsZWFzdCByZWNlbnRseS4KIAkg Ki8KIAlkaXNrID0gTlVMTDsKIAlMSVNUX0ZPUkVBQ0goZHAsICZzYy0+c2NfZGlza3MsIGRfbmV4 dCkgewogCQlpZiAoZHAtPmRfc3RhdGUgIT0gR19NSVJST1JfRElTS19TVEFURV9BQ1RJVkUpCiAJ CQljb250aW51ZTsKLQkJLyogSWYgZGlzayB3YXNuJ3QgdXNlZCBmb3IgbW9yZSB0aGFuIDIgc2Vj LCB1c2UgaXQuICovCi0JCWlmIChjdXJ0aW1lLnNlYyAtIGRwLT5kX2xhc3RfdXNlZC5zZWMgPj0g MikgeworCisJCWRpc3QgPSBkcC0+bGFzdF9vZmZzZXQgLSBiaW9fb2Zmc2V0OworCQlpZiAoZGlz dCA8IDApCisJCQlkaXN0ID0gLWRpc3Q7CisKKwkJLyoKKwkJICogQ2FsY3VsYXRlIHRoZSB1c2Ug ZGVsYXkgYXMgZm9sbG93czogQWRkIHRoZSBzeXNjdGwKKwkJICogY29uZmlndXJlZCBkZWxheSwg dGhlbiBjb252ZXJ0IHRoZSBiaW50aW1lIHN0cnVjdHVyZQorCQkgKiBpbiB0ZXJtcyBvZiAxLzY1 NTM2dGhzIG9mIGEgc2Vjb25kIGJlZm9yZSBhZGRpbmcgaXRzCisJCSAqIGNvbXBvbmVudHMuICBT byBtdWx0aXBseSBzZWNvbmRzIGRpZmZlcmVuY2UgYnkgNjU1MzYKKwkJICogYW5kIGRyb3AgYWxs IGJ1dCB0aGUgMTYgbW9zdCBzaWduaWZpY2FudCBiaXRzIGluIHRoZQorCQkgKiBmcmFjdGlvbiwg c2luY2UgdGhleSdyZSBhbGwgZ3JlYXRlciB0aGFuIDEvNjU1MzYuCisJCSAqLworCQl1c2VfZGVs YXkgPSBnX21pcnJvcl9wbHVzZGVsYXk7CisJCXVzZV9kZWxheSArPSAoKGN1cnRpbWUuc2VjIC0g ZHAtPmRfbGFzdF91c2VkLnNlYykgPDwgMTYpOworCQl1c2VfZGVsYXkgKz0gKChjdXJ0aW1lLmZy YWMgLSBkcC0+ZF9sYXN0X3VzZWQuZnJhYykgPj4gNDgpOworCisJCWlmIChiZXN0X2Rpc3QgPT0g LTEgfHwKKwkJICAgIGRpc3QgKiBiZXN0X3VzZV9kZWxheSA8IGJlc3RfZGlzdCAqIHVzZV9kZWxh eSkgewogCQkJZGlzayA9IGRwOwotCQkJYnJlYWs7CisJCQliZXN0X2Rpc3QgPSBkaXN0OworCQkJ YmVzdF91c2VfZGVsYXkgPSB1c2VfZGVsYXk7CiAJCX0KLQkJaWYgKGRpc2sgPT0gTlVMTCB8fAot CQkgICAgYmludGltZV9jbXAoJmRwLT5kX2RlbGF5LCAmZGlzay0+ZF9kZWxheSkgPCAwKSB7Ci0J CQlkaXNrID0gZHA7Ci0JCX0KIAl9CisKIAlLQVNTRVJUKGRpc2sgIT0gTlVMTCwgKCJOVUxMIGRp c2sgZm9yICVzLiIsIHNjLT5zY19uYW1lKSk7CiAJY2JwID0gZ19jbG9uZV9iaW8oYnApOwogCWlm IChjYnAgPT0gTlVMTCkgewpAQCAtMTUwNSw3ICsxNTE2LDggQEAKIAljcCA9IGRpc2stPmRfY29u c3VtZXI7CiAJY2JwLT5iaW9fZG9uZSA9IGdfbWlycm9yX2RvbmU7CiAJY2JwLT5iaW9fdG8gPSBj cC0+cHJvdmlkZXI7Ci0JYmludXB0aW1lKCZkaXNrLT5kX2xhc3RfdXNlZCk7CisJZGlzay0+ZF9s YXN0X3VzZWQgPSBjdXJ0aW1lOworCWRpc2stPmxhc3Rfb2Zmc2V0ID0gYmlvX29mZnNldDsKIAlH X01JUlJPUl9MT0dSRVEoMywgY2JwLCAiU2VuZGluZyByZXF1ZXN0LiIpOwogCUtBU1NFUlQoY3At PmFjciA+PSAxICYmIGNwLT5hY3cgPj0gMSAmJiBjcC0+YWNlID49IDEsCiAJICAgICgiQ29uc3Vt ZXIgJXMgbm90IG9wZW5lZCAociVkdyVkZSVkKS4iLCBjcC0+cHJvdmlkZXItPm5hbWUsIGNwLT5h Y3IsCkBAIC0xNjU5LDYgKzE2NzEsNyBAQAogCQkJCWdfaW9fZGVsaXZlcihicCwgYnAtPmJpb19l cnJvcik7CiAJCQkJcmV0dXJuOwogCQkJfQorCQkJZGlzay0+bGFzdF9vZmZzZXQgPSBicC0+Ymlv X29mZnNldDsKIAkJCWJpb3FfaW5zZXJ0X3RhaWwoJnF1ZXVlLCBjYnApOwogCQkJY2JwLT5iaW9f ZG9uZSA9IGdfbWlycm9yX2RvbmU7CiAJCQljcCA9IGRpc2stPmRfY29uc3VtZXI7CkluZGV4OiBn X21pcnJvci5oCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT0KLS0tIGdfbWlycm9yLmgJKHJldmlzaW9uIDE4NTU2NykKKysr IGdfbWlycm9yLmgJKHdvcmtpbmcgY29weSkKQEAgLTIzLDcgKzIzLDcgQEAKICAqIE9VVCBPRiBU SEUgVVNFIE9GIFRISVMgU09GVFdBUkUsIEVWRU4gSUYgQURWSVNFRCBPRiBUSEUgUE9TU0lCSUxJ VFkgT0YKICAqIFNVQ0ggREFNQUdFLgogICoKLSAqICRGcmVlQlNEJAorICogJEZyZWVCU0Q6IHNy Yy9zeXMvZ2VvbS9taXJyb3IvZ19taXJyb3IuaCx2IDEuMjQgMjAwNi8xMS8wMSAyMjo1MTo0OSBw amQgRXhwICQKICAqLwogCiAjaWZuZGVmCV9HX01JUlJPUl9IXwpAQCAtMTMzLDcgKzEzMyw3IEBA CiAJc3RydWN0IGdfbWlycm9yX3NvZnRjCSpkX3NvZnRjOyAvKiBCYWNrLXBvaW50ZXIgdG8gc29m dGMuICovCiAJaW50CQkgZF9zdGF0ZTsJLyogRGlzayBzdGF0ZS4gKi8KIAl1X2ludAkJIGRfcHJp b3JpdHk7CS8qIERpc2sgcHJpb3JpdHkuICovCi0Jc3RydWN0IGJpbnRpbWUJIGRfZGVsYXk7CS8q IERpc2sgZGVsYXkuICovCisJb2ZmX3QJCSBsYXN0X29mZnNldDsJLyogTEJBIG9mIGxhc3Qgb3Bl cmF0aW9uLiAqLwogCXN0cnVjdCBiaW50aW1lCSBkX2xhc3RfdXNlZDsJLyogV2hlbiBkaXNrIHdh cyBsYXN0IHVzZWQuICovCiAJdWludDY0X3QJIGRfZmxhZ3M7CS8qIEFkZGl0aW9uYWwgZmxhZ3Mu ICovCiAJdV9pbnQJCSBkX2dlbmlkOwkvKiBEaXNrJ3MgZ2VuZXJhdGlvbiBJRC4gKi8K ------=_Part_29247_27764781.1228254629354--