From owner-freebsd-arch@FreeBSD.ORG Thu Nov 15 00:47:01 2012
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id CF45D25F
	for ; Thu, 15 Nov 2012 00:47:01 +0000 (UTC)
	(envelope-from jkim@FreeBSD.org)
Received: from hammer.pct.niksun.com (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87])
	by mx1.freebsd.org (Postfix) with ESMTP id 5A9B98FC17
	for ; Thu, 15 Nov 2012 00:47:01 +0000 (UTC)
Message-ID: <50A43B52.8030102@FreeBSD.org>
Date: Wed, 14 Nov 2012 19:46:10 -0500
From: Jung-uk Kim
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:16.0) Gecko/20121031 Thunderbird/16.0.2
MIME-Version: 1.0
To: freebsd-arch@freebsd.org
Subject: [RFC] Generic population count function
X-Enigmail-Version: 1.4.5
Content-Type: multipart/mixed; boundary="------------050306080403080108050002"
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 15 Nov 2012 00:47:02 -0000

This is a multi-part message in MIME format.
--------------050306080403080108050002
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I implemented a generic population count function.  Please see the
attachment.  It is also available here:

http://people.freebsd.org/~jkim/bitcount.diff

The idea is to use CPU-supported population count instructions when
they are available and the compiler supports them (i.e., clang),
especially for data larger than 32 bits.  The patch also converts some
existing code to the new function (e.g., counting the number of bits
set in an IPv6 address[*]).

Any objections?

Jung-uk Kim

* PS: BTW, I am not sure whether this is correct:

- --- sys/netpfil/ipfw/ip_fw_table.c
+++ sys/netpfil/ipfw/ip_fw_table.c
@@ -720,11 +717,10 @@ dump_table_xentry_extended(struct radix_node *rn,
 	switch (tbl->type) {
 #ifdef INET6
 	case IPFW_TABLE_CIDR:
- -		/* Count IPv6 mask */
- -		v = (uint32_t *)&n->m.mask6.sin6_addr;
- -		for (i = 0; i < sizeof(struct in6_addr) / 4; i++, v++)
- -			xent->masklen += bitcount32(*v);
- -		memcpy(&xent->k, &n->a.addr6.sin6_addr, sizeof(struct in6_addr));
+		xent->masklen += bitcount(&n->m.mask6.sin6_addr,
+		    sizeof(struct in6_addr));
+		memcpy(&xent->k, &n->a.addr6.sin6_addr,
+		    sizeof(struct in6_addr));
 		break;
 #endif
 	case IPFW_TABLE_INTERFACE:

Is xent->masklen initialized to a non-zero value somewhere?  If not,
shouldn't we just use '=' instead of '+=' here?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlCkO1IACgkQmlay1b9qnVNzggCfW+Fri0Aj4TDDXcAoPc4SaATB
clQAnikNhO6JVJ+Ez71cbdQV5Qy4uHam
=r4nt
-----END PGP SIGNATURE-----

--------------050306080403080108050002
Content-Type: text/plain; charset=UTF-8;
 name="bitcount.diff"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="bitcount.diff"

SW5kZXg6IHN5cy9uZXRncmFwaC9uZXRmbG93L25ldGZsb3cuYwo9PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0t LSBzeXMvbmV0Z3JhcGgvbmV0Zmxvdy9uZXRmbG93LmMJKHJldmlzaW9uIDI0MzA0MikKKysr IHN5cy9uZXRncmFwaC9uZXRmbG93L25ldGZsb3cuYwkod29ya2luZyBjb3B5KQpAQCAtNDA5 LDEyICs0MDksOSBAQCBoYXNoX2luc2VydChwcml2X3AgcHJpdiwgc3RydWN0IGZsb3dfaGFz aF9lbnRyeSAqaAogfQogCiAjaWZkZWYgSU5FVDYKLS8qIFhYWDogbWFrZSBub3JtYWwgZnVu Y3Rpb24sIGluc3RlYWQgb2YuLiAqLwotI2RlZmluZSBpcHY2X21hc2tsZW4oeCkJCWJpdGNv dW50MzIoKHgpLl9fdTZfYWRkci5fX3U2X2FkZHIzMlswXSkgKyBcCi0JCQkJYml0Y291bnQz MigoeCkuX191Nl9hZGRyLl9fdTZfYWRkcjMyWzFdKSArIFwKLQkJCQliaXRjb3VudDMyKCh4 KS5fX3U2X2FkZHIuX191Nl9hZGRyMzJbMl0pICsgXAotCQkJCWJpdGNvdW50MzIoKHgpLl9f 
dTZfYWRkci5fX3U2X2FkZHIzMlszXSkKLSNkZWZpbmUgUlRfTUFTSzYoeCkJKGlwdjZfbWFz a2xlbigoKHN0cnVjdCBzb2NrYWRkcl9pbjYgKilydF9tYXNrKHgpKS0+c2luNl9hZGRyKSkK KyNkZWZpbmUgUlRfTUFTSzYoeCkgXAorCWJpdGNvdW50KCYoKHN0cnVjdCBzb2NrYWRkcl9p bjYgKilydF9tYXNrKHgpKS0+c2luNl9hZGRyLCBcCisJICAgIHNpemVvZihzdHJ1Y3QgaW42 X2FkZHIpKQogc3RhdGljIGludAogaGFzaDZfaW5zZXJ0KHByaXZfcCBwcml2LCBzdHJ1Y3Qg Zmxvd19oYXNoX2VudHJ5ICpoc2g2LCBzdHJ1Y3QgZmxvdzZfcmVjICpyLAogCWludCBwbGVu LCB1aW50OF90IGZsYWdzLCB1aW50OF90IHRjcF9mbGFncykKQEAgLTUwNSw3ICs1MDIsNiBA QCBoYXNoNl9pbnNlcnQocHJpdl9wIHByaXYsIHN0cnVjdCBmbG93X2hhc2hfZW50cnkgKgog CiAJcmV0dXJuICgwKTsKIH0KLSN1bmRlZiBpcHY2X21hc2tsZW4KICN1bmRlZiBSVF9NQVNL NgogI2VuZGlmCiAKSW5kZXg6IHN5cy9uZXRwZmlsL2lwZncvaXBfZndfdGFibGUuYwo9PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09Ci0tLSBzeXMvbmV0cGZpbC9pcGZ3L2lwX2Z3X3RhYmxlLmMJKHJldmlzaW9u IDI0MzA0MikKKysrIHN5cy9uZXRwZmlsL2lwZncvaXBfZndfdGFibGUuYwkod29ya2luZyBj b3B5KQpAQCAtNzA2LDEwICs3MDYsNyBAQCBkdW1wX3RhYmxlX3hlbnRyeV9leHRlbmRlZChz dHJ1Y3QgcmFkaXhfbm9kZSAqcm4sCiAJc3RydWN0IHRhYmxlX3hlbnRyeSAqIGNvbnN0IG4g PSAoc3RydWN0IHRhYmxlX3hlbnRyeSAqKXJuOwogCWlwZndfeHRhYmxlICogY29uc3QgdGJs ID0gYXJnOwogCWlwZndfdGFibGVfeGVudHJ5ICp4ZW50OwotI2lmZGVmIElORVQ2Ci0JaW50 IGk7Ci0JdWludDMyX3QgKnY7Ci0jZW5kaWYKKwogCS8qIE91dCBvZiBtZW1vcnksIHJldHVy bmluZyAqLwogCWlmICh0YmwtPmNudCA9PSB0YmwtPnNpemUpCiAJCXJldHVybiAoMSk7CkBA IC03MjAsMTEgKzcxNywxMCBAQCBkdW1wX3RhYmxlX3hlbnRyeV9leHRlbmRlZChzdHJ1Y3Qg cmFkaXhfbm9kZSAqcm4sCiAJc3dpdGNoICh0YmwtPnR5cGUpIHsKICNpZmRlZiBJTkVUNgog CWNhc2UgSVBGV19UQUJMRV9DSURSOgotCQkvKiBDb3VudCBJUHY2IG1hc2sgKi8KLQkJdiA9 ICh1aW50MzJfdCAqKSZuLT5tLm1hc2s2LnNpbjZfYWRkcjsKLQkJZm9yIChpID0gMDsgaSA8 IHNpemVvZihzdHJ1Y3QgaW42X2FkZHIpIC8gNDsgaSsrLCB2KyspCi0JCQl4ZW50LT5tYXNr bGVuICs9IGJpdGNvdW50MzIoKnYpOwotCQltZW1jcHkoJnhlbnQtPmssICZuLT5hLmFkZHI2 LnNpbjZfYWRkciwgc2l6ZW9mKHN0cnVjdCBpbjZfYWRkcikpOworCQl4ZW50LT5tYXNrbGVu ICs9IGJpdGNvdW50KCZuLT5tLm1hc2s2LnNpbjZfYWRkciwKKwkJICAgIHNpemVvZihzdHJ1 
Y3QgaW42X2FkZHIpKTsKKwkJbWVtY3B5KCZ4ZW50LT5rLCAmbi0+YS5hZGRyNi5zaW42X2Fk ZHIsCisJCSAgICBzaXplb2Yoc3RydWN0IGluNl9hZGRyKSk7CiAJCWJyZWFrOwogI2VuZGlm CiAJY2FzZSBJUEZXX1RBQkxFX0lOVEVSRkFDRToKSW5kZXg6IHN5cy9zeXMvYml0Y291bnQu aAo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09Ci0tLSBzeXMvc3lzL2JpdGNvdW50LmgJKHJldmlzaW9uIDApCisr KyBzeXMvc3lzL2JpdGNvdW50LmgJKHdvcmtpbmcgY29weSkKQEAgLTAsMCArMSwxMDYgQEAK Ky8qLQorICogQ29weXJpZ2h0IChjKSAyMDEyIEp1bmctdWsgS2ltIDxqa2ltQEZyZWVCU0Qu b3JnPgorICogQWxsIHJpZ2h0cyByZXNlcnZlZC4KKyAqCisgKiBSZWRpc3RyaWJ1dGlvbiBh bmQgdXNlIGluIHNvdXJjZSBhbmQgYmluYXJ5IGZvcm1zLCB3aXRoIG9yIHdpdGhvdXQKKyAq IG1vZGlmaWNhdGlvbiwgYXJlIHBlcm1pdHRlZCBwcm92aWRlZCB0aGF0IHRoZSBmb2xsb3dp bmcgY29uZGl0aW9ucworICogYXJlIG1ldDoKKyAqIDEuIFJlZGlzdHJpYnV0aW9ucyBvZiBz b3VyY2UgY29kZSBtdXN0IHJldGFpbiB0aGUgYWJvdmUgY29weXJpZ2h0CisgKiAgICBub3Rp Y2UsIHRoaXMgbGlzdCBvZiBjb25kaXRpb25zIGFuZCB0aGUgZm9sbG93aW5nIGRpc2NsYWlt ZXIuCisgKiAyLiBSZWRpc3RyaWJ1dGlvbnMgaW4gYmluYXJ5IGZvcm0gbXVzdCByZXByb2R1 Y2UgdGhlIGFib3ZlIGNvcHlyaWdodAorICogICAgbm90aWNlLCB0aGlzIGxpc3Qgb2YgY29u ZGl0aW9ucyBhbmQgdGhlIGZvbGxvd2luZyBkaXNjbGFpbWVyIGluIHRoZQorICogICAgZG9j dW1lbnRhdGlvbiBhbmQvb3Igb3RoZXIgbWF0ZXJpYWxzIHByb3ZpZGVkIHdpdGggdGhlIGRp c3RyaWJ1dGlvbi4KKyAqCisgKiBUSElTIFNPRlRXQVJFIElTIFBST1ZJREVEIEJZIFRIRSBB VVRIT1IgQU5EIENPTlRSSUJVVE9SUyBgYEFTIElTJycgQU5ECisgKiBBTlkgRVhQUkVTUyBP UiBJTVBMSUVEIFdBUlJBTlRJRVMsIElOQ0xVRElORywgQlVUIE5PVCBMSU1JVEVEIFRPLCBU SEUKKyAqIElNUExJRUQgV0FSUkFOVElFUyBPRiBNRVJDSEFOVEFCSUxJVFkgQU5EIEZJVE5F U1MgRk9SIEEgUEFSVElDVUxBUiBQVVJQT1NFCisgKiBBUkUgRElTQ0xBSU1FRC4gIElOIE5P IEVWRU5UIFNIQUxMIFRIRSBBVVRIT1IgT1IgQ09OVFJJQlVUT1JTIEJFIExJQUJMRQorICog Rk9SIEFOWSBESVJFQ1QsIElORElSRUNULCBJTkNJREVOVEFMLCBTUEVDSUFMLCBFWEVNUExB UlksIE9SIENPTlNFUVVFTlRJQUwKKyAqIERBTUFHRVMgKElOQ0xVRElORywgQlVUIE5PVCBM SU1JVEVEIFRPLCBQUk9DVVJFTUVOVCBPRiBTVUJTVElUVVRFIEdPT0RTCisgKiBPUiBTRVJW SUNFUzsgTE9TUyBPRiBVU0UsIERBVEEsIE9SIFBST0ZJVFM7IE9SIEJVU0lORVNTIElOVEVS 
UlVQVElPTikKKyAqIEhPV0VWRVIgQ0FVU0VEIEFORCBPTiBBTlkgVEhFT1JZIE9GIExJQUJJ TElUWSwgV0hFVEhFUiBJTiBDT05UUkFDVCwgU1RSSUNUCisgKiBMSUFCSUxJVFksIE9SIFRP UlQgKElOQ0xVRElORyBORUdMSUdFTkNFIE9SIE9USEVSV0lTRSkgQVJJU0lORyBJTiBBTlkg V0FZCisgKiBPVVQgT0YgVEhFIFVTRSBPRiBUSElTIFNPRlRXQVJFLCBFVkVOIElGIEFEVklT RUQgT0YgVEhFIFBPU1NJQklMSVRZIE9GCisgKiBTVUNIIERBTUFHRS4KKyAqCisgKiAkRnJl ZUJTRCQKKyAqLworCisjaWZuZGVmIF9TWVNfQklUQ09VTlRfSF8KKyNkZWZpbmUJX1NZU19C SVRDT1VOVF9IXworCisjaW5jbHVkZSA8c3lzL2NkZWZzLmg+CisjaW5jbHVkZSA8c3lzL3R5 cGVzLmg+CisKKyNpZiBfX0dOVUNfUFJFUkVRX18oMywgNCkKKyNkZWZpbmUJX19CSVRDT1VO VCh4KQlfX2J1aWx0aW5fcG9wY291bnRsKHgpCisjZGVmaW5lCV9fQklUQ09VTlQzMih4KQlf X2J1aWx0aW5fcG9wY291bnQoeCkKKyNkZWZpbmUJX19CSVRDT1VOVF9UCXVfbG9uZworI2Vs c2UKKyNkZWZpbmUJX19CSVRDT1VOVCh4KQlfX2JpdGNvdW50MzIoeCkKKyNkZWZpbmUJX19C SVRDT1VOVDMyKHgpCV9fYml0Y291bnQzMih4KQorI2RlZmluZQlfX0JJVENPVU5UX1QJdWlu dDMyX3QKKyNlbmRpZgorCisvKgorICogUG9wdWxhdGlvbiBjb3VudCBhbGdvcml0aG0gdXNp bmcgU1dBUiBhcHByb2FjaAorICogLSAiU0lNRCBXaXRoaW4gQSBSZWdpc3RlciIuCisgKi8K K3N0YXRpYyBfX2lubGluZSBpbnQKK19fYml0Y291bnQzMih1aW50MzJfdCB4KQoreworCisJ eCArPSB+KCh4ID4+IDEpICYgMHg1NTU1NTU1NSkgKyAxOworCXggPSAoeCAmIDB4MzMzMzMz MzMpICsgKCh4ID4+IDIpICYgMHgzMzMzMzMzMyk7CisJeCA9ICh4ICsgKHggPj4gNCkpICYg MHgwZjBmMGYwZjsKKwl4ICs9IHggPj4gODsKKwl4ICs9IHggPj4gMTY7CisJcmV0dXJuICh4 ICYgMHgzZik7Cit9CisKK3N0YXRpYyBfX2lubGluZSBpbnQKK2JpdGNvdW50KHZvaWQgKnAs IGludCBsZW4pCit7CisJX19CSVRDT1VOVF9UIHg7CisJdV9jaGFyICpzdHI7CisJaW50IGNv dW50OworCisJLyoKKwkgKiBOdW1iZXIgb2YgYml0cyBtdXN0IGJlIG5vbi16ZXJvIGFuZCBs ZXNzIG9yIGVxdWFsIHRvIElOVF9NQVguCisJICovCisJaWYgKGxlbiA8PSAwIHx8IGxlbiA+ PSAweDEwMDAwMDAwKQorCQlyZXR1cm4gKC0xKTsKKworCWZvciAoY291bnQgPSAwLCBzdHIg PSBwOyBsZW4gPiAwOyBsZW4gLT0gc2l6ZW9mKHgpKSB7CisJCWlmIChsZW4gLyBzaXplb2Yo eCkgPiAwKSB7CisJCQl4ID0gKihfX0JJVENPVU5UX1QgKilzdHI7CisJCQlzdHIgKz0gc2l6 ZW9mKHgpOworCQl9IGVsc2UgeworCQkJLyogQnl0ZSBvcmRlciBpcyBub3QgaW1wb3J0YW50 IGhlcmUuICovCisJCQlmb3IgKHggPSAwOyBsZW4gPiAwOyBzdHIrKywgbGVuLS0pCisJCQkJ 
eCB8PSAqc3RyIDw8IChsZW4gKiA4KTsKKwkJfQorCQljb3VudCArPSBfX0JJVENPVU5UKHgp OworCX0KKwlyZXR1cm4gKGNvdW50KTsKK30KKworc3RhdGljIF9faW5saW5lIGludAorYml0 Y291bnQxNih1aW50MTZfdCB4KQoreworCisJcmV0dXJuIChfX0JJVENPVU5UMzIoeCkpOwor fQorCitzdGF0aWMgX19pbmxpbmUgaW50CitiaXRjb3VudDMyKHVpbnQzMl90IHgpCit7CisK KwlyZXR1cm4gKF9fQklUQ09VTlQzMih4KSk7Cit9CisKKyN1bmRlZiBfX0JJVENPVU5UCisj dW5kZWYgX19CSVRDT1VOVDMyCisjdW5kZWYgX19CSVRDT1VOVF9UCisKKyNlbmRpZiAvKiAh X1NZU19CSVRDT1VOVF9IXyAqLwoKUHJvcGVydHkgY2hhbmdlcyBvbjogc3lzL3N5cy9iaXRj b3VudC5oCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX18KQWRkZWQ6IHN2bjplb2wtc3R5bGUKIyMgLTAsMCArMSAj IworbmF0aXZlClwgTm8gbmV3bGluZSBhdCBlbmQgb2YgcHJvcGVydHkKQWRkZWQ6IHN2bjpt aW1lLXR5cGUKIyMgLTAsMCArMSAjIwordGV4dC9wbGFpbgpcIE5vIG5ld2xpbmUgYXQgZW5k IG9mIHByb3BlcnR5CkFkZGVkOiBzdm46a2V5d29yZHMKIyMgLTAsMCArMSAjIworRnJlZUJT RD0lSApcIE5vIG5ld2xpbmUgYXQgZW5kIG9mIHByb3BlcnR5CkluZGV4OiBzeXMvc3lzL3N5 c3RtLmgKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PQotLS0gc3lzL3N5cy9zeXN0bS5oCShyZXZpc2lvbiAyNDMw NDIpCisrKyBzeXMvc3lzL3N5c3RtLmgJKHdvcmtpbmcgY29weSkKQEAgLTQwLDYgKzQwLDcg QEAKIAogI2luY2x1ZGUgPG1hY2hpbmUvYXRvbWljLmg+CiAjaW5jbHVkZSA8bWFjaGluZS9j cHVmdW5jLmg+CisjaW5jbHVkZSA8c3lzL2JpdGNvdW50Lmg+CiAjaW5jbHVkZSA8c3lzL2Nh bGxvdXQuaD4KICNpbmNsdWRlIDxzeXMvY2RlZnMuaD4KICNpbmNsdWRlIDxzeXMvcXVldWUu aD4KQEAgLTM4NywzMSArMzg4LDQgQEAgaW50IGFsbG9jX3Vucl9zcGVjaWZpYyhzdHJ1Y3Qg dW5yaGRyICp1aCwgdV9pbnQgaXQKIGludCBhbGxvY191bnJsKHN0cnVjdCB1bnJoZHIgKnVo KTsKIHZvaWQgZnJlZV91bnIoc3RydWN0IHVucmhkciAqdWgsIHVfaW50IGl0ZW0pOwogCi0v KgotICogUG9wdWxhdGlvbiBjb3VudCBhbGdvcml0aG0gdXNpbmcgU1dBUiBhcHByb2FjaAot ICogLSAiU0lNRCBXaXRoaW4gQSBSZWdpc3RlciIuCi0gKi8KLXN0YXRpYyBfX2lubGluZSB1 aW50MzJfdAotYml0Y291bnQzMih1aW50MzJfdCB4KQotewotCi0JeCA9ICh4ICYgMHg1NTU1 NTU1NSkgKyAoKHggJiAweGFhYWFhYWFhKSA+PiAxKTsKLQl4ID0gKHggJiAweDMzMzMzMzMz KSArICgoeCAmIDB4Y2NjY2NjY2MpID4+IDIpOwotCXggPSAoeCArICh4ID4+IDQpKSAmIDB4 
MGYwZjBmMGY7Ci0JeCA9ICh4ICsgKHggPj4gOCkpOwotCXggPSAoeCArICh4ID4+IDE2KSkg JiAweDAwMDAwMGZmOwotCXJldHVybiAoeCk7Ci19Ci0KLXN0YXRpYyBfX2lubGluZSB1aW50 MTZfdAotYml0Y291bnQxNih1aW50MzJfdCB4KQotewotCi0JeCA9ICh4ICYgMHg1NTU1KSAr ICgoeCAmIDB4YWFhYSkgPj4gMSk7Ci0JeCA9ICh4ICYgMHgzMzMzKSArICgoeCAmIDB4Y2Nj YykgPj4gMik7Ci0JeCA9ICh4ICsgKHggPj4gNCkpICYgMHgwZjBmOwotCXggPSAoeCArICh4 ID4+IDgpKSAmIDB4MDBmZjsKLQlyZXR1cm4gKHgpOwotfQotCiAjZW5kaWYgLyogIV9TWVNf U1lTVE1fSF8gKi8K --------------050306080403080108050002--
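
The idea described in the message (use the compiler's population count
builtin where one is available, fall back to a SWAR routine otherwise,
and count over a buffer of arbitrary length) can be sketched roughly as
follows.  This is a minimal illustration under stated assumptions, not
the code from bitcount.diff: the names popcount_buf, popcount_word and
swar_popcount32 are hypothetical, the check for __GNUC__/__clang__ is a
simplification of a real compiler version test, and memcpy is used for
the word loads as an assumption to avoid unaligned access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SWAR ("SIMD Within A Register") fallback for one 32-bit word. */
static inline int
swar_popcount32(uint32_t x)
{
	x = x - ((x >> 1) & 0x55555555u);			/* 2-bit sums */
	x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);	/* 4-bit sums */
	x = (x + (x >> 4)) & 0x0f0f0f0fu;			/* 8-bit sums */
	x += x >> 8;
	x += x >> 16;
	return ((int)(x & 0x3f));
}

/* Count the bits set in one unsigned long (hypothetical helper). */
static inline int
popcount_word(unsigned long x)
{
#if defined(__GNUC__) || defined(__clang__)
	/* May compile to a single instruction if the target supports it. */
	return (__builtin_popcountl(x));
#else
	int n = 0;

	while (x != 0) {
		n += swar_popcount32((uint32_t)x);
		/* Two 16-bit shifts stay defined when long is 32 bits wide. */
		x >>= 16;
		x >>= 16;
	}
	return (n);
#endif
}

/*
 * Count the bits set in the len bytes starting at p (hypothetical
 * helper).  Returns -1 if p is NULL.  The caller is assumed to pass a
 * length small enough that the result fits in an int.
 */
static int
popcount_buf(const void *p, size_t len)
{
	const unsigned char *s = p;
	unsigned long w;
	int count = 0;

	if (p == NULL)
		return (-1);

	/* Whole words first. */
	while (len >= sizeof(w)) {
		memcpy(&w, s, sizeof(w));
		count += popcount_word(w);
		s += sizeof(w);
		len -= sizeof(w);
	}

	/* Remaining bytes; byte order does not affect the count. */
	for (w = 0; len > 0; len--)
		w = (w << 8) | *s++;
	count += popcount_word(w);

	return (count);
}
```

With a helper of this shape, the length of a contiguous IPv6 netmask is
simply popcount_buf(&mask, sizeof(mask)), e.g. 64 for a /64 mask, and a
caller that starts from a zeroed length field can assign the result
with '=' rather than accumulate it with '+='.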