Date:      Sun, 13 Oct 2024 14:21:39 -0400
From:      "David E. Cross" <david@crossfamilyweb.com>
To:        Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Review D38047 ... and then there was one....
Message-ID:  <f819b96f-81a9-5b95-07e2-1c57c3bdb32e@crossfamilyweb.com>
In-Reply-To: <553ea3d5-c94e-9c2f-c044-db7986625c74@crossfamilyweb.com>
References:  <1fd47603-0bf2-4fcf-a556-22335d99e203@plan-b.pwste.edu.pl> <DB1FFEBD-1FD4-43A5-9899-85C6DD292E3E@gmail.com> <a9b5e3e7-904f-46be-ab0e-068c6e6fef0a@plan-b.pwste.edu.pl> <553ea3d5-c94e-9c2f-c044-db7986625c74@crossfamilyweb.com>

This is a multi-part message in MIME format.
--------------ff50sKx0ynBebR1XvMoA1yeA
Content-Type: multipart/alternative;
 boundary="------------RCDID0qg0faNyHDCUHl5y0OZ"

--------------RCDID0qg0faNyHDCUHl5y0OZ
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

I have now fixed this as well.

WARNING: I applied this as a manual patch to my tree and then put it back
into git; I have not tried this EXACT code as is. I am working on some
other changes as well, so things are a bit in flux. I will attach the diff
to this message, but caveat emptor.

The attached patch is to be applied incrementally on top of the last
patch. Let me know how it works (or if it works!) for you.

On 10/8/24 16:17, David E. Cross wrote:
>
> OK, I looked into this a bit, and for the case of 'getent *' it really
> is not (currently) a fair speed comparison.
>
> For 'getent passwd' the system currently works as follows: for each
> datasource in the list, it fully iterates over EVERY entry in that
> datasource; 'cache' is a datasource, but so is ldap.
>
> What you wind up getting is a list of EVERYTHING in files, then a list
> of everything in cache, and then a list of everything in LDAP (or
> whatever). So every time it will always go back to origin, and caching
> effectively doesn't matter except to duplicate the data.
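
To make the enumeration behaviour described above concrete, here is a
minimal sketch in C. It is only a model under stated assumptions:
struct source and enumerate_all() are hypothetical names, not anything in
libc, and the entry lists are made up.

```c
#include <stddef.h>

/* Hypothetical model of one nsswitch source ("files", "cache", "ldap"). */
struct source {
	const char *name;
	const char *const *entries;	/* names this source would enumerate */
	size_t nentries;
};

/*
 * Walk every source in order and emit all of its entries. Nothing is
 * de-duplicated and no source is skipped, so cached entries appear in
 * addition to the entries of the origin they were cached from.
 * Returns the number of entries written to out (at most cap).
 */
size_t
enumerate_all(const struct source *srcs, size_t nsrcs,
    const char **out, size_t cap)
{
	size_t n = 0;

	for (size_t i = 0; i < nsrcs; i++)
		for (size_t j = 0; j < srcs[i].nentries && n < cap; j++)
			out[n++] = srcs[i].entries[j];
	return (n);
}
```

With files = {root}, cache = {alice} and ldap = {alice, bob}, this model
lists alice twice and still walks ldap in full, which is the duplication
described above.
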
>
> I remember this from when I was doing the initial development. I looked
> into ways to NOT have it do that, but for some reason I didn't think it
> was possible without a substantial rewrite. I am taking another look
> to see if that is still true or if there is a way around it.
>
> Going on my vague memory (it has been multiple years now), I think in
> the GENERAL case it is unavoidable. The way nscd typically operates is
> that looked-up values are PUSHED into the cache from the client. That
> is, the client asks 'do you have X?', nscd replies 'no', then the CLIENT
> falls back, does the lookup, gets the value and pushes it into nscd.
> nscd additionally has a 'perform_lookups' flag that will have it do
> the lookup itself and then tell the client the result. The
> consequence of this variable behavior is that there is no way to
> programmatically short-circuit it without libc knowing how nscd is
> configured. If libc knew that nscd would perform the lookups itself,
> then for getent-type calls it could just return immediately after the
> cache layer enumeration. If libc knew that nscd would NOT perform
> lookups, then it could bypass it and do the normal thing.
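
The two modes described above can be sketched as a small in-memory model.
This is an illustration only, not nscd's real protocol or libc's client
code: struct sk_cache, sk_lookup() and sk_demo_origin() are hypothetical,
and the perform_lookups field simply mirrors the flag named in the text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SK_CACHE_SLOTS 16

/* Hypothetical in-memory stand-in for nscd. */
struct sk_cache {
	const char *keys[SK_CACHE_SLOTS];
	const char *vals[SK_CACHE_SLOTS];
	size_t used;
	bool perform_lookups;	/* models the 'perform_lookups' flag */
};

/* Origin lookup (files, LDAP, ...); returns NULL when the key is unknown. */
typedef const char *(*sk_origin_fn)(const char *key);

static const char *
sk_cache_get(const struct sk_cache *c, const char *key)
{
	for (size_t i = 0; i < c->used; i++)
		if (strcmp(c->keys[i], key) == 0)
			return (c->vals[i]);
	return (NULL);
}

static void
sk_cache_put(struct sk_cache *c, const char *key, const char *val)
{
	if (c->used < SK_CACHE_SLOTS) {
		c->keys[c->used] = key;
		c->vals[c->used] = val;
		c->used++;
	}
}

/* Toy origin that only knows one user; stands in for files or LDAP. */
const char *
sk_demo_origin(const char *key)
{
	return (strcmp(key, "alice") == 0 ? "alice:*:1001:1001" : NULL);
}

/*
 * Client-side lookup. On a miss, either the daemon resolves the key itself
 * (perform_lookups) or the client falls back to the origin and pushes the
 * answer into the cache. *client_fallbacks counts the second case.
 */
const char *
sk_lookup(struct sk_cache *c, const char *key, sk_origin_fn origin,
    unsigned *client_fallbacks)
{
	const char *val = sk_cache_get(c, key);

	if (val != NULL)
		return (val);
	if (!c->perform_lookups)
		(*client_fallbacks)++;
	val = origin(key);
	if (val != NULL)
		sk_cache_put(c, key, val);
	return (val);
}
```

In this model the caller cannot tell from the return value alone which
side did the work, which is the same problem libc has when deciding
whether it may stop after the cache layer.
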
>
>
> I guess I could implement it as follows:
>
> nscd returns NS_SUCCESS if it performs its own lookups, and in the
> case of getent NS_SUCCESS is treated as a return step for the cache
> layer only (getent calls otherwise treat it as continue; if they did
> not, you'd never enumerate anything after files). It returns
> NS_NOTFOUND if it doesn't, and the libc layer would treat that as a
> continue. I think that may do it... I need to refamiliarize myself
> with that code.
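
The rule proposed above can be written down as a small decision function.
This is a sketch of the idea only, not the libc implementation: the
SK_* names, struct sk_source and both functions are hypothetical and
deliberately do not reuse the real constants from <nsswitch.h>.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status and action names for this sketch. */
enum sk_status { SK_SUCCESS, SK_NOTFOUND, SK_UNAVAIL };
enum sk_action { SK_CONTINUE, SK_RETURN };

/*
 * Action to take after one source finishes its part of an enumeration.
 * Ordinary sources always continue, otherwise nothing after "files" would
 * ever be listed. Only the cache source may end the walk, and only when
 * it reports success, meaning the daemon performed the lookups itself.
 */
enum sk_action
enum_action(bool is_cache, enum sk_status st)
{
	if (is_cache && st == SK_SUCCESS)
		return (SK_RETURN);
	return (SK_CONTINUE);
}

/* Hypothetical source: a name, whether it is the cache, and its status. */
struct sk_source {
	const char *name;
	bool is_cache;
	enum sk_status status;
};

/* Count how many sources an enumeration visits under the rule above. */
size_t
sources_visited(const struct sk_source *srcs, size_t nsrcs)
{
	size_t visited = 0;

	for (size_t i = 0; i < nsrcs; i++) {
		visited++;
		if (enum_action(srcs[i].is_cache, srcs[i].status) == SK_RETURN)
			break;
	}
	return (visited);
}
```

For a list of files, cache, ldap, the walk stops after cache when the
cache reports success and goes on to ldap when it reports not found.
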
>
>
> In the meantime, checking basic lookups (not enumerations) is a fairer
> test. Also keep in mind that without [notfound=return], misses will
> always fall back to origin, which is probably what you want with nscd
> in the default configuration, but not with nscd doing its own lookups.
>
> On 10/7/24 11:33, Marek Zarychta wrote:
>>
>> On 7.10.2024 at 07:05, David Cross wrote:
>>
>>> How many entries are in your ldap structure?  I can attempt a replication here
>>
>> Hello David,
>>
>> I would rather not expose it publicly. The whole LDAP directory
>> contains a few thousand entries, and it was used for the tests
>> mentioned in this thread.
>>
>> With the filters applied I see fewer than 1k entries, and then a lookup
>> with nscd running takes 0.16s the first time and 0.09s on subsequent
>> lookups, while without nscd it varies from 0.12s to 0.08s, so nscd
>> performs OK.
>>
>> I have your patch applied and I am still testing it with
>> net/nss-pam-ldapd from ports with the patch for login classes applied
>> (it's present in the port but not enabled by default). So far it works
>> without issues.
>>
>> -- 
>> Marek Zarychta

--------------RCDID0qg0faNyHDCUHl5y0OZ--
--------------ff50sKx0ynBebR1XvMoA1yeA
Content-Type: text/x-patch; charset=UTF-8; name="getent.diff"
Content-Disposition: attachment; filename="getent.diff"
Content-Transfer-Encoding: 7bit

diff --git a/lib/libc/net/nscache.c b/lib/libc/net/nscache.c
index 3537d77edbbe..9375d1c9adb1 100644
--- a/lib/libc/net/nscache.c
+++ b/lib/libc/net/nscache.c
@@ -317,11 +317,11 @@ __nss_mp_cache_read(void *retval, void *mdata, va_list ap)
 		__close_cached_mp_read_session(rs);
 		rs = INVALID_CACHED_MP_READ_SESSION;
 		cache_info->set_mp_rs_func(rs);
-		return (res == -1 ? NS_RETURN : NS_UNAVAIL);
+		return (res == 1 ? NS_NOTFOUND : NS_UNAVAIL);
 	}
 
 	free(buffer);
-	return (res == 0 ? NS_SUCCESS : NS_NOTFOUND);
+	return (NS_SUCCESS);
 }
 
 int
diff --git a/lib/libc/net/nscachedcli.c b/lib/libc/net/nscachedcli.c
index f57e69bdceb2..bb3f13784f4c 100644
--- a/lib/libc/net/nscachedcli.c
+++ b/lib/libc/net/nscachedcli.c
@@ -538,7 +538,7 @@ __cached_mp_read(struct cached_connection_ *rs, char *data, size_t *data_size)
 		goto fin;
 
 	if (rec_error_code != 0) {
-		error_code = rec_error_code;
+		error_code = -rec_error_code;
 		goto fin;
 	}
 

--------------ff50sKx0ynBebR1XvMoA1yeA--


