Date: Tue, 9 Oct 2001 22:12:20 -0400
From: Mike Barcroft <mike@FreeBSD.org>
To: Bakul Shah <bakul@bitblocks.com>
Cc: audit@FreeBSD.org
Subject: Re: strnstr(3) - New libc function for review
Message-ID: <20011009221220.C49828@coffee.q9media.com>
In-Reply-To: <200110100127.VAA22073@rodney.cnchost.com>; from bakul@bitblocks.com on Tue, Oct 09, 2001 at 06:27:38PM -0700
References: <20011004215706.B34530@coffee.q9media.com> <200110100127.VAA22073@rodney.cnchost.com>
[-hackers moved back to BCC. I had intended follow-ups to go to
-audit, but there might be some interested hackers reading now.
Follow-ups to this message go to -audit, thanks! :) ]
Bakul Shah <bakul@bitblocks.com> writes:
> > I would appreciate comments/reviews of the following new addition to
> > libc. It is largely based off the current strstr(3) implementation.
>
> Sorry for not getting to this sooner.
>
> > /*
> >  * Find the first occurrence of find in s, where the search is limited to the
> >  * first slen characters of s.
> >  */
> > char *
> > strnstr(s, find, slen)
> >         const char *s;
> >         const char *find;
> >         size_t slen;
> > {
> >         char c, sc;
> >         size_t len;
> >
> >         if ((c = *find++) != '\0') {
> >                 len = strlen(find);
> >                 do {
> >                         do {
> >                                 if ((sc = *s++) == '\0' || slen-- < 1)
> >                                         return (NULL);
> >                         } while (sc != c);
> >                         if (len > slen)
> >                                 return (NULL);
> >                 } while (strncmp(s, find, len) != 0);
> >                 s--;
> >         }
> >         return ((char *)s);
> > }
>
> Why not pass the length of the pattern as well? Regardless,
This is probably not needed for most uses of strnstr(3). It would be
rare to have two non-NUL-terminated strings. On the other hand, this
could be implemented in the future if it's seen as useful: strnstr(3)
could easily be modified to call a strnstrn() and use strlen(3) to get
the missing size field.
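A rough sketch of that layering, to make the idea concrete. The names
my_strnstrn() and my_strnstr() and the argument order are hypothetical
(nothing like this exists in libc), and the inner search is
deliberately the plainest loop that works, not a tuned one:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical strnstrn(): both lengths are explicit, so neither buffer
 * needs a terminating NUL.  Name and argument order are assumptions.
 */
char *
my_strnstrn(const char *s, const char *find, size_t slen, size_t flen)
{
	assert(s != NULL && find != NULL);
	if (flen == 0)
		return ((char *)s);
	for (; slen >= flen; s++, slen--) {
		if (memcmp(s, find, flen) == 0)
			return ((char *)s);
	}
	return (NULL);
}

/*
 * strnstr()-style wrapper: find is NUL-terminated, and s is searched for
 * at most slen characters or up to its NUL, whichever comes first.
 */
char *
my_strnstr(const char *s, const char *find, size_t slen)
{
	const char *nul;

	assert(s != NULL && find != NULL);
	/* Clamp slen at the first NUL so the search never runs past it. */
	nul = memchr(s, '\0', slen);
	if (nul != NULL)
		slen = (size_t)(nul - s);
	return (my_strnstrn(s, find, slen, strlen(find)));
}
```

With this split, my_strnstr("Foo Bar Baz", "Bar", 7) finds the match at
offset 4, while a limit of 6 cuts the match short and yields NULL.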
> why not use simpler code that is easier to prove right?
>
> char*
> strnstr(const char *s, size_t slen, const char *p, size_t plen)
This prototype is inconsistent with any strn...(3) functions that I'm
aware of.
> {
> while (slen >= plen) {
> if (strncmp(s, p, plen) == 0)
> return (char*)s;
> s++, slen--;
> }
It seems to me that it would be a pessimization to call strncmp(3)
when not even the first character matches.
> return 0;
> }
Do you mean: return (NULL) ?
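To illustrate the strncmp(3) point, here is a small sketch that puts
the quoted loop next to a variant that compares the first character
before calling strncmp(3). The function names and the ncompares
counter are made up for this illustration and appear in neither
proposal:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter: number of strncmp() calls made so far. */
size_t ncompares;

/* The simple loop from the quoted proposal, with a call counter added. */
char *
search_simple(const char *s, size_t slen, const char *p, size_t plen)
{
	assert(s != NULL && p != NULL);
	while (slen >= plen) {
		ncompares++;
		if (strncmp(s, p, plen) == 0)
			return ((char *)s);
		s++, slen--;
	}
	return (NULL);
}

/* Same loop, but strncmp() runs only when the first character matches. */
char *
search_guarded(const char *s, size_t slen, const char *p, size_t plen)
{
	assert(s != NULL && p != NULL);
	if (plen == 0)
		return ((char *)s);
	while (slen >= plen) {
		if (*s == *p) {
			ncompares++;
			if (strncmp(s, p, plen) == 0)
				return ((char *)s);
		}
		s++, slen--;
	}
	return (NULL);
}
```

Searching "xxxxxxxxab" for "ab" takes nine strncmp(3) calls with the
simple loop and one with the guarded loop. How much that matters in
practice depends on the cost of a strncmp(3) call on the platform.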
> Another reason for passing in both string lengths is to allow
> switching to a more efficient algorithm. The above algorithm
> runs in slen*plen time. Other more efficient algorithms have
> a startup cost that can be hidden for a fairly moderate value
> of slen*plen. So you'd insert something like
>
> if (worth_it_to_run_KMP_algo(slen, plen))
> return kmp_strnstr(s, slen, p, plen);
>
> right above the while loop. This makes such functions
> useful for much larger strings (e.g. when you have
> mmapped in the whole file).
Yes, I recall seeing these algorithms discussed recently and I believe
it was concluded that making strstr(3) use one of these more advanced
algorithms would be a pessimization for most cases. That said, don't
let me hold you back from proposing alternative or complementary
functions to strnstr(3).
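For anyone who wants to experiment, a minimal sketch of what such a
kmp_strnstr() might look like follows. It assumes the four-argument
prototype from the quoted proposal and is a plain textbook
Knuth-Morris-Pratt search, not code from any tree. The table
allocation and setup at the top are the startup cost under discussion:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical kmp_strnstr(): Knuth-Morris-Pratt search for p[0..plen)
 * in s[0..slen).  Neither buffer needs a terminating NUL.  This sketch
 * also returns NULL if the table cannot be allocated; a real version
 * would fall back to the simple loop instead.
 */
char *
kmp_strnstr(const char *s, size_t slen, const char *p, size_t plen)
{
	size_t *fail, i, k;
	char *match = NULL;

	assert(s != NULL && p != NULL);
	if (plen == 0)
		return ((char *)s);
	if (slen < plen || (fail = malloc(plen * sizeof(*fail))) == NULL)
		return (NULL);

	/* fail[i] = length of the longest proper border of p[0..i]. */
	fail[0] = 0;
	for (i = 1, k = 0; i < plen; i++) {
		while (k > 0 && p[i] != p[k])
			k = fail[k - 1];
		if (p[i] == p[k])
			k++;
		fail[i] = k;
	}

	/* Scan s once; k is the number of pattern characters matched. */
	for (i = 0, k = 0; i < slen; i++) {
		while (k > 0 && s[i] != p[k])
			k = fail[k - 1];
		if (s[i] == p[k])
			k++;
		if (k == plen) {
			match = (char *)(s + i + 1 - plen);
			break;
		}
	}
	free(fail);
	return (match);
}
```

The scan never backs up in s, which is where the win for large inputs
comes from. For short strings the malloc(3) and table setup are likely
to dominate, which fits the earlier conclusion about strstr(3).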
Best regards,
Mike Barcroft
