Date:      Tue, 9 Oct 2001 22:12:20 -0400
From:      Mike Barcroft <mike@FreeBSD.org>
To:        Bakul Shah <bakul@bitblocks.com>
Cc:        audit@FreeBSD.org
Subject:   Re: strnstr(3) - New libc function for review
Message-ID:  <20011009221220.C49828@coffee.q9media.com>
In-Reply-To: <200110100127.VAA22073@rodney.cnchost.com>; from bakul@bitblocks.com on Tue, Oct 09, 2001 at 06:27:38PM -0700
References:  <20011004215706.B34530@coffee.q9media.com> <200110100127.VAA22073@rodney.cnchost.com>

[-hackers moved back to BCC.  I had intended follow-ups to go to
 -audit, but there might be some interested hackers reading now.
  Follow-ups to this message go to -audit, thanks! :) ]

Bakul Shah <bakul@bitblocks.com> writes:
> > I would appreciate comments/reviews of the following new addition to
> > libc.  It is largely based off the current strstr(3) implementation.
> 
> Sorry for not getting to this sooner.
> 
> > /*
> >  * Find the first occurrence of find in s, where the search is limited to the
> >  * first slen characters of s.
> >  */
> > char *
> > strnstr(s, find, slen)
> > 	const char *s;
> > 	const char *find;
> > 	size_t slen;
> > {
> > 	char c, sc;
> > 	size_t len;
> > 
> > 	if ((c = *find++) != '\0') {
> > 		len = strlen(find);
> > 		do {
> > 			do {
> > 				if ((sc = *s++) == '\0' || slen-- < 1)
> > 					return (NULL);
> > 			} while (sc != c);
> > 			if (len > slen)
> > 				return (NULL);
> > 		} while (strncmp(s, find, len) != 0);
> > 		s--;
> > 	}
> > 	return ((char *)s);
> > }
>
> Why not pass the length of the pattern as well?  Regardless,

That is probably unnecessary for most uses of this function.  It would
be rare to have two non-NUL-terminated strings.  On the other hand, it
could be added in the future if it proves useful: strnstr(3) could
easily be modified to call a strnstrn() and use strlen(3) to get the
missing size argument.
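
A minimal sketch of that layering, assuming a four-argument helper that
takes the pattern length last.  The names my_strnstrn() and my_strnstr()
are hypothetical (the my_ prefix only avoids clashing with any libc
symbol), and the helper assumes the first flen bytes of find contain no
NUL:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find the first occurrence of the flen-byte
 * pattern find within the first slen characters of s.  Like the
 * proposed strnstr, the search also stops at a NUL in s.
 */
char *
my_strnstrn(const char *s, const char *find, size_t slen, size_t flen)
{
	size_t i;

	if (flen == 0)
		return ((char *)s);
	for (i = 0; i < slen && s[i] != '\0'; i++) {
		/* Not enough room left for the pattern: give up. */
		if (flen > slen - i)
			return (NULL);
		/* Cheap first-character test before the full compare. */
		if (s[i] == find[0] && strncmp(s + i, find, flen) == 0)
			return ((char *)(s + i));
	}
	return (NULL);
}

/* The three-argument form supplies the missing length via strlen(3). */
char *
my_strnstr(const char *s, const char *find, size_t slen)
{
	return (my_strnstrn(s, find, slen, strlen(find)));
}
```

With these definitions, my_strnstr("hello world", "world", 11) points at
the 'w', while a limit of 10 yields NULL because the match would run
past the limit.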

> why not use simpler code that is easier to prove right?
>
> char*
> strnstr(const char *s, size_t slen, const char *p, size_t plen)

This prototype is inconsistent with any strn...(3) functions that I'm
aware of.

> {
> 	while (slen >= plen) {
> 		if (strncmp(s, p, plen) == 0)
> 			return (char*)s;
> 		s++, slen--;
> 	}

It seems to me that it would be a pessimization to call strncmp(3) at
every position, even when the first character doesn't match.

> 	return 0;
> }

Do you mean: return (NULL) ?
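
One way to address the strncmp(3) concern while keeping the simpler
loop is to test a single character first.  This is only a sketch: the
name strnstr_firstchar() is hypothetical, the argument order follows
the quoted proposal rather than any libc interface, and slen is assumed
to be no larger than the number of readable bytes at s.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch only: the four-argument loop from the quoted proposal, with a
 * one-character test in front of strncmp(3).
 */
char *
strnstr_firstchar(const char *s, size_t slen, const char *p, size_t plen)
{
	/* An empty pattern matches at the start, as in strstr(3). */
	if (plen == 0)
		return ((char *)s);
	while (slen >= plen) {
		/* Only pay for strncmp() once the first byte matches. */
		if (*s == *p && strncmp(s, p, plen) == 0)
			return ((char *)s);
		s++, slen--;
	}
	return (NULL);
}
```

The explicit plen == 0 case keeps *p from being read when the pattern
is empty.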

> Another reason for passing in both string lengths is to allow
> switching to a more efficient algorithm.  The above algorithm
> runs in slen*plen time.  Other more efficient algorithms have
> a startup cost that can be hidden for a fairly moderate value
> of slen*plen.  So you'd insert something like
> 
> 	if (worth_it_to_run_KMP_algo(slen, plen))
> 		return kmp_strnstr(s, slen, p, plen);
> 
> right above the while loop.  This makes such functions
> useful for much larger strings (e.g. when you have
> mmapped in the whole file).

Yes, I recall seeing these algorithms discussed recently and I believe
it was concluded that making strstr(3) use one of these more advanced
algorithms would be a pessimization for most cases.  That said, don't
let me hold you back from proposing alternative or complementary
functions to strnstr(3).
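
For reference, the startup cost under discussion can be seen in a
minimal Knuth-Morris-Pratt sketch.  The name kmp_strnstr() comes from
the quoted message, but the body below is an assumption, not code from
this thread.  It allocates and fills a failure table before looking at
s at all, and for brevity it returns NULL when the allocation fails,
where a real implementation would more likely fall back to the simple
loop.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of a Knuth-Morris-Pratt search over explicit lengths.
 * Returns NULL if there is no match or the table cannot be allocated.
 */
char *
kmp_strnstr(const char *s, size_t slen, const char *p, size_t plen)
{
	size_t *fail, i, k;
	char *result = NULL;

	if (plen == 0)
		return ((char *)s);
	if (plen > slen)
		return (NULL);

	/*
	 * Startup cost: fail[i] is the length of the longest proper
	 * prefix of p[0..i] that is also a suffix of it.
	 */
	if ((fail = malloc(plen * sizeof(*fail))) == NULL)
		return (NULL);
	fail[0] = 0;
	for (i = 1, k = 0; i < plen; i++) {
		while (k > 0 && p[i] != p[k])
			k = fail[k - 1];
		if (p[i] == p[k])
			k++;
		fail[i] = k;
	}

	/* Scan: i only moves forward through s; k tracks matched bytes. */
	for (i = 0, k = 0; i < slen; i++) {
		while (k > 0 && s[i] != p[k])
			k = fail[k - 1];
		if (s[i] == p[k])
			k++;
		if (k == plen) {
			result = (char *)(s + i + 1 - plen);
			break;
		}
	}
	free(fail);
	return (result);
}
```

The malloc(3) call and the table-building loop run on every call, which
is the overhead that short searches would have to absorb.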

Best regards,
Mike Barcroft
