Date: Wed, 10 Oct 2001 20:09:47 -0400
From: Mike Barcroft <mike@FreeBSD.org>
To: Bakul Shah <bakul@bitblocks.com>
Cc: audit@FreeBSD.org
Subject: Re: strnstr(3) - New libc function for review
Message-ID: <20011010200947.F49828@coffee.q9media.com>
In-Reply-To: <200110101725.NAA06941@valiant.cnchost.com>; from bakul@bitblocks.com on Wed, Oct 10, 2001 at 10:25:25AM -0700
References: <20011009221220.C49828@coffee.q9media.com> <200110101725.NAA06941@valiant.cnchost.com>

Bakul Shah <bakul@bitblocks.com> writes:
> [synopsis:
> I am arguing to make strnstrn a part of libc. strnstrn instead
> of, or in addition to, strnstr. strnstrn is defined as
>     char * strnstrn(const char* s, size_t slen, const char* p, size_t plen)
> ]
>
> > This is probably not needed for most uses for this. It would be rare
> > to have two non-NUL terminated strings. On the other hand, this could
> > be implemented in the future, if it's seen as useful. strnstr(3)
> > could easily be modified to call a strnstrn() and use strlen(3) to get
> > the missing size field.
>
> Well, as you point out strnstr (as per your definition) is
> not general enough but strnstrn is. The latter will accept
> non nul terminated strings and can be used to build strnstr:
>
>     strnstr(a,b,l) == strnstrn(a,l, b, strlen(b))
>
> Actually it is not so rare to have two non nul terminated
> strings, for example when you are comparing a substring (part
> of a bigger string) against another string. strnstr will
> force nul-termination and you either have to allocate a new
> string or, worse (and the more frequent case), write a \nul
> into a string, while remembering to save and restore the
> overwritten char. This latter horrible habit is even
> enshrined in strtok, strsep and friends.

I did do some research into current uses of strnstr() and strnstrn().
strnstr() seemed to be the more popular of the two. Yes, strnstrn()
would be more general, but I really can't see a lot of uses for it.

> I don't care much about the name but do care about generality
> (IMHO the names str{,n}chr and str{,n}str stink -- naming a
> function after its argument types is pretty strange! But
> that is a separate discussion:).
>
> > > {
> > >     while (slen >= plen) {
> > >         if (strncmp(s, p, plen) == 0)
> > >             return (char*)s;
> > >         s++, slen--;
> > >     }
> >
> > It seems to me, it would be a pessimization to call strncmp(3) when
> > you don't even have one character that matches.
>
> Good point!
> Okay, how about the following? It should be as
> efficient as your version.
>
>     while (slen >= plen) {
>         if (*s == *p && strncmp(s+1, p+1, plen-1) == 0)
>             return (char*)s;
>         s++; slen--;
>     }

You missed the case where "If little is an empty string, big is
returned".

> > >     return 0;
> >
> > Do you mean: return (NULL) ?
>
> I mildly prefer 0 and removal of unnecessary parens but that
> is just a style issue. Not important.

Actually style is quite important; see style(9).

> > Yes, I recall seeing these algorithms discussed recently and I believe
> > it was concluded that making strstr(3) use one of these more advanced
> > algorithms would be a pessimization for most cases. That said, don't
> > let me hold you back from proposing alternative or complementary
> > functions to strnstr(3).
>
> No proof was proffered either way. I happen to believe it is
> a win even when you are searching sub 100 byte strings but
> I'll shut up until I can show that (or Andrew L. Neporada
> does that!). *If* it is a win, IMHO it is better to have one
> interface (strnstrn or whatever) that selects the appropriate
> algorithm for a number of reasons:
>
> a) most people simply want to use a function that meets their
>    needs without doing any algorithmic analysis. They benefit
>    automatically.
>
> b) if tomorrow you come up with a faster algorithm, a new
>    strnstrn implementation that can take advantage of that
>    will benefit existing programs as well (if they use shared
>    libs).
>
> This is a philosophical argument about library design (not
> just strnstrn) which is why I put freebsd-hackers back in
> bcc:. To me it makes sense to provide algorithm specific
> functions *and* a generic function that selects the best one
> based on inputs. Use the specific version when you know
> exactly what you are doing and want a better control over the
> behavior of your program; use the generic version when you
> want something `fast' but don't care beyond that.
> Sort of like
> providing VM for the masses -- not everyone needs or wants to
> do their own memory management!

I don't see how strnstr(3) would impede future optimizations; you can
just get the length by using strlen(3).

Best regards,
Mike Barcroft

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-audit" in the body of the message
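
For readers following the thread, here is a minimal sketch of how the
points raised above might fit together. It is not the code that was
committed to libc. It assumes the strnstrn() signature proposed in the
thread, adds the empty-pattern case ("If little is an empty string, big
is returned"), keeps the first-byte check before the full compare, and
follows style(9) for the return statements. Two things are my own
assumptions: memcmp(3) is swapped in for strncmp(3) because both lengths
are explicit, and my_strnstr() is a hypothetical wrapper name chosen so
as not to clash with a system strnstr(3). The wrapper stops at the first
NUL within len, since strnstr(3) does not search past a NUL in big.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the proposed strnstrn(): find the first occurrence of the
 * plen bytes at p within the slen bytes at s.  Neither buffer has to
 * be NUL-terminated.
 */
char *
strnstrn(const char *s, size_t slen, const char *p, size_t plen)
{
	/* An empty pattern matches at the start of the haystack. */
	if (plen == 0)
		return ((char *)s);

	while (slen >= plen) {
		/* Check the first byte before paying for a full compare. */
		if (*s == *p && memcmp(s, p, plen) == 0)
			return ((char *)s);
		s++;
		slen--;
	}
	return (NULL);
}

/*
 * Hypothetical wrapper (the name is an assumption): strnstr-like
 * behaviour built on strnstrn().  The search is limited to the bytes
 * before the first NUL in big, or to len bytes if no NUL is found.
 */
char *
my_strnstr(const char *big, const char *little, size_t len)
{
	const char *nul = memchr(big, '\0', len);
	size_t biglen = (nul != NULL) ? (size_t)(nul - big) : len;

	return (strnstrn(big, biglen, little, strlen(little)));
}
```

Because the loop condition guarantees slen >= plen >= 1 before either
pointer is dereferenced, a zero-length haystack is safe. A caller can
search a slice of a larger buffer without writing a NUL into it, which
is the use case argued for above.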