Date: Wed, 10 Oct 2001 10:25:25 -0700
From: Bakul Shah <bakul@bitblocks.com>
To: Mike Barcroft <mike@FreeBSD.org>
Cc: audit@FreeBSD.org
Subject: Re: strnstr(3) - New libc function for review
Message-ID: <200110101725.NAA06941@valiant.cnchost.com>
In-Reply-To: Your message of "Tue, 09 Oct 2001 22:12:20 EDT." <20011009221220.C49828@coffee.q9media.com>
[synopsis: I am arguing to make strnstrn a part of libc, instead of,
or in addition to, strnstr. strnstrn is defined as

	char *strnstrn(const char *s, size_t slen, const char *p, size_t plen)
]

> This is probably not needed for most uses for this. It would be rare
> to have two non-NUL terminated strings. On the other hand, this could
> be implemented in the future, if it's seen as useful. strnstr(3)
> could easily be modified to call a strnstrn() and use strlen(3) to get
> the missing size field.

Well, as you point out, strnstr (as per your definition) is not
general enough but strnstrn is. The latter will accept non-NUL-terminated
strings and can be used to build strnstr:

	strnstr(a, b, l) == strnstrn(a, l, b, strlen(b))

Actually it is not so rare to have two non-NUL-terminated strings,
for example when you are comparing a substring (part of a bigger
string) against another string. strnstr will force NUL-termination,
and you either have to allocate a new string or, worse (and the more
frequent case), write a NUL into a string while remembering to save
and restore the overwritten char. This latter horrible habit is even
enshrined in strtok, strsep and friends.

I don't care much about the name but do care about generality (IMHO
the names str{,n}chr and str{,n}str stink -- naming a function after
its argument types is pretty strange! But that is a separate
discussion :).

> > {
> > 	while (slen >= plen) {
> > 		if (strncmp(s, p, plen) == 0)
> > 			return (char*)s;
> > 		s++, slen--;
> > 	}
>
> It seems to me, it would be a pessimization to call strncmp(3) when
> you don't even have one character that matches.

Good point! Okay, how about the following? It should be as
efficient as your version.

	while (slen >= plen) {
		if (*s == *p && strncmp(s+1, p+1, plen-1) == 0)
			return (char*)s;
		s++; slen--;
	}

> > 	return 0;
>
> Do you mean: return (NULL) ?

I mildly prefer 0 and removal of unnecessary parens, but that is
just a style issue. Not important.
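For readers who want to try the idea, the loop above can be fleshed out
into a complete function. This is only a sketch under stated assumptions,
not the code that was proposed for libc: the my_ prefix is hypothetical
(chosen to avoid clashing with any libc symbol), memcmp(3) stands in for
strncmp(3) so that a NUL inside either buffer is treated as ordinary
data, and an empty pattern is handled up front because plen - 1 would
otherwise wrap around. The wrapper additionally assumes that a NUL in the
haystack should end the search, which is how strnstr(3) is documented to
behave.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical my_strnstrn(): find the first occurrence of the plen
 * bytes at p within the slen bytes at s. Neither buffer needs to be
 * NUL-terminated.
 */
char *
my_strnstrn(const char *s, size_t slen, const char *p, size_t plen)
{
	/* Assumption: an empty pattern matches at the start, as in strstr(3). */
	if (plen == 0)
		return (char *)s;

	while (slen >= plen) {
		/* Cheap first-byte test before comparing the remaining bytes. */
		if (*s == *p && memcmp(s + 1, p + 1, plen - 1) == 0)
			return (char *)s;
		s++;
		slen--;
	}
	return NULL;
}

/*
 * Hypothetical strnstr-style wrapper built on my_strnstrn(): p is
 * NUL-terminated, and at most len bytes of s are searched.
 */
char *
my_strnstr(const char *s, const char *p, size_t len)
{
	size_t slen = 0;

	/* Assumption: a NUL in s also bounds the search. */
	while (slen < len && s[slen] != '\0')
		slen++;
	return my_strnstrn(s, slen, p, strlen(p));
}
```

With this, a substring of a larger buffer can be searched in place, e.g.
my_strnstrn(buf + 4, 5, "lu", 2), without copying it or writing a NUL
into buf.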
> Yes, I recall seeing these algorithms discussed recently and I believe
> it was concluded that making strstr(3) use one of these more advanced
> algorithms would be a pessimization for most cases. That said, don't
> let me hold you back from proposing alternative or complementary
> functions to strnstr(3).

No proof was proffered either way. I happen to believe it is a win
even when you are searching sub-100-byte strings, but I'll shut up
until I can show that (or Andrew L. Neporada does that!).

*If* it is a win, IMHO it is better to have one interface (strnstrn
or whatever) that selects the appropriate algorithm, for a number of
reasons:

a) most people simply want to use a function that meets their needs
   without doing any algorithmic analysis. They benefit automatically.

b) if tomorrow you come up with a faster algorithm, a new strnstrn
   implementation that can take advantage of that will benefit
   existing programs as well (if they use shared libs).

This is a philosophical argument about library design (not just
strnstrn), which is why I put freebsd-hackers back in bcc:. To me it
makes sense to provide algorithm-specific functions *and* a generic
function that selects the best one based on inputs. Use the specific
version when you know exactly what you are doing and want better
control over the behavior of your program; use the generic version
when you want something `fast' but don't care beyond that. Sort of
like providing VM for the masses -- not everyone needs or wants to do
their own memory management!

-- bakul
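The "one generic entry point plus algorithm-specific functions" design
argued for above might look roughly like the following. Everything here
is an illustrative assumption rather than anything from the thread: the
function names are hypothetical, Boyer-Moore-Horspool is merely one
example of a "more advanced" algorithm, and the cutoff of 8 bytes is an
unmeasured placeholder, since the message itself notes that nobody had
yet shown where the break-even point lies.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Specific algorithm 1: straightforward scan with a first-byte test. */
static char *
scan_naive(const char *s, size_t slen, const char *p, size_t plen)
{
	for (; slen >= plen; s++, slen--)
		if (*s == *p && memcmp(s, p, plen) == 0)
			return (char *)s;
	return NULL;
}

/* Specific algorithm 2: Boyer-Moore-Horspool (bad-character skips only). */
static char *
scan_horspool(const char *s, size_t slen, const char *p, size_t plen)
{
	size_t skip[UCHAR_MAX + 1];
	size_t i;

	/* By default a mismatch lets the window jump a full pattern length. */
	for (i = 0; i <= UCHAR_MAX; i++)
		skip[i] = plen;
	/* Bytes that occur in the pattern (except its last byte) jump less. */
	for (i = 0; i + 1 < plen; i++)
		skip[(unsigned char)p[i]] = plen - 1 - i;

	while (slen >= plen) {
		if (memcmp(s, p, plen) == 0)
			return (char *)s;
		/* Shift according to the byte under the window's last position. */
		i = skip[(unsigned char)s[plen - 1]];
		s += i;
		slen -= i;
	}
	return NULL;
}

/* Assumed, unmeasured cutoff; a real library would tune this by benchmark. */
#define HORSPOOL_MIN_PLEN 8

/* Generic entry point: callers get whichever algorithm the library picks. */
char *
my_strnstrn_auto(const char *s, size_t slen, const char *p, size_t plen)
{
	if (plen == 0)
		return (char *)s;
	if (plen < HORSPOOL_MIN_PLEN)
		return scan_naive(s, slen, p, plen);
	return scan_horspool(s, slen, p, plen);
}
```

The point of the split is that the selection rule lives in one place: if
a better algorithm or a better cutoff turns up later, only
my_strnstrn_auto() changes, and programs linked against a shared library
pick up the improvement without being rebuilt.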