FreeBSD Mail Archives

Date:      Fri, 20 Jun 2014 20:10:11 -0500
From:      Pedro Giffuni <pfg@freebsd.org>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, Stefan Farfeleder <stefanf@freebsd.org>, src-committers@freebsd.org
Subject:   Re: svn commit: r267675 - head/lib/libc/regex
Message-ID:  <B74D36BF-1BF7-40E4-9583-76E6A917ECDC@freebsd.org>
In-Reply-To: <20140621072634.I921@besplex.bde.org>
References:  <201406201529.s5KFTAEB068038@svn.freebsd.org> <20140620182311.GA1214@mole.fafoe.narf.at> <53A48D62.4060801@FreeBSD.org> <20140621072634.I921@besplex.bde.org>

index | next in thread | previous in thread | raw e-mail



Il giorno 20/giu/2014, alle ore 16:59, Bruce Evans <brde@optusnet.com.au> ha scritto:

> 
>> El 6/20/2014 1:23 PM, Stefan Farfeleder escribi�:
>>> On Fri, Jun 20, 2014 at 03:29:10PM +0000, Pedro F. Giffuni wrote:
>>>> ...
>>>> Log:
>>>>   regex: Make use of reallocf().
>>>>   Use of reallocf is useful in libraries as we are not certain the
>>>>   application will exit after NULL.
>>>> ...
>>>>   This somewhat reduces portability but if since you are building
>>>>   this as part of libc it is likely you have our non-standard
>>>>   reallocf(3) already.
> 
> It also somewhat increases namespace pollution.
> 
>>>> Modified: head/lib/libc/regex/regcomp.c
>>>> ==============================================================================
>>>> --- head/lib/libc/regex/regcomp.c	Fri Jun 20 13:26:49 2014 (r267674)
>>>> +++ head/lib/libc/regex/regcomp.c	Fri Jun 20 15:29:09 2014 (r267675)
>>>> @@ -1111,7 +1111,7 @@ allocset(struct parse *p)
>>>>  {
>>>>  	cset *cs, *ncs;
>>>> -	ncs = realloc(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs));
>>>> +	ncs = reallocf(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs));
>>>>  	if (ncs == NULL) {
>>>>  		SETERROR(REG_ESPACE);
>>>>  		return (NULL);
>>>> @@ -1174,7 +1174,7 @@ CHadd(struct parse *p, cset *cs, wint_t
>>>>  	if (ch < NC)
>>>>  		cs->bmp[ch >> 3] |= 1 << (ch & 7);
>>>>  	else {
>>>> -		newwides = realloc(cs->wides, (cs->nwides + 1) *
>>>> +		newwides = reallocf(cs->wides, (cs->nwides + 1) *
>>>>  		    sizeof(*cs->wides));
>>>>  		if (newwides == NULL) {
>>>>  			SETERROR(REG_ESPACE);
>>> I don't think these changes are OK. If reallocf() fails here, the
>>> cs->wides pointer will be freed and later freeset() will call
>>> free(cs->wides), probably crashing. The other cases are most probably
>>> similar though I haven't examined them closely.
>> 
>> OK ...
>> 
>> I don't think there is any problem:
>> 
>> If reallocf fails, newwides will be set to NULL and if free() is called it doesn't do anything when the argument is NULL.
>> 
>> Also freeset() is meant to be called to "free a now-unused set" and it is not called within the library. I would think using a value when the allocation has failed is a much more serious issue than attempting to fail to free it after trying to use it. ;-).
> 
> It doesn't look OK to me.  It turns the careful use of newwides, etc.,
> into garbage, and leaves a garbage pointer in cs->wides, etc., to cause
> problems later.  It might work without this careful use of a temporary
> variable, but even that is not clear.  Consider:
> 
> - start with a consistent data structure with some pointers to malloced
>  storage in it; foo->p say
> - simple reallocf() allocation strategy:
>    foo->p = reallocf(foo->p, size)
>    if (foo->p == NULL)
>      return (ERROR);
>  This leaves the data structure consistent if and only if the only thing
>  in it relevant to p is p itself, with foo->p serving as a flag for
>  validity.
> - more complicated reallocf() allocation strategy:
>    foo->p = reallocf(foo->p, size)
>    if (foo->p == NULL) {
>      foo->psize = 0;
>      foo->pvalid = 0;
>      foo->foovalid = 0;
>      ...
>      return (ERROR);
>    }
>  This still can't do things like cleaning up what foo->p points to, since
>  reallocf() clobbered that.
> This shows that reallocf() is only useful in simple cases.  In general, you
> must keep the pointer valid to clean things up:
>    newp = reallocf(foo->p, size)
>    if (newp == NULL) {
>      for (i = 0; i < foo->psize; i++)
> 	free(foo->p[i]);
>      foo->p = NULL;
>      foo->psize = 0;
>      foo->pvalid = 0;
>      foo->foovalid = 0;
>      ...
>      return (ERROR);
>    }
> 
> Of course, malloc never fails so all this error checking is more a source
> of complexity and bugs than useful.
> 

This is interesting, however I have problems understanding that a library would let you ignore an error condition and pass you garbage as part of it�s normal behavior. Plus being a standard library I am not sure if you can count on other implementations doing the same ...


> Re-quoting/recovering some context:
> 
> @ Modified: head/lib/libc/regex/regcomp.c
> @ ==============================================================================
> @ --- head/lib/libc/regex/regcomp.c	Fri Jun 20 13:26:49 2014	(r267674)
> @ +++ head/lib/libc/regex/regcomp.c	Fri Jun 20 15:29:09 2014	(r267675)
> @ @@ -1111,7 +1111,7 @@ allocset(struct parse *p)
> @  {
> @  	cset *cs, *ncs;
> @ @ -	ncs = realloc(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs));
> @ +	ncs = reallocf(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs));
> @  	if (ncs == NULL) {
> @  		SETERROR(REG_ESPACE);
> @  		return (NULL);
> 
> This code shows that even Henry didn't know how to program memory allocation
> in 1988.  The code was much worse in 4.4BSD-Lite2:
> 
> old@ 		if (p->g->sets == NULL)
> old@ 			p->g->sets = (cset *)malloc(nc * sizeof(cset));
> old@ 		else
> old@ 			p->g->sets = (cset *)realloc((char *)p->g->sets,
> old@ 							nc * sizeof(cset));
> 
> This just leaked memory.  It was improved by assigning to ncs and by
> removing the pre-C90 and C++ casts.  Now it is unimproved by turning the
> careful use of ncs into garbage and leaving garbage in *p.
> 
> old@ 		if (p->g->setbits == NULL)
> old@ 			p->g->setbits = (uch *)malloc(nbytes);
> old@ 		else {
> old@ 			p->g->setbits = (uch *)realloc((char *)p->g->setbits,
> old@ 								nbytes);
> old@ 			/* xxx this isn't right if setbits is now NULL */
> old@ 			for (i = 0; i < no; i++)
> old@ 				p->g->sets[i].ptr = p->g->setbits + css*(i/CHAR_BIT);
> old@ 		}
> 
> Honest memory mismanagement.  It just assumes that malloc() and realloc()
> can't fail.  In -current, the function is much simpler and doesn't have
> this code.  I think part of the simplication is to use realloc() of NULL
> instead of malloc() to start up.
> 
> @ @@ -1174,7 +1174,7 @@ CHadd(struct parse *p, cset *cs, wint_t @  	if (ch < NC)
> @  		cs->bmp[ch >> 3] |= 1 << (ch & 7);
> @  	else {
> @ -		newwides = realloc(cs->wides, (cs->nwides + 1) *
> @ +		newwides = reallocf(cs->wides, (cs->nwides + 1) *
> @  		    sizeof(*cs->wides));
> @  		if (newwides == NULL) {
> @  			SETERROR(REG_ESPACE);
> @ @@ -1203,7 +1203,7 @@ CHaddrange(struct parse *p, cset *cs, wi
> @  		CHadd(p, cs, min);
> @  	if (min >= max)
> @  		return;
> @ -	newranges = realloc(cs->ranges, (cs->nranges + 1) *
> @ +	newranges = reallocf(cs->ranges, (cs->nranges + 1) *
> @  	    sizeof(*cs->ranges));
> @  	if (newranges == NULL) {
> @  		SETERROR(REG_ESPACE);
> @ @@ -1227,7 +1227,7 @@ CHaddtype(struct parse *p, cset *cs, wct
> @  	for (i = 0; i < NC; i++)
> @  		if (iswctype(i, wct))
> @  			CHadd(p, cs, i);
> @ -	newtypes = realloc(cs->types, (cs->ntypes + 1) *
> @ +	newtypes = reallocf(cs->types, (cs->ntypes + 1) *
> @  	    sizeof(*cs->types));
> @  	if (newtypes == NULL) {
> @  		SETERROR(REG_ESPACE);
> @ @@ -1350,7 +1350,7 @@ enlarge(struct parse *p, sopno size)
> @  	if (p->ssize >= size)
> @  		return 1;
> @ @ -	sp = (sop *)realloc(p->strip, size*sizeof(sop));
> @ +	sp = (sop *)reallocf(p->strip, size*sizeof(sop));
> @  	if (sp == NULL) {
> @  		SETERROR(REG_ESPACE);
> @  		return 0;
> 
> Similarly in 4 more cases, except most of them weren't in old versions.
> 
> @ @@ -1368,7 +1368,7 @@ static void
> @  stripsnug(struct parse *p, struct re_guts *g)
> @  {
> @  	g->nstates = p->slen;
> @ -	g->strip = (sop *)realloc((char *)p->strip, p->slen * sizeof(sop));
> @ +	g->strip = (sop *)reallocf((char *)p->strip, p->slen * sizeof(sop));
> @  	if (g->strip == NULL) {
> @  		SETERROR(REG_ESPACE);
> @  		g->strip = p->strip;
> 
> This was the only realloc() that had not been cleaned up, so using
> reallocf() has a chance of improving it.  It still has the pre-C90 and
> C++ casts.  But there is an obvious new bug visible without reading more
> than the patch context:
> - the local variable might not be needed now, since the variable assigned
>  to, g->strip, is different from the variable realloced, p->strip
> - on realloc failure, p->strip becomes invalid
> - the error handling is completely broken.  It points g->strip at the old
>  (now deallocated) storage.  This is immediate use of the garbage pointer
>  created by using reallocf().
> 
> Clearly, only realloc() code of the form "p = realloc(p, size);", where the
> pointer assigned to is the same as the pointer realloced, can be improved
> by blindly converting it to use reallocf().
> 

Ah yes, my fault there.

> The last function is most interesting.  It seems to be to reduce the size.
> So an allocation failure is even less likely than usually, except lots of
> small allocations and reallocations are more likely to waste space than
> save it.  But if failure occurs it is almost harmless, and the error
> handling of keeping using the previous allocation was good.  Maybe Henry
> knew how to program memory allocation after all.  (The last function
> survived previously-unchanged from 4.4BSDLite-2 except for removing K&R
> support and 'register�.

OK, it�s pretty tricky, but I agree that reallocf() simply won�t do anything here. I will revert.

Thanks!

Pedro.

help

Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?B74D36BF-1BF7-40E4-9583-76E6A917ECDC>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation