From owner-svn-src-head@FreeBSD.ORG Sat Jun 21 01:16:16 2014 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5B1F18E5 for ; Sat, 21 Jun 2014 01:16:16 +0000 (UTC) Received: from nm26-vm0.bullet.mail.bf1.yahoo.com (nm26-vm0.bullet.mail.bf1.yahoo.com [98.139.213.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 094012E5D for ; Sat, 21 Jun 2014 01:16:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1403313014; bh=0IEd4U0iLcATHZkXGPs9Y9VGyj60crFlhlywllZ/MnM=; h=Received:Received:Received:X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:X-Rocket-Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=VKJY8DrkcYz3Oe/SDLNdWCuOaYGkD2rphb/HGbtGVPesQlKE/s2h4gUZslArACz/Z+OXasHZD562r8x6iVe32seuIzpyevoSZIHmhxCMr2hK/TJrpPr15iradQ09+hhidIC2EkdpsQKft/HwXZOVECStec/yNGRL01hJ7soyy+pkNDZQfOLZ6wyJ1R5nE+kpEfr6Q3d9Hqe0F8Bom9FagxIp2VszbLJry8tNDfltzJvMl5aVVlrX2pukpJN5yvPwI+ASsV8JS+bIiKKhO/riJcEny/uiCNM6t72HOCFn4bCU+jJ2eHTIFj8TurOc4osSxDP6dnw50iyiBEz2+6Cf/w== DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s2048; d=yahoo.com; b=lZvWuZyEh7JF+vJVVYCi8gpwzb/ki4cKMa0oCfFZ1wNsTPp6VMOiScU/8rosuaEmJ28BmiD2YG4jowf9HOb0T/abaMCp+jSb3m2Q2Ur4hsgd7VPeU7Ix1lcyR2JFT+af1Rm2+3eZByJd6CRClbInZbgYyoLmI6eQ5TxjR0NWgJ+z60Se02IKmV4o6L6TvGNaxH2CUcqbOmOHD2eh2xovW8p/R/y2q2Ry9ie0/x9fgevkGQ+b1EZfhU9NgcR7M3yhN160dyzQ9ET/t9DxfJPeb0WGoIv+Z5EO9293GhzFcRiGB2jjlVgq2g9R5KCjRVmwdZ4u735V6i6Xj5PFf+5z5Q==; Received: from [66.196.81.174] by nm26.bullet.mail.bf1.yahoo.com with NNFMP; 21 Jun 2014 01:10:14 -0000 Received: from [98.139.211.160] by tm20.bullet.mail.bf1.yahoo.com with NNFMP; 21 Jun 2014 01:10:14 -0000 Received: from [127.0.0.1] by smtp217.mail.bf1.yahoo.com with NNFMP; 21 Jun 2014 01:10:14 -0000 X-Yahoo-Newman-Id: 48125.62604.bm@smtp217.mail.bf1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: _ciFXCcVM1mX080aTWe0oxHA5U1NFuBCdRgJPQR.57UOn.8 R7ivk38I6Sug36XsJmIEqqK7ze5G1Y3Oa9s8M0mQY_t8Pkyw0Fw.YUmIQvek 5hJh9mGdg7TxZRW4.rrPfy1H3lrqkfpPJtqDnZcuDRTfKi8iK8JL0jwuiEYx QquONScyefuBvbgtj4T8KSOr7yxYNTKS_lzpcOvgYZRqIlFSoqhmQ6GErdJS EbndRvKTYvvjD4IioBeYgjsfCeB7GmY7I9q8wl_Stkc6nyR6aq_gTuEjONDF 3Fecjop8KkqFPoab.ZTP1Y4tVDjIHj2mLVnZ5XaclabRmAjk.nyba_C4WpmD aqlGKvl0MfXFLqjH9Y0eDEq8dgUaCVh6wywHxy7SZia2qFTzrmPqaz3IPQ4W zyKOeqXdP4AtyythMRvTio4eOwwWXYHE45XssmJAk.gAfWh0tOzb2j3IgJb. .6Hk4w3hyy3bD_DFoOfQxbfp8e3qc.R3nFsGcHL9T_Gp2b6f22XNQYvB9Pcu RH9c8PzBivKHgRMjjiY0- X-Yahoo-SMTP: xcjD0guswBAZaPPIbxpWwLcp9Unf X-Rocket-Received: from [192.168.0.101] (pfg@190.157.126.109 with plain [98.139.211.125]) by smtp217.mail.bf1.yahoo.com with SMTP; 20 Jun 2014 18:10:13 -0700 PDT Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.2\)) Subject: Re: svn commit: r267675 - head/lib/libc/regex From: Pedro Giffuni In-Reply-To: <20140621072634.I921@besplex.bde.org> Date: Fri, 20 Jun 2014 20:10:11 -0500 Content-Transfer-Encoding: quoted-printable Message-Id: References: <201406201529.s5KFTAEB068038@svn.freebsd.org> <20140620182311.GA1214@mole.fafoe.narf.at> <53A48D62.4060801@FreeBSD.org> <20140621072634.I921@besplex.bde.org> To: Bruce Evans X-Mailer: Apple Mail (2.1878.2) Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, Stefan Farfeleder , src-committers@freebsd.org X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 21 Jun 2014 01:16:16 -0000 Il giorno 20/giu/2014, alle ore 16:59, Bruce Evans = ha scritto: >=20 >> El 6/20/2014 1:23 PM, Stefan Farfeleder escribi=F3: >>> On Fri, Jun 20, 2014 at 03:29:10PM +0000, Pedro F. Giffuni wrote: >>>> ... >>>> Log: >>>> regex: Make use of reallocf(). >>>> Use of reallocf is useful in libraries as we are not certain the >>>> application will exit after NULL. >>>> ... >>>> This somewhat reduces portability but if since you are building >>>> this as part of libc it is likely you have our non-standard >>>> reallocf(3) already. >=20 > It also somewhat increases namespace pollution. >=20 >>>> Modified: head/lib/libc/regex/regcomp.c >>>> = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D >>>> --- head/lib/libc/regex/regcomp.c Fri Jun 20 13:26:49 2014 = (r267674) >>>> +++ head/lib/libc/regex/regcomp.c Fri Jun 20 15:29:09 2014 = (r267675) >>>> @@ -1111,7 +1111,7 @@ allocset(struct parse *p) >>>> { >>>> cset *cs, *ncs; >>>> - ncs =3D realloc(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs)); >>>> + ncs =3D reallocf(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs)); >>>> if (ncs =3D=3D NULL) { >>>> SETERROR(REG_ESPACE); >>>> return (NULL); >>>> @@ -1174,7 +1174,7 @@ CHadd(struct parse *p, cset *cs, wint_t >>>> if (ch < NC) >>>> cs->bmp[ch >> 3] |=3D 1 << (ch & 7); >>>> else { >>>> - newwides =3D realloc(cs->wides, (cs->nwides + 1) * >>>> + newwides =3D reallocf(cs->wides, (cs->nwides + 1) * >>>> sizeof(*cs->wides)); >>>> if (newwides =3D=3D NULL) { >>>> SETERROR(REG_ESPACE); >>> I don't think these changes are OK. If reallocf() fails here, the >>> cs->wides pointer will be freed and later freeset() will call >>> free(cs->wides), probably crashing. The other cases are most = probably >>> similar though I haven't examined them closely. >>=20 >> OK ... >>=20 >> I don't think there is any problem: >>=20 >> If reallocf fails, newwides will be set to NULL and if free() is = called it doesn't do anything when the argument is NULL. >>=20 >> Also freeset() is meant to be called to "free a now-unused set" and = it is not called within the library. I would think using a value when = the allocation has failed is a much more serious issue than attempting = to fail to free it after trying to use it. ;-). >=20 > It doesn't look OK to me. It turns the careful use of newwides, etc., > into garbage, and leaves a garbage pointer in cs->wides, etc., to = cause > problems later. It might work without this careful use of a temporary > variable, but even that is not clear. Consider: >=20 > - start with a consistent data structure with some pointers to = malloced > storage in it; foo->p say > - simple reallocf() allocation strategy: > foo->p =3D reallocf(foo->p, size) > if (foo->p =3D=3D NULL) > return (ERROR); > This leaves the data structure consistent if and only if the only = thing > in it relevant to p is p itself, with foo->p serving as a flag for > validity. > - more complicated reallocf() allocation strategy: > foo->p =3D reallocf(foo->p, size) > if (foo->p =3D=3D NULL) { > foo->psize =3D 0; > foo->pvalid =3D 0; > foo->foovalid =3D 0; > ... > return (ERROR); > } > This still can't do things like cleaning up what foo->p points to, = since > reallocf() clobbered that. > This shows that reallocf() is only useful in simple cases. In = general, you > must keep the pointer valid to clean things up: > newp =3D reallocf(foo->p, size) > if (newp =3D=3D NULL) { > for (i =3D 0; i < foo->psize; i++) > free(foo->p[i]); > foo->p =3D NULL; > foo->psize =3D 0; > foo->pvalid =3D 0; > foo->foovalid =3D 0; > ... > return (ERROR); > } >=20 > Of course, malloc never fails so all this error checking is more a = source > of complexity and bugs than useful. >=20 This is interesting, however I have problems understanding that a = library would let you ignore an error condition and pass you garbage as = part of it=92s normal behavior. Plus being a standard library I am not = sure if you can count on other implementations doing the same ... > Re-quoting/recovering some context: >=20 > @ Modified: head/lib/libc/regex/regcomp.c > @ = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D > @ --- head/lib/libc/regex/regcomp.c Fri Jun 20 13:26:49 2014 = (r267674) > @ +++ head/lib/libc/regex/regcomp.c Fri Jun 20 15:29:09 2014 = (r267675) > @ @@ -1111,7 +1111,7 @@ allocset(struct parse *p) > @ { > @ cset *cs, *ncs; > @ @ - ncs =3D realloc(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs)); > @ + ncs =3D reallocf(p->g->sets, (p->g->ncsets + 1) * sizeof(*ncs)); > @ if (ncs =3D=3D NULL) { > @ SETERROR(REG_ESPACE); > @ return (NULL); >=20 > This code shows that even Henry didn't know how to program memory = allocation > in 1988. The code was much worse in 4.4BSD-Lite2: >=20 > old@ if (p->g->sets =3D=3D NULL) > old@ p->g->sets =3D (cset *)malloc(nc * = sizeof(cset)); > old@ else > old@ p->g->sets =3D (cset *)realloc((char = *)p->g->sets, > old@ nc * = sizeof(cset)); >=20 > This just leaked memory. It was improved by assigning to ncs and by > removing the pre-C90 and C++ casts. Now it is unimproved by turning = the > careful use of ncs into garbage and leaving garbage in *p. >=20 > old@ if (p->g->setbits =3D=3D NULL) > old@ p->g->setbits =3D (uch *)malloc(nbytes); > old@ else { > old@ p->g->setbits =3D (uch *)realloc((char = *)p->g->setbits, > old@ nbytes); > old@ /* xxx this isn't right if setbits is now NULL = */ > old@ for (i =3D 0; i < no; i++) > old@ p->g->sets[i].ptr =3D p->g->setbits + = css*(i/CHAR_BIT); > old@ } >=20 > Honest memory mismanagement. It just assumes that malloc() and = realloc() > can't fail. In -current, the function is much simpler and doesn't = have > this code. I think part of the simplication is to use realloc() of = NULL > instead of malloc() to start up. >=20 > @ @@ -1174,7 +1174,7 @@ CHadd(struct parse *p, cset *cs, wint_t @ = if (ch < NC) > @ cs->bmp[ch >> 3] |=3D 1 << (ch & 7); > @ else { > @ - newwides =3D realloc(cs->wides, (cs->nwides + 1) * > @ + newwides =3D reallocf(cs->wides, (cs->nwides + 1) * > @ sizeof(*cs->wides)); > @ if (newwides =3D=3D NULL) { > @ SETERROR(REG_ESPACE); > @ @@ -1203,7 +1203,7 @@ CHaddrange(struct parse *p, cset *cs, wi > @ CHadd(p, cs, min); > @ if (min >=3D max) > @ return; > @ - newranges =3D realloc(cs->ranges, (cs->nranges + 1) * > @ + newranges =3D reallocf(cs->ranges, (cs->nranges + 1) * > @ sizeof(*cs->ranges)); > @ if (newranges =3D=3D NULL) { > @ SETERROR(REG_ESPACE); > @ @@ -1227,7 +1227,7 @@ CHaddtype(struct parse *p, cset *cs, wct > @ for (i =3D 0; i < NC; i++) > @ if (iswctype(i, wct)) > @ CHadd(p, cs, i); > @ - newtypes =3D realloc(cs->types, (cs->ntypes + 1) * > @ + newtypes =3D reallocf(cs->types, (cs->ntypes + 1) * > @ sizeof(*cs->types)); > @ if (newtypes =3D=3D NULL) { > @ SETERROR(REG_ESPACE); > @ @@ -1350,7 +1350,7 @@ enlarge(struct parse *p, sopno size) > @ if (p->ssize >=3D size) > @ return 1; > @ @ - sp =3D (sop *)realloc(p->strip, size*sizeof(sop)); > @ + sp =3D (sop *)reallocf(p->strip, size*sizeof(sop)); > @ if (sp =3D=3D NULL) { > @ SETERROR(REG_ESPACE); > @ return 0; >=20 > Similarly in 4 more cases, except most of them weren't in old = versions. >=20 > @ @@ -1368,7 +1368,7 @@ static void > @ stripsnug(struct parse *p, struct re_guts *g) > @ { > @ g->nstates =3D p->slen; > @ - g->strip =3D (sop *)realloc((char *)p->strip, p->slen * = sizeof(sop)); > @ + g->strip =3D (sop *)reallocf((char *)p->strip, p->slen * = sizeof(sop)); > @ if (g->strip =3D=3D NULL) { > @ SETERROR(REG_ESPACE); > @ g->strip =3D p->strip; >=20 > This was the only realloc() that had not been cleaned up, so using > reallocf() has a chance of improving it. It still has the pre-C90 and > C++ casts. But there is an obvious new bug visible without reading = more > than the patch context: > - the local variable might not be needed now, since the variable = assigned > to, g->strip, is different from the variable realloced, p->strip > - on realloc failure, p->strip becomes invalid > - the error handling is completely broken. It points g->strip at the = old > (now deallocated) storage. This is immediate use of the garbage = pointer > created by using reallocf(). >=20 > Clearly, only realloc() code of the form "p =3D realloc(p, size);", = where the > pointer assigned to is the same as the pointer realloced, can be = improved > by blindly converting it to use reallocf(). >=20 Ah yes, my fault there. > The last function is most interesting. It seems to be to reduce the = size. > So an allocation failure is even less likely than usually, except lots = of > small allocations and reallocations are more likely to waste space = than > save it. But if failure occurs it is almost harmless, and the error > handling of keeping using the previous allocation was good. Maybe = Henry > knew how to program memory allocation after all. (The last function > survived previously-unchanged from 4.4BSDLite-2 except for removing = K&R > support and 'register=92. OK, it=92s pretty tricky, but I agree that reallocf() simply won=92t do = anything here. I will revert. Thanks! Pedro.