Date: Wed, 27 Aug 2008 05:32:21 +0400
From: Andrey Chernov <ache@nagual.pp.ru>
To: Gabor Kovesdan <gabor@kovesdan.org>
Cc: current@freebsd.org, Max Khon <fjoe@freebsd.org>, hackers@freebsd.org, krion@freebsd.org, dougb@freebsd.org
Subject: Re: CFT: BSD grep
Message-ID: <20080827013221.GA82176@nagual.pp.ru>
In-Reply-To: <48B44A7D.3070108@kovesdan.org>
References: <48B44A7D.3070108@kovesdan.org>

On Tue, Aug 26, 2008 at 08:25:01PM +0200, Gabor Kovesdan wrote:
> Hello all,
>
> I've reviewed BSD grep based on your comments and the bug reports I
> received. The new version is committed to the ports tree as
> textproc/bsdgrep and there is a base patch available:
> http://kovesdan.org/patches/grep-base.diff

Just from a quick look at the sources...

This code looks suspicious:

	wend = sscanf(&l->dat[pmatch.rm_eo], "%lc", &wend);

Perhaps it should be

	if (sscanf(&l->dat[pmatch.rm_eo], "%lc", &wend) != 1)
		r = REG_NOMATCH;

The next thing is that perhaps each r = REG_NOMATCH; case should be
isolated from the others in this block (with "else if"?). For example,
a failing mbstowcs() can leave the buffer for sscanf() full of junk.

	wbegin = grep_malloc(mbstowcs(NULL, l->dat, pmatch.rm_so));

grep_malloc() here could terminate the program when mbstowcs() fails on
an invalid multibyte sequence, but it really should only set
r = REG_NOMATCH; Think about files which, for various reasons, may
contain more than just valid multibyte sequences.

fgrepcomp() uses toupper()/tolower() where it should use their
wide-character analogs (multibyte characters can be in the pattern too).
There are also many other places where the pattern is treated as a
string of single-byte characters: fastcomp() etc.

grep_cmp() compares single chars with toupper(data[]) too.

There must be no plain ctype usage anywhere in the data _and_ pattern
handling code.

-- 
http://ache.pp.ru/
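
To make the two points above concrete, here is a minimal sketch, not taken from the bsdgrep sources. The helper names prefix_to_wcs() and wcs_equal_nocase() are hypothetical. It assumes the caller has already selected the locale with setlocale() and only wants REG_NOMATCH when the data is not valid multibyte text. Note also that mbstowcs() ignores its length argument when the destination is NULL, so the sketch copies the byte prefix into a NUL-terminated buffer before measuring it.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: convert the first nbytes of s into a newly
 * allocated wide string stored in *out.
 * Returns 0 on success, REG_NOMATCH if the bytes are not a valid
 * multibyte sequence in the current locale (instead of aborting),
 * and REG_ESPACE if memory cannot be allocated.
 */
int
prefix_to_wcs(const char *s, size_t nbytes, wchar_t **out)
{
	char *tmp;
	size_t wlen;

	*out = NULL;
	if ((tmp = malloc(nbytes + 1)) == NULL)
		return (REG_ESPACE);
	memcpy(tmp, s, nbytes);
	tmp[nbytes] = '\0';

	/* With a NULL destination mbstowcs() only counts wide chars. */
	wlen = mbstowcs(NULL, tmp, 0);
	if (wlen == (size_t)-1) {
		/* Invalid sequence: report "no match", do not allocate. */
		free(tmp);
		return (REG_NOMATCH);
	}

	*out = malloc((wlen + 1) * sizeof(wchar_t));
	if (*out == NULL) {
		free(tmp);
		return (REG_ESPACE);
	}
	mbstowcs(*out, tmp, wlen + 1);
	free(tmp);
	return (0);
}

/*
 * Hypothetical helper: compare up to n wide characters ignoring case,
 * using towlower() rather than the single-byte tolower()/toupper().
 * Returns 1 if equal, 0 otherwise.
 */
int
wcs_equal_nocase(const wchar_t *a, const wchar_t *b, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (towlower((wint_t)a[i]) != towlower((wint_t)b[i]))
			return (0);
		if (a[i] == L'\0')
			break;	/* both strings ended at the same place */
	}
	return (1);
}
```

A caller in the match loop would then check the return value of prefix_to_wcs() and set r = REG_NOMATCH on failure, rather than passing a possible (size_t)-1 straight to an allocator.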