From: Mikulas Patocka
To: Tim Kientzle
cc: freebsd-hackers@freebsd.org
Date: Tue, 17 Feb 2004 21:10:17 +0100 (CET)
Subject: Re: signed char bug in regexp library
In-Reply-To: <40325F81.502@acm.org>

> > Hi
> >
> > I ripped regexp library from FreeBSD 4 and use it in another program. I
> > get random crashes because the library casts char to int and uses it as
> > array index ... the most obvious case is engine.i:189:
> >     register char *dp;
> >     dp += charjump[(int)*dp];
> > but there are many more and I'm unable to spot them all.
>
> This problem was fixed in 2000 by offsetting the array
> so that accesses such as the above work correctly.
> A key part of the fix is this line in regcomp.c:
>
>     g->charjump = &g->charjump[-(CHAR_MIN)];
>
> Here's the log entry:
>
> ----------------------------
> revision 1.20
> date: 2000/07/07 07:46:36;  author: dcs;  state: Exp;  lines: +6 -4
> Deal with the signed/unsigned chars issue in a more proper manner. We
> use a CHAR_MIN-based array, like elsewhere in the code.
>
> Remove a number of unused variables (some due to the above change, one
> that was left after a number of optimizing steps through the source).
>
> Brucified by: bde
> ----------------------------

Sorry for the bogus bug report; I've got it now. CHAR_MAX was incorrectly
defined with an unsigned type, so loops like

    int i;
    for (i = CHAR_MIN; i <= CHAR_MAX; i++)

in the regexp library didn't work: the comparison converts i to unsigned,
so a negative CHAR_MIN compares greater than CHAR_MAX and the loop body
never runs. Once I changed CHAR_MAX to a signed type, it worked fine. Of
course, this doesn't happen on FreeBSD, because its CHAR_MAX is signed.

Mikulas
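
For anyone hitting the same thing, here is a minimal sketch of both points from this thread: a table offset by CHAR_MIN so that it can be indexed by a possibly negative char, and the loop that fails to fill it when the upper bound has an unsigned type. All names (jump_storage, jump, bad_char_max and the three functions) are made up for illustration and are not taken from the FreeBSD regex sources. The sketch also assumes the broken CHAR_MAX had type unsigned int; the message above only says it was unsigned.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical names throughout; this is not the FreeBSD regex code. */
#define TABLE_SIZE (CHAR_MAX - CHAR_MIN + 1)

static int jump_storage[TABLE_SIZE];

/* Pointer offset by -CHAR_MIN, so jump[c] is inside jump_storage for
 * every char value c, including negative ones when char is signed. */
static int *const jump = &jump_storage[-(CHAR_MIN)];

/* Fill the table with a correctly typed (int) upper bound.
 * Returns the number of iterations executed. */
int count_with_signed_bound(void)
{
    int n = 0;
    int i;

    for (i = CHAR_MIN; i <= CHAR_MAX; i++) {
        jump[i] = i - CHAR_MIN;
        n++;
    }
    assert(n == TABLE_SIZE);
    return n;
}

/* Same loop shape, but the bound is unsigned (assumed: unsigned int).
 * The cast on i spells out the conversion the compiler would apply
 * implicitly in "i <= bad_char_max". */
int count_with_unsigned_bound(void)
{
    const unsigned int bad_char_max = (unsigned int)CHAR_MAX;
    int n = 0;
    int i;

    for (i = CHAR_MIN; (unsigned int)i <= bad_char_max; i++)
        n++;
    return n;
}

/* Look up a table entry directly by char value, as the offset allows. */
int jump_for(char c)
{
    return jump[(int)c];
}
```

Where char is signed, count_with_unsigned_bound() returns 0 because (unsigned int)CHAR_MIN is a very large value, so the table would never be initialized. Where char is unsigned, CHAR_MIN is 0 and both loops behave the same, which fits the problem showing up on only some platforms.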