Message-ID: <40325F81.502@acm.org>
Date: Tue, 17 Feb 2004 10:37:53 -0800
From: Tim Kientzle
To: Mikulas Patocka
cc: freebsd-hackers@freebsd.org
Subject: Re: signed char bug in regexp library
List-Id: Technical Discussions relating to FreeBSD

Mikulas Patocka wrote:
> Hi
>
> I ripped regexp library from FreeBSD 4 and use it in another program. I
> get random crashes because the library casts char to int and uses it as
> array index ... the most obvious case is engine.i:189:
>     register char *dp;
>     dp += charjump[(int)*dp];
> but there are many more and I'm unable to spot them all.

This problem was fixed in 2000 by offsetting the array so that
accesses such as the above work correctly.
A key part of the fix is this line in regcomp.c:

    g->charjump = &g->charjump[-(CHAR_MIN)];

Here's the log entry:

----------------------------
revision 1.20
date: 2000/07/07 07:46:36;  author: dcs;  state: Exp;  lines: +6 -4
Deal with the signed/unsigned chars issue in a more proper manner. We
use a CHAR_MIN-based array, like elsewhere in the code.

Remove a number of unused variables (some due to the above change, one
that was left after a number of optimizing steps through the source).

Brucified by: bde
----------------------------
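
To illustrate why the offset makes a signed char index safe, here is a
minimal sketch of the same idea outside the regex library. The names
jump_table, by_char, jump_table_init and jump_for are hypothetical and
exist only for this example; the only piece taken from regcomp.c is the
&array[-(CHAR_MIN)] adjustment. It assumes the backing storage has one
slot for every value a char can hold.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of distinct char values, whether char is signed or unsigned. */
#define NCHARS (CHAR_MAX - CHAR_MIN + 1)

/* Hypothetical stand-in for the table held in the regex guts. */
struct jump_table {
    int storage[NCHARS]; /* real backing store, indexed 0..NCHARS-1 */
    int *by_char;        /* shifted view, indexed CHAR_MIN..CHAR_MAX */
};

static void jump_table_init(struct jump_table *t, int default_jump)
{
    int i;

    assert(t != NULL);
    for (i = 0; i < NCHARS; i++)
        t->storage[i] = default_jump;

    /*
     * Same adjustment as g->charjump = &g->charjump[-(CHAR_MIN)].
     * With signed char, -(CHAR_MIN) is positive, so by_char points
     * into the middle of storage and by_char[CHAR_MIN] is storage[0].
     * With unsigned char, CHAR_MIN is 0 and nothing moves.
     */
    t->by_char = &t->storage[-(CHAR_MIN)];
}

/* A plain (possibly negative) char is now a valid index. */
static int jump_for(const struct jump_table *t, char c)
{
    return t->by_char[(int)c];
}
```

With this layout, an expression of the form charjump[(int)*dp] stays
within bounds for every char value, including bytes above 0x7f on
platforms where char is signed. One consequence of the design: if the
storage were heap-allocated, the shifted pointer would have to be moved
back by the same amount before being passed to free().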