From owner-freebsd-hackers@FreeBSD.ORG Tue Feb 17 10:11:12 2004
Message-ID: <40325939.6000104@acm.org>
Date: Tue, 17 Feb 2004 10:11:05 -0800
From: Tim Kientzle
Reply-To: kientzle@acm.org
To: Mikulas Patocka
cc: freebsd-hackers@freebsd.org
Subject: Re: signed char bug in regexp library

Mikulas Patocka wrote:
> Hi
>
> I ripped regexp library from FreeBSD 4 and use it in another program. I
> get random crashes because the library casts char to int and uses it as
> array index ... the most obvious case is engine.i:189:
>         register char *dp;
>         dp += charjump[(int)*dp];
> but there are many more and I'm unable to spot them all.
>
> When i compile library with -funsigned-char, it works fine. But it isn't
> compiled with that flag in FreeBSD.

Mikulas,

Could you verify that programs in FreeBSD 4 crash because of this?
That would provide incentive to get it fixed.

One easy fix, by the way, is:

        dp += charjump[(int)(unsigned char)*dp];

For what it's worth, the code probably isn't assuming
unsigned characters; it's probably assuming ASCII.  ;-)

Tim Kientzle
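
To make the failure mode concrete, here is a minimal sketch of the two index computations. It is not taken from the FreeBSD regex source: TABLE_SIZE, index_as_written, index_with_fix and skip are hypothetical names, the table is a stand-in for charjump, and the sketch assumes 8-bit bytes on a two's-complement machine. It reads the byte through signed char explicitly so the result does not depend on whether plain char is signed on a given platform.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the size of a charjump-style table:
 * one entry per possible byte value. */
#define TABLE_SIZE (UCHAR_MAX + 1)

/* Index as computed by the quoted code when char is signed:
 * any byte >= 0x80 yields a negative index. */
int index_as_written(const signed char *dp)
{
    return (int)*dp;
}

/* Index with the suggested cast: always in 0..UCHAR_MAX,
 * whether plain char is signed or unsigned. */
int index_with_fix(const char *dp)
{
    return (int)(unsigned char)*dp;
}

/* Advance dp by the table entry for the byte it points at,
 * using the safe index. */
const char *skip(const char *dp, const int *charjump)
{
    int idx = index_with_fix(dp);

    assert(idx >= 0 && idx < TABLE_SIZE);
    return dp + charjump[idx];
}
```

With a byte such as 0xE9, the first function returns a negative value, so the lookup would read memory before the start of the table; the second returns 233. Compiling with -funsigned-char has the same effect on every plain char in the program, whereas the cast fixes one lookup at a time.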