Date: Sat, 17 May 2003 10:40:20 -0400
From: Ali Bahar
To: freebsd-hackers@freebsd.org
Subject: Re: cache_purge > cache_zap segmentation fault

Yup, _I_ was doing the scribbling! There was no bug in the filesystem code.

There was another box running similar code, but going thru different tests. It rarely experienced this crash. So comparing the tests, it became obvious which area of our module to focus on. From there, visual inspection was enough to find the culprit.
A combination of insufficient malloc size and excessive offsets caused writes into the next heap segment. To confirm that this segment belonged to the namecache, the write address was printed while the test was carried out. ... Eventually, the seg fault occurred, and the namecache node involved was one of the write addresses. QED!

The fix was tested by running the test repeatedly in batch, while running 'ls -lR /' -- which ought to have exercised the namecache code mightily! ;-)

Thanks much for everyone's help.

regards,
ali

On Fri, May 09, 2003 at 10:43:13AM -0400, Ali wrote:
> On Fri, May 09, 2003 at 07:40:25AM +0100, David Malone wrote:
> > Is it possible that one of your modules is somehow stomping on
> > memory that doesn't belong to it?
> The possibility of memory overwrite by an in-development module is
> about 3 orders of magnitude higher than the possibility of a name
> cache bug. I can't yet see how it is happening, but I've seen weirder
> coincidences in scribblers.

-- 
Right of Return for all Palestinian refugees.
Universal Declaration of Human Rights. Article 13.
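
For anyone chasing the same class of bug (an undersized allocation plus offsets that run past its end), here is a minimal user-space sketch of a bounds-checked write that logs the offending address instead of scribbling on the neighbouring allocation. This is not the module's code: `struct buf` and `buf_write` are hypothetical names invented for illustration, and a kernel module would use the kernel's own allocator and logging rather than stdio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: keeps the allocated size next to the pointer. */
struct buf {
	unsigned char *base;	/* start of the allocation */
	size_t size;		/* bytes actually allocated */
};

/*
 * Copy len bytes to b->base + off, but only if the whole range fits
 * inside the allocation. Returns 0 on success. On an out-of-range
 * request it logs the base address, offset and sizes, leaves memory
 * untouched, and returns -1.
 */
static int
buf_write(struct buf *b, size_t off, const void *src, size_t len)
{
	assert(b != NULL && b->base != NULL);

	/* Written this way so that off + len cannot overflow. */
	if (off > b->size || len > b->size - off) {
		fprintf(stderr, "overrun: base %p off %zu len %zu size %zu\n",
		    (void *)b->base, off, len, b->size);
		return (-1);
	}
	memcpy(b->base + off, src, len);
	return (0);
}
```

Routing every write through a helper like this makes the logged addresses easy to compare against the address involved in a later fault, which is roughly the confirmation step described above.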