From: "Alan L. Cox"
Organization: iMimic Networking, Inc.
Date: Thu, 08 May 2003 14:43:15 -0500
To: Mark Santcroos
Cc: alc@freebsd.org, Poul-Henning Kamp, freebsd-current@freebsd.org,
    Robert Watson, Heiko Schaefer
Subject: Re: data corruption with current (maybe sis chipset related?)
Message-ID: <3EBAB353.4E5F8440@imimic.com>
References: <20030506162410.M66653@daneel.foundation.hs>
    <20030506171632.GA767@laptop.6bone.nl>
    <20030508111807.GB1390@laptop.6bone.nl>
    <20030508155123.G78057@daneel.foundation.hs>
    <20030508142234.GA1359@laptop.6bone.nl>

Mark Santcroos wrote:
>
> On Thu, May 08, 2003 at 04:05:29PM +0200, Heiko Schaefer wrote:
> > > On Tue, May 06, 2003 at 07:16:32PM +0200, Mark Santcroos wrote:
> > > > On Tue, May 06, 2003 at 04:41:30PM +0200, Heiko Schaefer wrote:
> > > > > does anyone know of any (freebsd-current) issues that might be causing
> > > > > this - or have any idea on how i can further rule out anything of this
> > > > > kind ?
> > > >
> > > > Try this in your kernel config:
> > > >
> > > > options DISABLE_PSE
> > > > options DISABLE_PG_G
> > ok. i have done my copying and checksumming orgy with these options in the
> > kernel - and it seems that i do not get any corruption anymore.
> > i will re-run my test once again now, just to be safe.
> >
> > after that: does it make sense to single out which of the two options is
> > the relevant one ?
>
> No.
>
> > i'm also clueless what exactly these options do, and what exactly we've
> > ruled out now. could the lack of data-corruption just be some side-effect?
>
> They prevent the usage of 4M pages. It's a CPU bug that will cause data
> corruption if enabled.
>

For the record, this is not completely correct. DISABLE_PSE does
control the use of 4MB pages. DISABLE_PG_G is, however, something
entirely different.

Normally, on a process context switch TLB entries for the old address
space are flushed. If a TLB entry corresponds to an in-kernel
virtual-to-physical mapping, there is no reason to flush that entry
because all processes share the same kernel address space. Setting the
"G"lobal bit on a page table entry accomplishes this. In other words,
the TLB entry persists across context switches. DISABLE_PG_G disables
this.

The problem with PG_G is that the "flush all" TLB entries operation,
just like a context switch, has no effect on entries marked as global.
Such entries have to be flushed by specifying their address in a "flush
single page" operation. Sometimes the failure to flush an entry is
covered up by a combination of DISABLE_PG_G and a lucky context switch.

So, in summary, it does make sense to try these options separately.

Alan
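
The flushing behaviour described above can be sketched with a toy model.
This is only an illustration in Python, not kernel code: the names
ToyTlb, TlbEntry and stale_after_remap are hypothetical, and the model
assumes that a context switch and a "flush all" both drop exactly the
non-global entries, as the message describes.

```python
from dataclasses import dataclass


@dataclass
class TlbEntry:
    frame: int               # physical frame this entry translates to
    is_global: bool = False  # models the "G" bit on the page table entry


class ToyTlb:
    """Hypothetical toy TLB; not how any real kernel or CPU is structured."""

    def __init__(self, pg_g_enabled=True):
        # pg_g_enabled=False stands in for a kernel built with DISABLE_PG_G.
        self.pg_g_enabled = pg_g_enabled
        self.entries = {}

    def insert(self, vaddr, frame, kernel=False):
        # Kernel mappings are marked global only when PG_G is in use.
        self.entries[vaddr] = TlbEntry(frame, kernel and self.pg_g_enabled)

    def flush_all(self):
        # Models "flush all" or a context switch: global entries survive.
        self.entries = {v: e for v, e in self.entries.items() if e.is_global}

    def flush_page(self, vaddr):
        # Models "flush single page": removes the entry even if it is global.
        self.entries.pop(vaddr, None)

    def lookup(self, vaddr):
        # Returns the cached frame, or None on a TLB miss.
        entry = self.entries.get(vaddr)
        return None if entry is None else entry.frame


def stale_after_remap(pg_g_enabled, use_flush_page):
    """Return True if the TLB still holds the old frame after a remap."""
    kva, old_frame = 0xC0001000, 1   # arbitrary illustrative values
    tlb = ToyTlb(pg_g_enabled)
    tlb.insert(kva, old_frame, kernel=True)
    # The page table is now assumed to point kva at a different frame.
    if use_flush_page:
        tlb.flush_page(kva)          # correct invalidation for a global entry
    else:
        tlb.flush_all()              # the bug: global entries are untouched
    return tlb.lookup(kva) == old_frame


if __name__ == "__main__":
    for pg_g in (True, False):
        for single in (True, False):
            print(pg_g, single, stale_after_remap(pg_g, single))
```

In this model the stale translation shows up only when PG_G is in use
and the single-page flush is missing. With PG_G disabled, the same
missing flush is hidden by the next "flush all", which is the kind of
cover-up that makes testing the two options separately informative.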