From: Alan Cox
To: Kris Kennaway
cc: Alan Cox
cc: current@freebsd.org
Date: Tue, 18 Jan 2005 14:31:53 -0600
Subject: Re: fstat triggered INVARIANTS panic in memrw()
Message-ID: <20050118203153.GM3194@noel.cs.rice.edu>
In-Reply-To: <20050117023031.GA12825@xor.obsecurity.org>
List-Id: Discussions about the use of FreeBSD-current

On Sun, Jan 16, 2005 at 06:30:31PM -0800, Kris Kennaway wrote:
> On Sun, Jan 16, 2005 at 06:18:15PM -0800, Kris Kennaway wrote:
> > On Sun, Jan 16, 2005 at 05:47:46PM -0800, Kris Kennaway wrote:
> > > On Sun, Jan 16, 2005 at 03:13:49PM -0600, Alan Cox wrote:
> > >
> > > > The "deadc0de" passed to generic_copyout() comes from the following
> > > > lines in devfs_read_f(c51773b8,eed96c84,ca75c800,flags=0):
> > > >
> > > >         if ((flags & FOF_OFFSET) == 0)
> > > >                 uio->uio_offset = fp->f_offset;
> > > >
> > > > Can you print the contents of the file structure?
> > >
> > > (kgdb) frame 28
> > > #28 0xc04d8d91 in devfs_read_f (fp=0xc25f5dd0, uio=0xe7275c84, cred=0xc3540380, flags=0, td=0xc3c34170)
> > >     at ../../../fs/devfs/devfs_vnops.c:931
> > > 931             error = dsw->d_read(dev, uio, ioflag);
> > > (kgdb) print *fp
> > > $1 = {f_list = {le_next = 0xc25f5bf4, le_prev = 0xc25f52a8}, f_type = 1, f_data = 0xc22f8200, f_flag = 1,
> > >   f_mtxp = 0xc2251fd0, f_ops = 0xc074c140, f_cred = 0xc2b2a900, f_count = 2, f_vnode = 0xc3c6fbdc,
> > >   f_offset = 3735929054, f_gcflag = 0, f_msgcount = 0, f_seqcount = 1, f_nextoff = 3263609792}
> >
> > 3735929054 = 0xdeadc0de.  This same struct file appears all the way
> > back to the syscall frame.  I wonder if fstat is racing with a tty
> > device removal or something (it's certainly racing with something,
> > e.g.:
>
> Devices may not be to blame; I was able to trigger this by running
> fstat in a loop and then running 'make' in /usr/ports/misc/screen
> (with the idea of testing the tty hypothesis :)
>
> An interesting datapoint is that none of the non-i386 package machines
> have hit this problem, but the i386 machines can't stay up for more
> than a few minutes under load (which translates to only a few fstat
> invocations).

The field f_offset is 64 bits wide.  If this were a race between use
and deallocation of the file structure within the kernel, then I would
expect f_offset's value to be 0xdeadc0dedeadc0de, not
0x00000000deadc0de.

More likely than not, the 0xdeadc0de is being passed in from user
level.  The i386 kernel is just not handling it gracefully.

Alan