From: Dominic Marks
Organization: GoodforBusiness.co.uk
To: freebsd-stable@freebsd.org
Cc: pjd@freebsd.org
Date: Wed, 29 Jun 2005 19:45:17 +0100
Subject: Re: graid3 + rsync + 5.4-STABLE repeatable panic (Fatal trap 12: page fault while in kernel mode)
Message-Id: <200506291945.17914.dom@goodforbusiness.co.uk>
In-Reply-To: <200506291642.50119.dom@goodforbusiness.co.uk>
References: <200506291642.50119.dom@goodforbusiness.co.uk>

On Wednesday 29 June 2005 16:42, Dominic
Marks wrote:
> Hello,
>
> I'm trying to use graid3 to create a raid volume from three
> 250GB SATA discs. I can successfully label, format, and mount
> the disc. The problem arises when I try and migrate some data
> on to the new volume. I'm using rsync to do this from over the
> local network, unfortunately this seems to produce an
> immediate and reproducible panic (hand copied):
>
> Fatal trap 12: page fault while in kernel mode
>
> fault virtual address   = 0xc30f8000
> fault code              = supervisor write, page not present
> instruction pointer     = 0x8:0xc05e9783
> stack pointer           = 0x10:0xd8030c38
> frame pointer           = 0x10:0xd8030c80
> code segment            = base 0x0, limit 0xfffff type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 617 (g_raid3 raid)
> trap number             = 12
> panic: page fault

Having recompiled, I can no longer reproduce the panic. I think I may
have caused it myself: I had forgotten that I had been tinkering with
some values in sys/sys/param.h last week, and it didn't ring a bell
when the system went down. I'd been running with MAXPHYS and DFLTPHYS
at 256, and it seems graid3 does not like one of those parameters
being raised. I suspect it's DFLTPHYS, and that perhaps graid3 depends
on its value for some calculations. This is pure speculation.

My apologies for the incorrect report.

> Other programs (touch, ls, diskinfo, etc) do not seem to provoke
> the panic, but rsync will kill the system within a second.
>
> I got a dump (once), but I think it is corrupt in some way
> because I have not been able to get a backtrace or any other
> useful data from it.
>
> # kgdb kernel.debug /usr/crash/vmcore.0
> kgdb: kvm_read: invalid address (f9)
>
> (This line is printed again, and again, and again ...)
>
> This may be because I compiled my debugging kernel after I had
> installed the system, although it should have been an identical
> source tree ... I'm currently rebuilding the system to
> the freshest available -STABLE in the hope that may give a
> full backtrace.
>
> FreeBSD mrt.helenmarks.co.uk 5.4-STABLE FreeBSD 5.4-STABLE #0
> Mon Jun 27 09:34:02 BST 2005
> root@mrt.helenmarks.co.uk:/usr/obj/usr/src/sys/DEV i386
>
> The only thing slightly odd about the machine is that each
> disc is on its own SATA controller. One disc is attached to an
> Intel ICH6, the other two are attached to two Silicon Image (3112)
> based cards. The root device is ad2, since the additional cards
> have pushed themselves to the front. This is a temporary setup
> to facilitate migration of data from system to system.
>
> If I can do anything to help track the problem down, please say.
> I really want this to work, and I have some time in which to run
> tests.
>
> * A side note, I have noticed that the panic is often accompanied by
> an ATA DMA timeout (ad1). Could this cause the panic to occur?
>
> Copyright (c) 1992-2005 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
> 1994 The Regents of the University of California. All rights
> reserved.
> FreeBSD 5.4-STABLE #0: Mon Jun 27 09:34:02 BST 2005
> root@mrt.helenmarks.co.uk:/usr/obj/usr/src/sys/DEV
> WARNING: debug.mpsafenet forced to 0 as ipsec requires Giant
> WARNING: MPSAFE network stack disabled, expect reduced performance.
> ACPI APIC Table:
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Celeron(R) CPU 2.53GHz (2527.01-MHz 686-class CPU)
> Origin = "GenuineIntel" Id = 0xf41 Stepping = 1
>
> Features=0xbfebfbffPGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PB
>E>
> real memory = 526958592 (502 MB)
> avail memory = 509628416 (486 MB)
> ioapic0: Changing APIC ID to 8
> ioapic0 irqs 0-23 on motherboard
> lapic0: Forcing LINT1 to edge trigger
> npx0: on motherboard
> npx0: INT 16 interface
> acpi0: on motherboard
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> cpu0: on acpi0
> acpi_button0: on acpi0
> pcib0: port 0xcf8-0xcff on acpi0
> pci0: on pcib0
> pcib1: irq 16 at device 1.0 on pci0
> pci1: on pcib1
> pci0: at device 2.0 (no driver attached)
> pcib2: irq 16 at device 28.0 on pci0
> pci2: on pcib2
> bge0: mem
> 0xdfdf0000-0xdfdfffff irq 16 at device 0.0 on pci2
> miibus0: on bge0
> brgphy0: on miibus0
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> bge0: Ethernet address: 00:11:11:c3:2c:91
> pcib3: irq 17 at device 28.1 on pci0
> pci3: on pcib3
> pci0: at device 29.0 (no driver attached)
> pci0: at device 29.1 (no driver attached)
> pci0: at device 29.2 (no driver attached)
> pci0: at device 29.3 (no driver attached)
> pci0: at device 29.7 (no driver attached)
> pcib4: at device 30.0 on pci0
> pci4: on pcib4
> atapci0: port
> 0xdce0-0xdcef,0xdcb4-0xdcb7,0xdcc8-0xdccf,0xdcb0-0xdcb3,0xdcc0-0xdcc7
> mem 0xdfaffc00-0xdfaffdff irq 17 at device 1.0 on pci4
> ata2: channel #0 on atapci0
> ata3: channel #1 on atapci0
> atapci1: port
> 0xdcf0-0xdcff,0xdcbc-0xdcbf,0xdcd8-0xdcdf,0xdcb8-0xdcbb,0xdcd0-0xdcd7
> mem 0xdfaffe00-0xdfafffff irq 18 at device 2.0 on pci4
> ata4: channel #0 on atapci1
> ata5: channel #1 on atapci1
> isab0: at device 31.0 on pci0
> isa0: on isab0
> atapci2: port
> 0xffa0-0xffaf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 16 at device
> 31.1 on pci0
> ata0: channel #0 on atapci2
> ata1: channel #1 on atapci2
> atapci3: port
> 0xfea0-0xfeaf,0xfe30-0xfe33,0xfe20-0xfe27,0xfe10-0xfe13,0xfe00-0xfe07
> irq 20 at device 31.2 on pci0
> ata6: channel #0 on atapci3
> ata7: channel #1 on atapci3
> ichsmb0: port 0xece0-0xecff irq 17 at device 31.3
> on pci0
> atkbdc0: port 0x64,0x60 irq 1 on acpi0
> atkbd0: irq 1 on atkbdc0
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10
> on acpi0
> sio0: type 16550A
> pmtimer0 on isa0
> orm0: at iomem
> 0xcf800-0xcffff,0xce000-0xcf7ff,0xc9800-0xcdfff,0xc0000-0xc97ff on
> isa0
> sc0: at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on
> isa0
> ppc0: parallel port not found.
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> Timecounter "TSC" frequency 2527010839 Hz quality 800
> Timecounters tick every 1.250 msec
> IPsec: Initialized Security Association Processing.
> ad0: 238475MB [484521/16/63] at
> ata4-master SATA150
> ad1: 238475MB [484521/16/63] at
> ata5-master SATA150
> ad2: 76319MB [155061/16/63] at
> ata6-master SATA150
> ad3: 238475MB [484521/16/63] at
> ata7-master SATA150
> Mounting root from ufs:/dev/ad2s1a
> WARNING: / was not properly dismounted
> WARNING: /usr was not properly dismounted
> WARNING: /var was not properly dismounted
>
> Thanks very much,

--
Dominic
GoodforBusiness.co.uk
I.T. Services for SMEs in the UK.
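
P.S. For anyone who wants to rule out the same mistake before filing a
report, here is a rough sketch of a check for locally modified MAXPHYS
and DFLTPHYS values. It is not part of any FreeBSD tool. The stock
values in STOCK (64 KB and 128 KB) are what I believe sys/sys/param.h
ships with on 5.x, so treat them as an assumption and compare against a
pristine tree. The sample text reads my "256" as 256 * 1024, which is
also an assumption, and the function names are made up for this
example.

```python
import re

# Assumed stock values for 5.x sys/sys/param.h; verify against a clean tree.
STOCK = {"DFLTPHYS": 64 * 1024, "MAXPHYS": 128 * 1024}

# Matches lines such as "#define MAXPHYS (128 * 1024)" where the value is
# a plain product of integer literals, optionally wrapped in parentheses.
DEFINE_RE = re.compile(
    r"^\s*#\s*define\s+(DFLTPHYS|MAXPHYS)\s+"
    r"\(?\s*([0-9]+(?:\s*\*\s*[0-9]+)*)\s*\)?",
    re.M,
)


def parse_phys(text):
    """Return {name: bytes} for the DFLTPHYS/MAXPHYS defines found in text."""
    found = {}
    for name, expr in DEFINE_RE.findall(text):
        value = 1
        for factor in expr.split("*"):
            value *= int(factor)  # int() tolerates surrounding whitespace
        found[name] = value
    return found


def non_stock(text):
    """Return only the parsed values that differ from the assumed stock ones."""
    return {n: v for n, v in parse_phys(text).items() if v != STOCK[n]}


# Hypothetical excerpt resembling a hand-edited param.h.
sample = """
#ifndef DFLTPHYS
#define DFLTPHYS (256 * 1024)
#endif
#ifndef MAXPHYS
#define MAXPHYS (256 * 1024)
#endif
"""

print(non_stock(sample))
```

To use it on a real tree, pass the contents of sys/sys/param.h to
non_stock(); an empty result means neither value differs from the
assumed defaults. It says nothing about whether graid3 actually depends
on either value, which remains speculation on my part.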