From: Luigi Rizzo
To: "Bjoern A. Zeeb"
Cc: freebsd-current FreeBSD
Date: Fri, 18 May 2012 12:57:47 +0200
Subject: Re: SUJ file system corruption.
Message-ID: <20120518105747.GB5494@onelab2.iet.unipi.it>
In-Reply-To: <20769DCB-D3EF-49C6-A791-E190A5CCECAE@lists.zabbadoz.net>
List-Id: Discussions about the use of FreeBSD-current

On Fri, May 18, 2012 at 10:18:47AM +0000, Bjoern A. Zeeb wrote:
>
> On 13. May 2012, at 22:35 , Tim Kientzle wrote:
>
> > FYI: Saw a crash due to filesystem corruption when running SUJ.
> >
> > This is on a ARM AM335x system (BeagleBone) that is
> > still pretty experimental, so I certainly cannot rule out other
> > problems, but in case it means something to
> > someone, here's the scenario:
> >
> > Reset the board to reboot (which is routine for these
> > small embedded boards) and when it came back up
> > it went through SUJ recovery, and then a little later
> > the kernel panicked with this stack trace:
> >
> > rm: /var/run/dmesg.boot: Bad file descriptor
> > panic: ffs_write: type 0xc1e86660 0 (0,1024)
>
>
> Can you tell us if this was HEAD, stable/9 or 9.0-RELEASE?

On stable/9 and amd64, as of 2-3 months ago, I am seeing these panics
every time the system needs to recover from a crash (which, fortunately,
is very rare).

On the subsequent reboot the system keeps crashing randomly as soon
as I load disk-intensive applications (often browsers or most things
that run under X11), but sometimes the crashes happen even before that.
I then need to reboot in single-user mode and do a manual fsck.

I tried to run fsck using the journal, but after it completes,
a subsequent non-journal fsck still finds errors.

In the end, I am not sure it makes sense to keep SU+J active
on the disk. I am so afraid of crashes that I no longer dare
to run experimental kernels or modules on my main workstation!

cheers
luigi