Date:      Mon, 05 Oct 2009 10:23:25 -0400
From:      Charles Owens <cowens@greatbaysoftware.com>
To:        freebsd-fs@freebsd.org
Subject:   gjournal crash: "error while writing data (error=6)"
Message-ID:  <4ACA015D.3090800@greatbaysoftware.com>

Hello folks,

We've had a system crash, apparently related to GEOM_JOURNAL, on an i386
system running 7.0-RELEASE-p11.   Here's what we could see on the screen
(formatted for readability):

GEOM_JOURNAL: [copy] Error while writting data (error=6) \
    ad4s1a[WRITE(offset=43561402368, length=16384)]
GEOM_JOURNAL: [copy] Error while writting data (error=6) \
    ad4s1a[WRITE(offset=48868164096, length=16896)]
GEOM_JOURNAL: Error while reading data from ad4s1a (error=6).
mode=0134172, inum=5323776, fs = / 
panic: ffs_valloc: dup alloc
cpuid = 3
Uptime: 119d10h5m43s
Cannot dump. No dump device defined.
GEOM_JOURNAL: Flush of cache of ad4s1a: error=6.
GEOM_JOURNAL: [flush] Error while writting data (error=6) \
    ad4s1a[WRITE(offset=48868197888, length=98816)]
(4 more lines like last one)
Rebooting...
cpu_reset: Stopping other CPUs


The system didn't actually reboot; it just hung at that point. When it was
eventually rebooted manually, it booted just fine. Any thoughts as to
what could be the real problem?  What does "error=6" indicate?
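
For reference, here is a minimal sketch for looking the number up, assuming the "error=" value in these GEOM messages is an ordinary errno code. The helper name describe_errno is made up for illustration. The symbolic name for 6 is ENXIO; the description text comes from the local C library, so it varies by platform (FreeBSD words it as "Device not configured").

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: description' for a numeric errno value."""
    # errno.errorcode maps numeric codes to their symbolic names.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's own wording for the code.
    return f"{name}: {os.strerror(code)}"


print(describe_errno(6))
```

This only names the code; it says nothing about why the device returned it.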

I've done some scouring of the net and found something that may not
directly relate to this crash... but does relate, at least, to my
filesystem configuration.   One of the threads: 
http://markmail.org/message/tamo4r2jho3zdv3z

In the described crash, similar error messages were seen, but with
"error=1".  Ultimately Pawel Dawidek (gjournal author) gave the
diagnosis that the crash was related to the first filesystem in the
slice being set up with an offset of zero, not the correct offset of
16.  Either in this thread or elsewhere I also learned that sysinstall
always uses the zero offset... even though it is not best practice.  Not
a happy discovery.

Looking at our system that crashed... sure enough, zero offset (see the
label below -- both 'a' and 'd' are journaled).  So this then
prompts two questions:

    * Can our crash be explained by the zero offset filesystem
      configuration?
    * If not, separate from the crash, how much should we be worried
      about running a system with gjournal like this?


Thanks very much for any and all assistance,

Charles

# bsdlabel ad4s1

# /dev/ad4s1:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
 a: 77594624        0    4.2BSD     2048 16384 28552 
 b: 24113088 77594624      swap                    
 c: 156296322        0    unused        0     0         # "raw" part, don't edit
 d: 54588610 101707712    4.2BSD     2048 16384 28552 
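
To check other machines for the same layout, a rough sketch like the following could scan bsdlabel output for filesystem partitions that start at offset 0. The function name and the regular expression are my own illustration, not part of any FreeBSD tool, and the sample text is simply the label above pasted in.

```python
import re

# Label text copied from the bsdlabel output above.
LABEL = """\
#        size   offset    fstype   [fsize bsize bps/cpg]
 a: 77594624        0    4.2BSD     2048 16384 28552
 b: 24113088 77594624      swap
 c: 156296322        0    unused        0     0         # "raw" part, don't edit
 d: 54588610 101707712    4.2BSD     2048 16384 28552
"""

# Partition letter, size, offset, fstype; header and comment lines do not match.
ROW = re.compile(r"^\s*([a-h]):\s+(\d+)\s+(\d+)\s+(\S+)")


def zero_offset_filesystems(label_text):
    """Return the letters of 4.2BSD partitions whose offset is 0."""
    hits = []
    for line in label_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        part, offset, fstype = m.group(1), int(m.group(3)), m.group(4)
        if fstype == "4.2BSD" and offset == 0:
            hits.append(part)
    return hits


print(zero_offset_filesystems(LABEL))  # ['a']
```

The 'c' partition also has offset 0 but is skipped because its fstype is "unused"; only 'a' is flagged here.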


-- 

Charles Owens
Great Bay Software | e: cowens@GreatBaySoftware.com

