Date:      Fri, 09 Sep 2011 12:17:06 +0700
From:      Eugene Grosbein <egrosbein@rdtc.ru>
To:        FreeBSD Stable <freebsd-stable@FreeBSD.org>
Subject:   gmirror+gjournal often makes inconsistens file systems
Message-ID:  <4E69A152.6090408@rdtc.ru>

Hi!

For a long time I have been experiencing the same UFS2 file system problems on several 8.2 systems
running gmirror+gjournal+async. After an unclean shutdown, kernel panic or power failure,
gjournal makes fsck skip its checks, and that is why I use it.

But quite often my /var partition (and sometimes others) still has severe damage in it,
and running with such a /var mounted read-write leads to further panics, hangs and so on.

For example, I have an 8.2-STABLE system with drives ad4 and ad6 combined into /dev/mirror/gm0.
I have just removed ad6 from the mirror, ran fsck -y manually on all its file systems,
shut this machine down cleanly again and then booted it from ad6,
leaving the mirror on ad4 neither mounted nor checked.

Then I ran fsck -y /dev/mirror/gm0.journals1e (/var on the mirrored drive)
and got LOTS of serious errors on a presumably clean file system.
Of course, I had seen the same errors while checking ad6 after it was removed from the running mirror.
I have the gmirror auto-sync feature turned ON. I tried turning it OFF, but that only
increased the frequency of such damage left unfixed after reboot.

It seems that gjournal cannot handle system crashes reliably, can it?
I basically run it without any manual tuning. I have also tried to tune it, without luck:
it works fine when there are no unclean shutdowns, but it is there to deal with them in the first place.

# fsck -t ffs -y /dev/mirror/gm0.journals1e
** /dev/mirror/gm0.journals1e
** Last Mounted on /var
** Phase 1 - Check Blocks and Sizes
3955872 DUP I=989242
3955873 DUP I=989242
3955874 DUP I=989242
3955875 DUP I=989242
3955876 DUP I=989242
3955877 DUP I=989242
3955878 DUP I=989242
3955879 DUP I=989242
3955880 DUP I=989242
3955881 DUP I=989242
3955882 DUP I=989242
EXCESSIVE DUP BLKS I=989242
CONTINUE? yes

INCORRECT BLOCK COUNT I=989242 (448 should be 424)
CORRECT? yes

3955888 DUP I=989289
3955889 DUP I=989289
3955890 DUP I=989289
3955891 DUP I=989289
3955892 DUP I=989289
3955893 DUP I=989289
3955894 DUP I=989289
3955895 DUP I=989289
** Phase 1b - Rescan For More DUPS
3955872 DUP I=989242
3955873 DUP I=989242
3955874 DUP I=989242
3955875 DUP I=989242
3955876 DUP I=989242
3955877 DUP I=989242
3955878 DUP I=989242
3955879 DUP I=989242
3955880 DUP I=989242
3955881 DUP I=989242
3955888 DUP I=989242
3955889 DUP I=989242
3955890 DUP I=989242
3955891 DUP I=989242
3955892 DUP I=989242
3955893 DUP I=989242
3955894 DUP I=989242
3955895 DUP I=989242
** Phase 2 - Check Pathnames
DUP/BAD  I=989289  OWNER=root MODE=100640
SIZE=14367 MTIME=Sep  9 11:30 2011 
FILE=/log/kernel.log

REMOVE? yes

DUP/BAD  I=989242  OWNER=root MODE=100640
SIZE=202631 MTIME=Sep  8 19:52 2011 
FILE=/log/mpd.log.0

REMOVE? yes

** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=376866  OWNER=root MODE=140666
SIZE=0 MTIME=Sep  5 12:27 2011 
CLEAR? yes

UNREF FILE I=376868  OWNER=root MODE=140666
SIZE=0 MTIME=Sep  7 20:30 2011
CLEAR? yes

UNREF FILE I=376869  OWNER=root MODE=140666
SIZE=0 MTIME=Sep  8 11:17 2011
CLEAR? yes

UNREF FILE I=376870  OWNER=root MODE=140666
SIZE=0 MTIME=Sep  8 12:11 2011
CLEAR? yes

BAD/DUP FILE I=989242  OWNER=root MODE=100640
SIZE=202631 MTIME=Sep  8 19:52 2011
CLEAR? yes

UNREF FILE  I=989259  OWNER=root MODE=100640
SIZE=648 MTIME=Aug 27 00:00 2011
RECONNECT? yes

BAD/DUP FILE I=989289  OWNER=root MODE=100640
SIZE=14367 MTIME=Sep  9 11:30 2011
CLEAR? yes
LINK COUNT FILE I=989293  OWNER=root MODE=100640
SIZE=961 MTIME=Sep  9 11:26 2011  COUNT 1 SHOULD BE 2
ADJUST? yes

UNREF FILE  I=989327  OWNER=root MODE=100640
SIZE=114 MTIME=Aug 27 00:00 2011
RECONNECT? yes

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes

SUMMARY INFORMATION BAD
SALVAGE? yes

BLK(S) MISSING IN BIT MAPS
SALVAGE? yes

1188 files, 90007 used, 4987072 free (360 frags, 623339 blocks, 0.0% fragmentation)

***** FILE SYSTEM IS CLEAN *****

***** FILE SYSTEM WAS MODIFIED *****
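
The long runs of DUP lines in output like the above are easier to read once they are counted per inode. The following is only a minimal sketch, not part of fsck or of the original report: count_dups is a hypothetical helper name, and it assumes the fsck output has been saved to a file (fsck.log is likewise a made-up name). It matches only lines of the form "<block> DUP I=<inode>", so messages such as EXCESSIVE DUP BLKS are ignored.

```shell
#!/bin/sh
# Hypothetical helper: summarize fsck "DUP" lines per inode.
# Reads fsck output on stdin and prints "<inode> <number of DUP lines>",
# sorted by inode number.
count_dups() {
    awk '$2 == "DUP" && $3 ~ /^I=[0-9]+$/ {
             sub(/^I=/, "", $3)   # strip the "I=" prefix, leaving the inode number
             n[$3]++              # one more DUP line seen for this inode
         }
         END { for (i in n) print i, n[i] }' | sort -n
}

# Usage (fsck.log is an assumed file holding saved fsck output):
#   count_dups < fsck.log
```

Note that if the whole log is fed in, the Phase 1 and Phase 1b passes are both counted, so a block reported in both phases contributes twice. To look at one phase only, cut the log at the "** Phase" markers first.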


