Date: Fri, 09 Sep 2011 17:31:49 +0700
From: Eugene Grosbein <egrosbein@rdtc.ru>
To: FreeBSD Stable <freebsd-stable@freebsd.org>
Cc: pjd@freebsd.org
Subject: Re: gmirror+gjournal often makes inconsistens file systems
Message-ID: <4E69EB15.50808@rdtc.ru>
In-Reply-To: <4E69A152.6090408@rdtc.ru>
References: <4E69A152.6090408@rdtc.ru>
Dear Pawel Jakub,

09.09.2011 12:17, Eugene Grosbein writes:
> Hi!
>
> For a long time I have been experiencing the same UFS2 filesystem problems with several 8.2 systems
> running on gmirror+gjournal+async. In case of unclean shutdown, kernel panic or power failure
> gjournal makes fsck skip its checks and that's why I use it.
>
> But quite often my /var partition (and sometimes others) still has severe damage in it,
> and running with such a /var mounted read-write leads to further panics or hangs and so on.
>
> For example, I have such an 8.2-STABLE system with ad4 and ad6 drives combined into /dev/mirror/gm0.
> I have just removed ad6 from the mirror, ran fsck -y manually for all its filesystems,
> shut down this machine again cleanly and booted it next time from ad6
> while keeping the mirror with ad4 neither mounted nor checked.
>
> Then, I ran fsck -y /dev/mirror/gm0.journals1e (/var on the mirrored drive)
> and got LOTS of bad errors on a presumably clean file system.
> Of course, I've seen the same errors while checking ad6 after it was removed from the running mirror.
> I have the auto-sync gmirror feature turned ON. I've tried to turn it OFF but that just
> increases the frequency of such damage not fixed after reboot.
>
> It seems that gjournal cannot handle system crashes reliably, can it?
> I basically run it without any manual tuning. I've also tried to tune it, without luck:
> it works nicely when there are no unclean shutdowns, but it's here to deal with them in the first place.
>
> # fsck -t ffs -y /dev/mirror/gm0.journals1e
> ** /dev/mirror/gm0.journals1e
> ** Last Mounted on /var
> ** Phase 1 - Check Blocks and Sizes
> 3955872 DUP I=989242
> 3955873 DUP I=989242
> 3955874 DUP I=989242
> 3955875 DUP I=989242
> 3955876 DUP I=989242
> 3955877 DUP I=989242
> 3955878 DUP I=989242
> 3955879 DUP I=989242
> 3955880 DUP I=989242
> 3955881 DUP I=989242
> 3955882 DUP I=989242
> EXCESSIVE DUP BLKS I=989242
> CONTINUE? yes
>
> INCORRECT BLOCK COUNT I=989242 (448 should be 424)
> CORRECT? yes
>
> 3955888 DUP I=989289
> 3955889 DUP I=989289
> 3955890 DUP I=989289
> 3955891 DUP I=989289
> 3955892 DUP I=989289
> 3955893 DUP I=989289
> 3955894 DUP I=989289
> 3955895 DUP I=989289
> ** Phase 1b - Rescan For More DUPS
> 3955872 DUP I=989242
> 3955873 DUP I=989242
> 3955874 DUP I=989242
> 3955875 DUP I=989242
> 3955876 DUP I=989242
> 3955877 DUP I=989242
> 3955878 DUP I=989242
> 3955879 DUP I=989242
> 3955880 DUP I=989242
> 3955881 DUP I=989242
> 3955888 DUP I=989242
> 3955889 DUP I=989242
> 3955890 DUP I=989242
> 3955891 DUP I=989242
> 3955892 DUP I=989242
> 3955893 DUP I=989242
> 3955894 DUP I=989242
> 3955895 DUP I=989242
> ** Phase 2 - Check Pathnames
> DUP/BAD I=989289 OWNER=root MODE=100640
> SIZE=14367 MTIME=Sep 9 11:30 2011
> FILE=/log/kernel.log
>
> REMOVE? yes
>
> DUP/BAD I=989242 OWNER=root MODE=100640
> SIZE=202631 MTIME=Sep 8 19:52 2011
> FILE=/log/mpd.log.0
>
> REMOVE? yes
>
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> UNREF FILE I=376866 OWNER=root MODE=140666
> SIZE=0 MTIME=Sep 5 12:27 2011
> CLEAR? yes
>
> UNREF FILE I=376868 OWNER=root MODE=140666
>
> UNREF FILE I=376868 OWNER=root MODE=140666
> SIZE=0 MTIME=Sep 7 20:30 2011
> CLEAR? yes
>
> UNREF FILE I=376869 OWNER=root MODE=140666
> SIZE=0 MTIME=Sep 8 11:17 2011
> CLEAR? yes
>
> UNREF FILE I=376870 OWNER=root MODE=140666
> SIZE=0 MTIME=Sep 8 12:11 2011
> CLEAR? yes
>
> BAD/DUP FILE I=989242 OWNER=root MODE=100640
> SIZE=202631 MTIME=Sep 8 19:52 2011
> CLEAR? yes
>
> UNREF FILE I=989259 OWNER=root MODE=100640
> SIZE=648 MTIME=Aug 27 00:00 2011
> RECONNECT? yes
>
> BAD/DUP FILE I=989289 OWNER=root MODE=100640
> SIZE=14367 MTIME=Sep 9 11:30 2011
> CLEAR? yes
> LINK COUNT FILE I=989293 OWNER=root MODE=100640
> SIZE=961 MTIME=Sep 9 11:26 2011 COUNT 1 SHOULD BE 2
> ADJUST? yes
>
> UNREF FILE I=989327 OWNER=root MODE=100640
> SIZE=114 MTIME=Aug 27 00:00 2011
> RECONNECT? yes
>
> ** Phase 5 - Check Cyl groups
> FREE BLK COUNT(S) WRONG IN SUPERBLK
> SALVAGE? yes
>
> SUMMARY INFORMATION BAD
> SALVAGE? yes
>
> BLK(S) MISSING IN BIT MAPS
> SALVAGE? yes
>
> 1188 files, 90007 used, 4987072 free (360 frags, 623339 blocks, 0.0% fragmentation)
>
> ***** FILE SYSTEM IS CLEAN *****
>
> ***** FILE SYSTEM WAS MODIFIED *****

Please explain whether the following partitioning is supported:

physical drive - geom_mirror - geom_journal - geom_part_mbr - geom_part_bsd - journalled UFS2

If it is not, mounting such a UFS2 file system should warn us, shouldn't it?
There are no warnings now.

Eugene Grosbein