FreeBSD Mail Archives

Date:      Mon, 12 Nov 2018 07:30:26 +0000
From:      bugzilla-noreply@freebsd.org
To:        fs@FreeBSD.org
Subject:   [Bug 195485] [ufs] mksnap_ffs(8) cannot create snapshot with journaled soft updates enabled
Message-ID:  <bug-195485-3630-vgUtT2LyY7@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-195485-3630@https.bugs.freebsd.org/bugzilla/>
References:  <bug-195485-3630@https.bugs.freebsd.org/bugzilla/>


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195485

Kirk McKusick <mckusick@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |Open
                 CC|                            |mckusick@FreeBSD.org

--- Comment #2 from Kirk McKusick <mckusick@FreeBSD.org> ---
(In reply to t_uemura from comment #1)
Short answer: snapshots work while SU+J is running. The problem arises because
the fsck code that does the journal recovery does not know how to repair
snapshots. Thus after a crash recovery all the snapshots that were on the
filesystem are possibly corrupted and will cause a panic if used.

Long answer: when files are deleted, the blocks are normally returned to the
list of free blocks so that they can be allocated to new files.  When a
filesystem contains snapshots, each freed block is first offered to each of
the snapshots so that they can claim it if it is part of one of the files in
the snapshot. By claiming a block they prevent it from being put on the list of
free blocks and thus its contents will be preserved for the snapshotted file.
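
The claim step described above can be sketched as a toy model. This is only
an illustration in Python, not the kernel code: the Snapshot class, the
free_block() function and the block numbers are all hypothetical, and the
model simply stops at the first snapshot that claims a block, which glosses
over how several snapshots are really handled.

```python
# Toy model only: every name here is hypothetical, not a kernel identifier.

class Snapshot:
    def __init__(self, name, tracked_blocks):
        self.name = name
        self.tracked = set(tracked_blocks)  # blocks in use when the snapshot was taken
        self.claimed = set()                # blocks the snapshot has taken over

    def offer(self, block):
        """Claim the block if it belongs to a file in this snapshot."""
        if block in self.tracked and block not in self.claimed:
            self.claimed.add(block)
            return True
        return False


def free_block(block, snapshots, free_list):
    """Offer a freed block to the snapshots; free it only if none claims it.

    Simplification: the loop stops at the first claimant.
    """
    for snap in snapshots:
        if snap.offer(block):
            return snap.name
    free_list.add(block)
    return None


snapshots = [Snapshot("snap1", {10, 11, 12})]
free_list = set()
claimed_by = free_block(11, snapshots, free_list)  # block of a snapshotted file
unclaimed = free_block(42, snapshots, free_list)   # block allocated after the snapshot
```

In this model block 11 never reaches the free list because the snapshot
claims it, while block 42 is freed as usual.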

The journal recovery code has never had the logic added to it to do these
checks. Hence, when it frees blocks, it does not check the snapshots to see if
they want to claim these blocks. Thus blocks that should be claimed by the
snapshots are instead put on the list of free blocks and will eventually be
reused. If one of these blocks is part of the metadata of a file in a snapshot
(such as a block of indirect pointers) and that block gets overwritten with
other data, then attempts to access that file in the snapshot will cause a data
inconsistency leading to a kernel panic.
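
The failure mode can be reproduced in a small simulation. This is a sketch
under stated assumptions, not the fsck or kernel logic: kernel_free(),
recovery_free(), the dictionary standing in for the disk and block number 7
are all made up for illustration. The only difference between the two paths
is whether the snapshot is consulted before a block goes on the free list.

```python
# Toy model only; hypothetical names, not the fsck or kernel code.

def kernel_free(block, snapshot_blocks, free_list):
    """Normal path: a block needed by a snapshot stays off the free list."""
    if block not in snapshot_blocks:
        free_list.append(block)


def recovery_free(block, snapshot_blocks, free_list):
    """Journal-recovery path as described above: no snapshot check at all."""
    free_list.append(block)


def snapshot_intact(disk, snapshot_blocks):
    """True if every block the snapshot needs still holds its old contents."""
    return all(disk[b] == old for b, old in snapshot_blocks.items())


def simulate(free_fn):
    disk = {7: "indirect pointers"}             # block 7 holds snapshot metadata
    snapshot_blocks = {7: "indirect pointers"}  # what the snapshot expects to find
    free_list = []
    free_fn(7, snapshot_blocks, free_list)      # the live file owning block 7 is deleted
    if free_list:                               # a later allocation reuses a free block
        disk[free_list.pop()] = "unrelated data"
    return snapshot_intact(disk, snapshot_blocks)


normal_ok = simulate(kernel_free)       # snapshot keeps its block
recovered_ok = simulate(recovery_free)  # block is reused and overwritten
```

With the normal path the snapshot stays intact. With the recovery path
block 7 is handed out again and overwritten, which corresponds to the
corrupted indirect block in the description above.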

The correct solution is to extract the code from the kernel that handles
freeing of blocks and add it to the journal recovery code in fsck. This is a
lot of complicated code and would take a lot of effort to do. As ZFS provides
cheap snapshots, that is the filesystem of preference for folks that want
snapshot functionality. The only remaining use for snapshots in UFS is the
ability to do live dumps.  Thus I have not been motivated to go to the effort
to migrate the kernel code to fsck (and nobody has offered to pay me the $25K
to have me do it).

An easier solution would be to simply delete all the snapshots as part of doing
the filesystem recovery. The problem is that while there is a list of all the
inode numbers for the active snapshots in the superblock, we do not know the
pathnames for all of these snapshots, so we would have to do a complete
traversal of the filesystem to find them, which would largely negate the speed
benefit of journaling.

Another easy solution would be to truncate all the snapshots to zero length and
stop offering them as snapshots. This would be much quicker as we have the list
of inode numbers that need to be truncated and all we would be left to clean up
would be a list of zero-length files which could be handled by a find after the
system is up and running.
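
The post-boot cleanup pass could look roughly like the following sketch. It
rests on an assumption that is not part of the proposal above: that the
recovery step reports the inode numbers it truncated, so that former
snapshots can be told apart from other empty files. The function name and
its parameters are hypothetical.

```python
import os
import stat


def find_truncated_snapshots(root, snapshot_inodes):
    """Return paths under root that are zero-length regular files whose
    inode number is in snapshot_inodes (assumed to come from the recovery)."""
    wanted = set(snapshot_inodes)
    root_dev = os.lstat(root).st_dev
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Stay on one filesystem: inode numbers are only unique per filesystem.
        dirnames[:] = [d for d in dirnames
                       if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev]
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if (stat.S_ISREG(st.st_mode) and st.st_size == 0
                    and st.st_ino in wanted):
                matches.append(path)
    return sorted(matches)
```

The function only reports candidates. Removing them is left to the
administrator, since the walk happens once the system is already up and is
not on the recovery path.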

I am happy to review changes if someone wants to implement this solution (or
the more difficult correct solution noted above).

-- 
You are receiving this mail because:
You are the assignee for the bug.

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-195485-3630-vgUtT2LyY7>
