From owner-freebsd-current@FreeBSD.ORG Tue May 13 12:42:49 2014 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 75396CD8 for ; Tue, 13 May 2014 12:42:49 +0000 (UTC) Received: from smtp.mei.co.jp (smtp.mei.co.jp [133.183.100.20]) by mx1.freebsd.org (Postfix) with ESMTP id 2D6EB2286 for ; Tue, 13 May 2014 12:42:48 +0000 (UTC) Received: from mail-gw.jp.panasonic.com ([157.8.1.157]) by smtp.mei.co.jp (8.12.11.20060614/3.7W/kc-maile12) with ESMTP id s4DCgeaG017721 for ; Tue, 13 May 2014 21:42:40 +0900 (JST) Received: from epochmail.jp.panasonic.com ([157.8.1.130]) by mail.jp.panasonic.com (8.11.6p2/3.7W/kc-maili14) with ESMTP id s4DCgfL09498 for ; Tue, 13 May 2014 21:42:41 +0900 Received: by epochmail.jp.panasonic.com (8.12.11.20060308/3.7W/lomi14) id s4DCgfw0007723 for freebsd-current@freebsd.org; Tue, 13 May 2014 21:42:41 +0900 From: takehara.mikihito@jp.panasonic.com Received: from jp.panasonic.com by lomi14.jp.panasonic.com (8.12.11.20060308/3.7W) with SMTP id s4DCgdaR007707 for ; Tue, 13 May 2014 21:42:39 +0900 To: freebsd-current@freebsd.org Subject: fsck bug in replaying partial frag truncate journal on UFS SU+J? Date: Tue, 13 May 2014 21:42:19 +0900 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Mailer: HidemaruMail 6.22 (WinNT,601) Message-Id: <39CF6EA8C75D61takehara.mikihito@jp.panasonic.com> X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 May 2014 12:42:49 -0000 Hello, I think fsck_ffs has a bug in replaying partial frag truncate journal on UFS SU+J. Bellow I tested about the issue. I tested blocksize==4KB, fragsize=512byte UFS SU+J. But I think these parameters are not related to this issue. 1) Preparing 4096byte(1block) test file. 2) Use "truncate" to shorten filesize to 3584byte(7frags). 3) Shutdown without unmount after journal is written and before inode size ufs2_dinode is written. 4) Run fsck_ffs with journal. 5) Mount again and remove test file. Then I face panic. "panic: ffs_blkfree_cg: freeing free block" This seems to be caused by fsck_ffs to replay JOP_FREEBLK whose "blkno" is not block-aligned. The above case 2), JOP_FREEBLK journal is like this: FREEBLK ino=5, blkno=1727, lbn=0, frags=1, oldfrags=0 <---(a) fsck_ffs handles JOP_NEWBLK almost same as JOP_FREEBLK. But my understanding is that in case of JOP_NEWBLK kernel always create block-aligned blkno. So this issue is only in case of JOP_FREEBLK. To analyze this issue, I also tested that using the above test file 3584byte(7frags) and write 512byte with append to largen to 4096byte(1block). In this case JOP_NEWBLK journal is like this: JOP_NEWBLK ino=5, blkno=1720, lbn=0, frags=8, oldfrags=7 <---(b) I think the bug is in fsck_ffs suj.c's arround blk_check(). My understanding is that blk_check() cannot handle non-block-aligned blkno so there is a little trick in blk_build() bellow. ---------------------------------------------------------------------------------------- /* * Rewrite the record using oldfrags to indicate the offset into * the block. Leave jb_frags as the actual allocated count. */ blkrec->jb_blkno -= frag; blkrec->jb_oldfrags = frag; ---------------------------------------------------------------------------------------- By this trick, (a) is modified like: JOP_FREEBLK ino=5, blkno=1720, lbn=0, frags=1, oldfrags=7 <---(a') and (b) is modified like: JOP_NEWBLK ino=5, blkno=1720, lbn=0, frags=8, oldfrags=0 <---(b') But blk_check() cannot handle the case "oldfrags!=0". If "oldfrags!=0", blk_check()'s "isat" comes to be 0 even though the blk is present. So (a) should be modified same as (b') The following is my patch to fix like this. According to my test, this patch works fine. ======================================================================= diff --git a/sbin/fsck_ffs/suj.c b/sbin/fsck_ffs/suj.c index e21ffc6..2512522 100644 --- a/sbin/fsck_ffs/suj.c +++ b/sbin/fsck_ffs/suj.c @@ -2503,11 +2503,11 @@ blk_build(struct jblkrec *blkrec) frag = fragnum(fs, blkrec->jb_blkno); sblk = blk_lookup(blk, 1); /* - * Rewrite the record using oldfrags to indicate the offset into - * the block. Leave jb_frags as the actual allocated count. + * Rewrite the record to indicate the offset into the block. */ blkrec->jb_blkno -= frag; - blkrec->jb_oldfrags = frag; + blkrec->jb_frags += frag; + blkrec->jb_oldfrags = 0; if (blkrec->jb_oldfrags + blkrec->jb_frags > fs->fs_frag) err_suj("Invalid fragment count %d oldfrags %d\n", blkrec->jb_frags, frag); ======================================================================= I think we can fix this by changing kernel code not to create non-block-aligned JOP_FREEBLK. But I also think it is better to change fsck by considering compatibility.