From owner-freebsd-fs Mon Feb 12 12:35: 5 2001
Delivered-To: freebsd-fs@freebsd.org
Received: from bingnet2.cc.binghamton.edu (bingnet2.cc.binghamton.edu [128.226.1.18])
	by hub.freebsd.org (Postfix) with ESMTP id EA57D37B491
	for ; Mon, 12 Feb 2001 12:35:01 -0800 (PST)
Received: from opal (cs.binghamton.edu [128.226.123.101])
	by bingnet2.cc.binghamton.edu (8.11.2/8.11.2) with ESMTP id f1CKYsG11922;
	Mon, 12 Feb 2001 15:34:54 -0500 (EST)
Date: Mon, 12 Feb 2001 15:34:54 -0500 (EST)
From: Zhiui Zhang
X-Sender: zzhang@opal
To: Russell Cattelan
Cc: freebsd-fs@FreeBSD.ORG
Subject: Re: Design a journalled file system
In-Reply-To: <3A883B74.F1CAFAFE@thebarn.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Mon, 12 Feb 2001, Russell Cattelan wrote:

> > Another difficulty is that if several transactions are in progress at the
> > same time, we must remember which metadata buffers are modified by which
> > transactions. When we copy/rename the buffer, we must inform those
> > transactions of the fact that we did the copy/rename. The buffers modified
> > by one transaction must be flushed at the same time.

Thanks for your reply. What I mean is that if a transaction locks all the
metadata it modifies (e.g., bitmap blocks) until it commits, then there is no
problem, although this reduces concurrency. Otherwise, the same metadata
block can contain modifications made by more than one transaction. I do not
know how XFS solves this problem. Since XFS uses B+ trees, I guess that
locking can easily be done hierarchically to avoid deadlock. But in FFS, the
bitmap blocks have no relationship to each other, so locking them in
arbitrary order can cause deadlock, I guess.

IBM JFS seems to use an in-core log implemented as a page cache. XFS has
pagebuf. I expect that is something similar to IBM's page cache.

> Hmm I'm not sure what the problem is here.
> A transaction log entry will log all changes necessary to complete
> that transaction, even if it involves multiple meta data objects, which it
> almost always does.
> In the event of a crash and subsequent replay of the log: the recovery code
> will make sure all the meta data on the disk is consistent with the log.
> If one meta data write happened but the other one didn't, the recovery
> code only updates the one that didn't complete.
>
> What is the size of the disk block container on bsd buf_t's ?
> if they are 64bit we shouldn't have a problem... simply use absolute disk
> addressing for meta data items.
> Why would we need to copy a meta data buf_t?
>

In sys/buf.h of FreeBSD, there is:

	daddr_t b_lblkno;	/* Logical block number. */
	daddr_t b_blkno;	/* Underlying physical block number. */

Both are 32-bit integers. I am not sure why they are not 64-bit. Maybe it has
something to do with the merged buffer cache.

-Zhihui
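
A rough sketch of what the 32-bit block numbers above would imply for the
largest addressable device. It assumes that daddr_t is a signed 32-bit
integer and that b_blkno counts 512-byte sectors (DEV_BSIZE); neither
assumption is stated in the excerpt above. The names SKETCH_DEV_BSIZE and
sketch_addressable_bytes() are made up for illustration and are not part of
sys/buf.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: physical block numbers count 512-byte sectors. */
#define SKETCH_DEV_BSIZE 512u

/*
 * Bytes addressable when block numbers are the non-negative values of a
 * signed integer that is `bits` wide (32 for the fields quoted above).
 * Hypothetical helper for illustration only.
 */
uint64_t
sketch_addressable_bytes(unsigned bits, uint64_t block_size)
{
	uint64_t nblocks;

	assert(bits >= 2 && bits <= 32);
	assert(block_size > 0);

	/* A signed field of `bits` bits holds block numbers 0 .. 2^(bits-1) - 1. */
	nblocks = (uint64_t)1 << (bits - 1);
	return nblocks * block_size;
}
```

Under those assumptions, sketch_addressable_bytes(32, SKETCH_DEV_BSIZE)
comes to 2^40 bytes (1 TB), which would be the ceiling for absolute disk
addressing of metadata with the current field width.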