Date: Thu, 15 Feb 2007 15:46:08 +0200
From: Kostik Belousov
To: Tomas Olsson
Message-ID: <20070215134608.GG39168@deviant.kiev.zoral.com.ua>
References: <20070214162938.GA96725@keira.kiwi-computer.com>
 <20070214173211.L1054@chrishome.localnet>
 <20070214170808.GC96725@keira.kiwi-computer.com>
 <20070215044707.GA39168@deviant.kiev.zoral.com.ua>
 <20070215104537.GC39168@deviant.kiev.zoral.com.ua>
 <20070215120855.GE39168@deviant.kiev.zoral.com.ua>
Cc: freebsd-fs@freebsd.org, arla-drinkers@stacken.kth.se
Subject: Re: Arla on FreeBSD

On Thu, Feb 15, 2007 at 02:07:19PM +0100, Tomas Olsson wrote:
> Kostik Belousov writes:
> > On Thu, Feb 15, 2007 at 12:59:04PM +0100, Tomas Olsson wrote:
> > > Kostik Belousov writes:
> > > > I took a really quick look at the places you mentioned. I have a
> > > > comment on open_file(). For FreeBSD >= 6.x, the right way to open a
> > > > vnode from kernel code is to use the vn_open() (and then vn_close())
> > > > API.
> > > >
> > > Great! Sounds reasonable.
> > >
> > > We currently open the cache files from nnpfs' VOPs, are there any risks
> > > (deadlock?) involved if one passes an absolute path to vn_open() in such a
> > > context? I'd have liked to use arlad's thread for this, but vput()
> > There, you already have the nnpfs vnode locked. The right lock order for
> > vnodes is from the root down the tree. As such, you may end up with
> > reversals that would result in deadlocks, IMHO.
> >
> Ok, vn_open() must be passed curthread so we can't use arlad's thread when
> in our VOP. And we cannot use an absolute path to the cache. So vn_open()
> can't be used?

vn_open() does not strictly need curthread, but td is the thread on whose
behalf all locks and sleeps will be performed. What I said excludes neither
the use of vn_open() nor paths, but the right locking order must be ensured
to prevent deadlocks. I think that your current strategy would work, but it
needs to be checked.

>
> > > explicitly uses curthread deep down in namei. Also, users are not normally
> > > allowed to access the cache files directly so some OSes complain on such a
> > > lookup with user creds; would that be a problem here?
> > How is user access to the cache disabled? And what is the cache itself?
> > A local filesystem (UFS) that stores your blocks in regular files?
> >
> It's just a local dir tree, on UFS or whatever. Currently each node gets
> its own subdir, and each "block" (128kB perhaps) a plain file in that dir.
> The cache is supposed to be readable only by arlad (usually root) and
> through nnpfs; that's good when handling fancy ACLs etc, so the cache root
> is chmod:ed to 0700 for root:wheel.
>
> > > Of course, we wouldn't have to worry about such things if we just kept the
> > > vnode handy for each cache block file. Maybe it's a price worth paying.
> >
> > Then you need to take some care of the cached vnode lifecycle (e.g., even
> > keeping the vnode vref'ed would not prevent it from being recycled, so you
> > may end up with a dead vnode).
> >
> Eep. Tricky.
>
> So where does this leave us, is plain lookup() (or VOP_LOOKUP) and
> VOP_CREATE on the cache a possible way to go? Seems to work on OpenBSD.

vn_open() is the right way. Otherwise, you would in effect reimplement its
code. Also, you could look at the file handle API, which would save you the
path lookups after the vnode has been looked up the first time (look around
for the vfs_vptofh and vfs_fhtovp ops). This API is used by the NFS server,
so it should work :). Instead of caching the vnode, you would save the file
handle for future accesses.

>
> > Also, as Robert pointed out in his email, you probably need to decide about
> > MP-safeness of nnpfs.
> >
> Well, it's marked "doesn't need giant"; we use our own global lock. I don't
> trust it 100%, but it seems to work so far. I haven't tried it on any MP
> FreeBSD boxes though.

Again, this could lead to lock order reversals with the vnode lock. For
instance, when lookup() traverses the tree, it has the parent directory
locked and calls the fs-provided VOP_LOOKUP(). This method must return the
leaf vnode locked. Since, according to your claim, an Arla-specific global
lock is taken between these two vnode locks, there could be a LOR between
the vnode locks and the Arla lock (imagine a ".." lookup in one thread and
a direct lookup in another).
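
The lock order concern in the last paragraph can be illustrated in
userland. What follows is only a sketch under stated assumptions, not nnpfs
or FreeBSD kernel code: struct ranked_lock, ranked_acquire(),
ranked_release() and the rank names are all hypothetical, and plain pthread
mutexes stand in for the vnode locks and the Arla global lock. Each thread
records the ranks it currently holds and refuses any acquisition that would
go against the agreed order, so a reversal shows up as an error instead of
a deadlock.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Hypothetical ranks (not from any real kernel): a thread may only
 * take locks in strictly increasing rank.
 */
enum {
    RANK_PARENT_DIR = 1,   /* stands in for the parent directory vnode lock */
    RANK_FS_GLOBAL  = 2,   /* stands in for the filesystem's global lock */
    RANK_LEAF       = 3,   /* stands in for the leaf vnode lock */
    MAX_HELD        = 8
};

struct ranked_lock {
    pthread_mutex_t mtx;
    int rank;
};

/* Per-thread stack of the ranks currently held, in acquisition order. */
static _Thread_local int held[MAX_HELD];
static _Thread_local int nheld;

/* Returns 0 on success, -1 if taking the lock would reverse the order. */
static int ranked_acquire(struct ranked_lock *l)
{
    if (nheld == MAX_HELD)
        return -1;                      /* bookkeeping stack is full */
    if (nheld > 0 && held[nheld - 1] >= l->rank)
        return -1;                      /* would be a lock order reversal */
    pthread_mutex_lock(&l->mtx);
    held[nheld++] = l->rank;
    return 0;
}

/* Locks must be released in the reverse order of acquisition. */
static void ranked_release(struct ranked_lock *l)
{
    assert(nheld > 0 && held[nheld - 1] == l->rank);
    nheld--;
    pthread_mutex_unlock(&l->mtx);
}
```

With the ranks above, taking the parent directory, then the global lock,
then the leaf succeeds. A thread that already holds the global lock and
then asks for the parent directory gets -1, which corresponds roughly to
the ".." lookup case described above.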