From owner-svn-src-head@FreeBSD.ORG Tue Nov 20 14:13:59 2012 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CA17D91B; Tue, 20 Nov 2012 14:13:59 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from mail-la0-f54.google.com (mail-la0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id C11C58FC0C; Tue, 20 Nov 2012 14:13:58 +0000 (UTC) Received: by mail-la0-f54.google.com with SMTP id j13so5704737lah.13 for ; Tue, 20 Nov 2012 06:13:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=zxCP/2yyXugqqoTdutzNEQJOQkrYK8xrIGoQC7eDmdA=; b=XWClroWTFlW/ppCi3ndeRYTwagqtz/VBm/XS8Rhp/jhoWiF+D6tI762PMu3C0uq9/D 68CIw5w6TySEkOQSgKJvq9OKohD3snM4XsqOq6D1v+UIjld1XWNylwR2B1zajefy9Uqi JDahzk3NkLkqjci5HxWE3V6J3pEhsy8TyxzeAYG3Adgrrt1FHAfLKHoASwnR6Ne6fZD7 GocpnxFY2SYYME0RIWnMViEANkQrWTHV4Ge/vXLvdGwsrY36/BqX55U2nr5hW3HSsM9L wwX5TPEoi9x6u0H+j9qDbaTngTTI7hgjqB5PluK0w3i9cESWpP0yoFCUpfhM2hZ3FXH1 8U/Q== MIME-Version: 1.0 Received: by 10.112.28.98 with SMTP id a2mr6350913lbh.110.1353420837470; Tue, 20 Nov 2012 06:13:57 -0800 (PST) Sender: asmrookie@gmail.com Received: by 10.112.134.5 with HTTP; Tue, 20 Nov 2012 06:13:57 -0800 (PST) In-Reply-To: <20121120141137.GX73505@kib.kiev.ua> References: <201211192243.qAJMhjFF055708@svn.freebsd.org> <20121120162708.I924@besplex.bde.org> <20121120230708.G6016@besplex.bde.org> <20121120234822.J6178@besplex.bde.org> <20121120141137.GX73505@kib.kiev.ua> Date: Tue, 20 Nov 2012 14:13:57 +0000 X-Google-Sender-Auth: RJFb9UFvLiKm-8PzwETwvzOKf_k Message-ID: Subject: Re: svn commit: r243311 - in head/sys: fs/ext2fs fs/msdosfs fs/nfsclient fs/nullfs fs/unionfs gnu/fs/reiserfs nfsclient ufs/ffs From: Attilio Rao To: Konstantin Belousov Content-Type: text/plain; charset=UTF-8 Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Bruce Evans X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: attilio@FreeBSD.org List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Nov 2012 14:13:59 -0000 On 11/20/12, Konstantin Belousov wrote: > On Tue, Nov 20, 2012 at 01:27:07PM +0000, Attilio Rao wrote: >> On 11/20/12, Bruce Evans wrote: >> > On Tue, 20 Nov 2012, Attilio Rao wrote: >> > >> >> On 11/20/12, Attilio Rao wrote: >> >>> On 11/20/12, Bruce Evans wrote: >> >>>> On Tue, 20 Nov 2012, Attilio Rao wrote: >> >>>> >> >>>>> On Tue, Nov 20, 2012 at 6:20 AM, Bruce Evans >> >>>>> wrote: >> >>>>>> On Mon, 19 Nov 2012, Attilio Rao wrote: >> >>>>>> >> >>>>>>> Log: >> >>>>>>> r16312 is not any longer real since many years (likely since >> >>>>>>> when >> >>>>>>> VFS >> >>>>>>> received granular locking) but the comment present in UFS has >> >>>>>>> been >> >>>>>>> copied all over other filesystems code incorrectly for several >> >>>>>>> times. >> >>>>>>> >> >>>>>>> Removes comments that makes no sense now. >> >>>>>> >> >>>>>> >> >>>>>> It still made sense (except for bitrot in the function name), but >> >>>>>> might >> >>>>>> not >> >>>>>> be true). The code made sense with it. Now the code makes no >> >>>>>> sense. >> >>>>>> >> >>>>>> >> >>>>>>> Modified: head/sys/ufs/ffs/ffs_vfsops.c >> >>>>>>> >> >>>>>>> ============================================================================== >> >>>>>>> --- head/sys/ufs/ffs/ffs_vfsops.c Mon Nov 19 21:58:14 2012 >> >>>>>>> (r243310) >> >>>>>>> +++ head/sys/ufs/ffs/ffs_vfsops.c Mon Nov 19 22:43:45 2012 >> >>>>>>> (r243311) >> >>>>>>> @@ -1676,14 +1676,6 @@ ffs_vgetf(mp, ino, flags, vpp, ffs_flags >> >>>>>>> ump = VFSTOUFS(mp); >> >>>>>>> dev = ump->um_dev; >> >>>>>>> fs = ump->um_fs; >> >>>>>>> - >> >>>>>>> - /* >> >>>>>>> - * If this malloc() is performed after the getnewvnode() >> >>>>>> >> >>>>>> >> >>>>>> This malloc() didn't match the code, which uses uma_zalloc(). Old >> >>>>>> versions used MALLOC() in both the comment and the code. ffs's >> >>>>>> comment >> >>>>>> was updated to say malloc() when the code was changed to use >> >>>>>> malloc(), >> >>>>>> then rotted when the code was changed to use uma_zalloc(). In >> >>>>>> some >> >>>>>> other file systems, the comment still said MALLOC(). >> >>>>>> >> >>>>>> >> >>>>>>> - * it might block, leaving a vnode with a NULL v_data to >> >>>>>>> be >> >>>>>>> - * found by ffs_sync() if a sync happens to fire right >> >>>>>>> then, >> >>>>>>> - * which will cause a panic because ffs_sync() blindly >> >>>>>>> - * dereferences vp->v_data (as well it should). >> >>>>>>> - */ >> >>>>>>> ip = uma_zalloc(uma_inode, M_WAITOK | M_ZERO); >> >>>>>>> >> >>>>>>> /* Allocate a new vnode/inode. */ >> >>>>>>> >> >>>>>> >> >>>>>> The code makes no sense now. The comment explains why ip is >> >>>>>> allocated >> >>>>>> before vp, instead of in the natural, opposite order like it used >> >>>>>> to >> >>>>>> be. Allocating things in an unnatural order requires extra code >> >>>>>> to >> >>>>>> free ip when the allocation of vp fails. >> >>>>> >> >>>>> "Used to be" is very arguably. The code has been like its current >> >>>>> form >> >>>>> many more years than the opposite (16 against 3 I think). >> >>>>> And the code makes perfectly sense if you know the history. So I >> >>>>> don't >> >>>>> agree with you. >> >>>> >> >>>> But it shouldn't be necessary to know the history of the code to >> >>>> understand it. The code only makes sense if its comment is not >> >>>> removed, >> >>>> or if you know the history of the code so that you can restore the >> >>>> removed comment. However, if the comment makes no sense as you >> >>>> claim, >> >>>> then the code that it it describes makes no sense. >> >>> >> >>> The "code that makes no sense" is basically the justification to have >> >>> the allocation before the getnewvnode(). It makes no sense because >> >>> the >> >>> order makes no sense (you can allocate before or after getnewvnode(), >> >>> you won't have v_data corruption as the comment claims). >> >>> >> >>> Hence the code makes no sense. >> >> >> >> Herm, s/code/comment. >> > >> > I think you are right that the comment makes no sense. A preemptible >> > kernel may be preempted without it calling malloc() where it may block. >> > Thus ffs_fsync() and anything else that looks at the inode must be >> > doing something to avoid dereferencing v_data if the vnode is not fully >> > constructed. This seems to be done by iterating over vnodes using >> > MNT_VNODE_FOREACH_ACTIVE*() and not making incomplete vnodes active. >> > ffs_fsync() still just blindly dereferences the inode. >> > >> > I think I am right that the code makes no sense. It is ordered like >> > it is because placing the allocation of ip before the allocation of >> > vp used to be enough to prevent v_data being dereferenced. This makes >> > no sense when it isn't enough. >> >> In the past, before VFS got locking and kernel was single-threaded, >> the comment and code arranged in this way were sensitive and >> effective. >> As now this is not true anymore, there is no strict relationship >> between the getnewvnode() and sleeping points. >> It is important to remove stale comments because they confuse people, >> the porters and as you can see the code/comment has been cut&paste >> quite a bit around. >> > > Putting the issue of the comment making no sense at all aside, which is > true but tangential to what I want to note. > > Although malloc(9) is not ordered with the vnode locks in the lock > order, it is still good practice to not perform allocations under the > vnode lock. The reason is that pagedaemon might need to clean and write > pages, in particular, belonging to the vnodes. Generally, if you own > vnode lock and do something which might kick pagedaemon, you are creating > troubles for pagedaemon, which would attempt to lock the vnode with > timeout, spending the precious time waiting for lock it cannot obtain. > > This is less an issue for the new vnode instantiation locations, because > vnode cannot have resident pages yet. But it is good to follow the same > pattern anywhere. Yes, I completely agree. Infact the code is not changed also to avoid pagedaemon hosing. However this is a different situation respect a "if you allocate while you hold the lock you will get a deadlock/race/corruption". Attilio -- Peace can only be achieved by understanding - A. Einstein