From owner-freebsd-hackers@freebsd.org Sat Jul 4 13:30:42 2015 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B12D492A8 for ; Sat, 4 Jul 2015 13:30:42 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wg0-x22e.google.com (mail-wg0-x22e.google.com [IPv6:2a00:1450:400c:c00::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 43764119A; Sat, 4 Jul 2015 13:30:42 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by wgjx7 with SMTP id x7so106557859wgj.2; Sat, 04 Jul 2015 06:30:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-type:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=yqAEEaKro4VLiLf6++pIYNLnu9q1GtjPvWod6/0q1Us=; b=WUp8IPy++1NJJtr26kvJ6tdxCFTIO+XfOm1fYZ2TqHKThvBchezOMYQdmxSPKVyrPP /TeMNfhFEB6X04MZSCsK32jB2EtH5WljkxvRArZMJzn/GYhfknc1X3QUL7iRUPpjzb1F LCPraqBU/+hP4ht6oeY+FFJ0u8R6D7BR17St2sRrUJTr6Dk403uTHDgoZgUYzquXUzgg BJBfbmBOxaJqmdcp/+2PKAU9uSjoIaDXmryg56u04l7pL2Xz4rK0SQEvcTlwYRpK8mOD GHhiFqSMH+XhZLXzdCe40OBCuC5b0Gb+KxHdXHeP8b3x/E73j/C/vDmH2tDSJhVhPdvC 4RMA== X-Received: by 10.194.205.101 with SMTP id lf5mr84008362wjc.37.1436016639167; Sat, 04 Jul 2015 06:30:39 -0700 (PDT) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. [2001:470:1f08:1f7::2]) by mx.google.com with ESMTPSA id dl10sm14823020wjb.42.2015.07.04.06.30.36 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Sat, 04 Jul 2015 06:30:37 -0700 (PDT) Date: Sat, 4 Jul 2015 15:30:34 +0200 From: Mateusz Guzik To: Renato Botelho Cc: FreeBSD Hackers Subject: Re: rename() + fsync() implementation Message-ID: <20150704133034.GA3102@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , Renato Botelho , FreeBSD Hackers References: <2770DA5F-D7F2-46D9-9158-10C86115F8AC@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <2770DA5F-D7F2-46D9-9158-10C86115F8AC@FreeBSD.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Jul 2015 13:30:42 -0000 On Fri, Jul 03, 2015 at 01:59:43PM -0300, Renato Botelho wrote: > Some time ago we found a bug on pfSense and after investigating we figured out the root cause was passwd / group related tools was not checking if files were safe in disk, and system ended up with a 0 length passwd/group db after a power cycle. There are more context at revision [1]. > > After that, bapt@ suggest to do similar fix for cap_mkdb and services_mkdb, that also can be found at another review [2]. > This definitely needs an explanation what is going on here. When the problem from [1] is encountered, which file appears to be zeroed? If the new one, should not O_SYNC you added in several places take care of that? (btw, it would be nicer if that was fsynced before close instead) If the old one, this still has the window (although miniscule compared to 5 mins) since whatever crash/failure you experienced can happen before you fsync. It may be ok enough in practice, but then the question is whether O_SYNC on the new file was of any significance. Or to state differently, do callers have to fsync/O_SYNC the file they are passing as an argument? Of course it may be either one can appear truncated. Assuming the whole approach makes sense here are some comments about the code itself: > /* > * Make sure file is safe on disk. To improve performance we will call > * fsync() to the directory where file lies > */ > if (rename(tname, dbname) == -1 || > (dbname_dir = dirname(dbname)) == NULL || dirname returns a pointer to an internal buffer, so it is not suitable for use in a library function (think: foo = dirname(...); this(...);) Since it can fail and does not depend on rename, it should have been done earlier. dbname_dir looks like a weird name. Something like dbdirname would be better. > (dbname_dir_fd = open(dbname_dir, O_RDONLY|O_DIRECTORY)) == -1 || dbname_dir_fd is definitely bad. This is not a name, this is a fd. So dbdirfd. > fsync(dbname_dir_fd) != 0) { Why does this do '!= 0' instead of '== -1'? > if (dbname_dir_fd != -1) > close(dbname_dir_fd); > err(1, "Cannot rename `%s' to `%s'", tname, dbname); It could be renamed succeeded, so this msg should be modified to state what really failed. But as a library func it likely should return an error instead, indicating which part failed. > } > > if (dbname_dir_fd != -1) > close(dbname_dir_fd); > At this point dbname_dir_fd is guaranteed to be != -1. > The idea is to implement a “sync rename” function to do all these steps. I thought about it and IMO lib/libutil would be a good place to implement it. But since I’m starting to touch src now, I would like to hear more opinions about this. > > > [1] https://reviews.freebsd.org/D2978 > [2] https://reviews.freebsd.org/D2982 -- Mateusz Guzik