Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 4 Jul 2015 15:30:34 +0200
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Renato Botelho <garga@FreeBSD.org>
Cc:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: rename() + fsync() implementation
Message-ID:  <20150704133034.GA3102@dft-labs.eu>
In-Reply-To: <2770DA5F-D7F2-46D9-9158-10C86115F8AC@FreeBSD.org>
References:  <2770DA5F-D7F2-46D9-9158-10C86115F8AC@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Jul 03, 2015 at 01:59:43PM -0300, Renato Botelho wrote:
> Some time ago we found a bug on pfSense and after investigating we figured out the root cause was passwd / group related tools was not checking if files were safe in disk, and system ended up with a 0 length passwd/group db after a power cycle. There are more context at revision [1].
> 
> After that, bapt@ suggest to do similar fix for cap_mkdb and services_mkdb, that also can be found at another review [2].
> 

This definitely needs an explanation what is going on here.

When the problem from [1] is encountered, which file appears to be
zeroed?

If the new one, should not O_SYNC you added in several places take care
of that? (btw, it would be nicer if that was fsynced before close
instead)

If the old one, this still has the window (although miniscule compared
to 5 mins) since whatever crash/failure you experienced can happen
before you fsync. It may be ok enough in practice, but then the question
is whether O_SYNC on the new file was of any significance. Or to state
differently, do callers have to fsync/O_SYNC the file they are passing
as an argument?

Of course it may be either one can appear truncated.

Assuming the whole approach makes sense here are some comments about the
code itself:

>         /*
>          * Make sure file is safe on disk. To improve performance we will call
>          * fsync() to the directory where file lies
>          */
>         if (rename(tname, dbname) == -1 ||
>             (dbname_dir = dirname(dbname)) == NULL ||

dirname returns a pointer to an internal buffer, so it is not suitable
for use in a library function (think: foo = dirname(...); this(...);)

Since it can fail and does not depend on rename, it should have been
done earlier.

dbname_dir looks like a weird name. Something like dbdirname would be
better.

>             (dbname_dir_fd = open(dbname_dir, O_RDONLY|O_DIRECTORY)) == -1 ||

dbname_dir_fd is definitely bad. This is not a name, this is a fd. So
dbdirfd.

>             fsync(dbname_dir_fd) != 0) {

Why does this do '!= 0' instead of '== -1'?

>                 if (dbname_dir_fd != -1)
>                         close(dbname_dir_fd);
>                 err(1, "Cannot rename `%s' to `%s'", tname, dbname);

It could be renamed succeeded, so this msg should be modified to state
what really failed. But as a library func it likely should return an
error instead, indicating which part failed.

>         }
> 
>         if (dbname_dir_fd != -1)
>                 close(dbname_dir_fd);
> 

At this point dbname_dir_fd is guaranteed to be != -1.

> The idea is to implement a “sync rename” function to do all these steps. I thought about it and IMO lib/libutil would be a good place to implement it. But since I’m starting to touch src now, I would like to hear more opinions about this.
> 
> 
> [1] https://reviews.freebsd.org/D2978
> [2] https://reviews.freebsd.org/D2982

-- 
Mateusz Guzik <mjguzik gmail.com>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150704133034.GA3102>