Date:      Tue, 24 Aug 2021 11:30:38 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 258022] [FUSES] Inode attributes are cached unnecessarily/for too long
Message-ID:  <bug-258022-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258022

            Bug ID: 258022
           Summary: [FUSES] Inode attributes are cached unnecessarily/for
                    too long
           Product: Base System
           Version: 13.0-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: chogata@moosefs.pro

This is a problem a user of MooseFS reported. Under some circumstances,
creating a new fs entry (any type: directory, regular file, special file) on
a MooseFS mount shows the message "Resource temporarily unavailable", and any
subsequent operations on this inode (ls -al, rm or rmdir) also show this
message. In addition, MooseFS cannot be unmounted; the system shows the
message "Device busy". Only a restart of the whole machine helps.

Since this was a bit similar to a problem some versions of the Linux kernel
had, when a process on one machine deleted the CWD of a process on a different
machine, we at first thought it had to do with CWDs only and introduced some
safeguards in the MooseFS client for FreeBSD. But recent findings show this is
much more serious and is on the FreeBSD side.

A simple test: we take two machines and mount MooseFS on both.
On the FreeBSD machine (13.0-RELEASE-p3) we use these mount options:

mfsmount -o mfsattrcacheto=0 -o mfsxattrcacheto=0 -o mfsentrycacheto=0 \
    -o mfsdirentrycacheto=0 -o mfssymlinkcacheto=0 -o mfsgroupscacheto=0 \
    /mnt/mfs

All the -o options are to disable any attribute caches that may exist (any
lookup, access, mkdir etc. operations will return 0 seconds as cache time).
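
For anyone scripting this test, the mount command above could be assembled
roughly as follows. This is only a sketch: the option names and the mount
point are the ones quoted in this report, while the helper name
build_mount_command is made up for illustration.

```python
import shlex

# Cache timeout options exactly as quoted in the report.
CACHE_OPTIONS = [
    "mfsattrcacheto",
    "mfsxattrcacheto",
    "mfsentrycacheto",
    "mfsdirentrycacheto",
    "mfssymlinkcacheto",
    "mfsgroupscacheto",
]


def build_mount_command(mountpoint, timeout=0):
    """Return the mfsmount argv with every listed cache timeout set to `timeout`.

    Hypothetical helper; it only builds the argument list and runs nothing.
    """
    argv = ["mfsmount"]
    for name in CACHE_OPTIONS:
        argv += ["-o", f"{name}={timeout}"]
    argv.append(mountpoint)
    return argv


if __name__ == "__main__":
    # Print the command line so it can be inspected before being run by hand.
    print(shlex.join(build_mount_command("/mnt/mfs")))
```

Keeping the option names in one list makes it easy to check that no cache
timeout was left at its default when repeating the experiment.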

We also have a second machine with the same MooseFS instance mounted. The
operating system on the second machine is irrelevant.

Then we perform these steps, exactly in the order shown below:

***FreeBSD machine***
~# cd /mnt/mfs/testdir
/mnt/mfs/testdir# ls -al
total 2932
drwxr-xr-x   2 root  wheel        1 Aug 23 12:41 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
/mnt/mfs/testdir#
******
(FreeBSD "sees" that "testdir" is empty)

***OTHER machine***
~# cd /mnt/mfs/testdir
/mnt/mfs/testdir# mkdir dir
/mnt/mfs/testdir#
******
(Other machine creates a directory named "dir" inside "testdir")

***FreeBSD machine***
/mnt/mfs/testdir# ls -al
total 2933
drwxr-xr-x   3 root  wheel        1 Aug 23 12:59 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
drwxr-xr-x   2 root  wheel        1 Aug 23 12:59 dir
/mnt/mfs/testdir#
******
(FreeBSD "sees" that there is now "dir" inside "testdir")

***OTHER machine***
/mnt/mfs/testdir# ls -ali
total 2933
8 drwxr-xr-x   3 root  wheel        1 Aug 23 12:59 .
1 drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
9 drwxr-xr-x   2 root  wheel        1 Aug 23 12:59 dir
/mnt/mfs/testdir# rmdir dir
/mnt/mfs/testdir# ls -al
total 2932
drwxr-xr-x   2 root  wheel        1 Aug 23 13:00 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
/mnt/mfs/testdir#
******
(We check the inode number of "dir" on the other machine and delete "dir")

***FreeBSD machine***
/mnt/mfs/testdir# ls -al
total 2932
drwxr-xr-x   2 root  wheel        1 Aug 23 13:00 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
/mnt/mfs/testdir#
******
(FreeBSD again "sees" that "testdir" is empty)

Now we wait for at least 5 minutes; the timing is explained below.

***FreeBSD machine***
/mnt/mfs/testdir# echo "foo" > file.txt
-bash: file.txt: Resource temporarily unavailable
/mnt/mfs/testdir# ls -al
ls: file.txt: Resource temporarily unavailable
total 2932
drwxr-xr-x   2 root  wheel        1 Aug 23 13:17 .
drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
/mnt/mfs/testdir#
******
Ooops?!

***OTHER machine***
/mnt/mfs/testdir# ls -ali
total 2932
8 drwxr-xr-x   2 root  wheel        1 Aug 23 13:17 .
1 drwxrwxrwx  43 root  wheel  3001433 Aug 23 12:28 ..
9 -rw-r--r--   1 root  wheel        0 Aug 23 13:17 file.txt
/mnt/mfs/testdir#
******
The newly created file got the same inode number as the recently deleted
directory "dir"...
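
To make the suspected failure mode concrete, here is a toy model of it. It
is a sketch under an assumption, not FreeBSD's fusefs code: we assume the
client remembers the object type per inode number, never drops that entry,
and fails with EAGAIN (the errno behind "Resource temporarily unavailable")
when the server reports a different type for the same number. The class
name StaleTypeCache is made up for illustration.

```python
import errno


class StaleTypeCache:
    """Toy client-side cache that remembers each inode's type by inode number.

    Hypothetical model for illustration only; not FreeBSD's fusefs code.
    """

    def __init__(self):
        self._type_by_ino = {}

    def lookup(self, ino, server_type):
        """Return 0 on success, or errno.EAGAIN on a type mismatch."""
        cached = self._type_by_ino.get(ino)
        if cached is None:
            # First time this inode number is seen: remember its type.
            self._type_by_ino[ino] = server_type
            return 0
        if cached != server_type:
            # Assumed behaviour: the stale entry is kept, so every later
            # operation on this inode number fails in the same way.
            return errno.EAGAIN
        return 0


cache = StaleTypeCache()
first = cache.lookup(9, "dir")    # "dir" (inode 9) listed on the FreeBSD machine
second = cache.lookup(9, "file")  # file.txt later created with inode 9
third = cache.lookup(9, "file")   # any subsequent ls -al / rm on file.txt
```

In this model the first lookup succeeds and both later ones return EAGAIN,
which matches the transcript above: the create fails, and so does every
following operation on the same inode number.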

Notes:
1) The effect is not exclusive to former directory inode numbers becoming file
inode numbers. It happens whenever the new object is of a different type than
the old one (so an ex-directory inode number is reused for a file, ex-file for
a fifo, ex-fifo for a device or directory, etc.). The "ls -al" scenario is not
the only one; the same happens if objects are created on the FreeBSD machine
and then deleted from another machine, which is of course a normal occurrence
in a network file system.
2) The default inode reuse time in MooseFS is 24 hours. It was set to 5
minutes for testing purposes only. The person who first reported the problem
(there were others after) used the default 24 hours. Only inodes that are
truly "free" are reused, meaning that no CWDs (active on any MooseFS client
connected to the instance) and no sustained (deleted but still open) files are
reused. The 24 hour delay is counted from the moment they are considered free,
so if a file is in a sustained state for, let's say, 24 hours after deletion
(and then whatever process had a hold on it finally finishes), its inode
number is still not reused for another 24 hours.
3) Default cache times in MooseFS: file attributes cache timeout - 1 second,
extended attributes (xattr) cache timeout - 30 seconds, directory entry cache
timeout - 1 second, negative entry cache timeout - 0 seconds (no negative
cache by default), symbolic link cache timeout - 300 seconds, supplementary
groups cache timeout - 300 seconds.
4) Caches in the above experiment were ALL set to 0.
5) The problem was first reported on FreeBSD 12.1.
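
The timing rule from note 2 can be written down as simple arithmetic. The
sketch below assumes only what the note states: the reuse delay starts when
the inode becomes free, i.e. after any CWD or sustained hold ends. The
function name and the example timestamps are illustrative.

```python
from datetime import datetime, timedelta

DEFAULT_REUSE_DELAY = timedelta(hours=24)  # MooseFS default, per note 2


def earliest_reuse(deleted_at, held_for=timedelta(0),
                   reuse_delay=DEFAULT_REUSE_DELAY):
    """Earliest time an inode number may be reused.

    `held_for` is how long the inode stays in use after deletion (as a CWD
    or as a sustained, still-open file). Hypothetical helper for illustration.
    """
    freed_at = deleted_at + held_for  # moment the inode is considered free
    return freed_at + reuse_delay


# The rmdir in the test above happened at about 13:00 on Aug 23.
deleted = datetime(2021, 8, 23, 13, 0)

# Test setup: reuse delay lowered to 5 minutes, nothing holding the inode.
test_case = earliest_reuse(deleted, reuse_delay=timedelta(minutes=5))

# Default delay, with the file sustained for 24 hours after deletion.
sustained_case = earliest_reuse(deleted, held_for=timedelta(hours=24))
```

With the 5 minute test setting the number becomes reusable at 13:05, which
is why file.txt created at 13:17 could receive inode 9. With the defaults
and a 24 hour hold, reuse is possible only 48 hours after the deletion.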

So, to sum it up: we say "don't cache anything at all/longer than 300
seconds", yet FreeBSD caches inode attributes (we don't know which ones, but
at least the type) for longer than 24 hours, and this causes a serious
problem, because a new inode with a reused inode number is basically unusable
in the file system.

-- 
You are receiving this mail because:
You are the assignee for the bug.


