Date: Sat, 25 Feb 2012 18:30:16 GMT
Message-Id: <201202251830.q1PIUGV5056188@freefall.freebsd.org>
To: freebsd-fs@FreeBSD.org
From: Jilles Tjoelker
Subject: Re: kern/165392: Multiple mkdir/rmdir fails with errno 31

The following reply was made to PR kern/165392; it has been noted by GNATS.

From: Jilles Tjoelker
To: bug-followup@FreeBSD.org, vvv@colocall.net
Subject: Re: kern/165392: Multiple mkdir/rmdir fails with errno 31
Date: Sat, 25 Feb 2012 19:27:02 +0100

 > [mkdir fails with [EMLINK], but link count < LINK_MAX]

 I can reproduce this problem with UFS with soft updates (with or without
 journaling).

 A reproduction without C programs is:

   cd empty_dir
   mkdir `jot 32766 1` # the last one will fail (correctly)
   rmdir 1
   mkdir a # will erroneously fail

 The problem appears to be that the previous rmdir has not yet fully
 completed.
 It is still holding onto the link count until the directory is written,
 which may take up to two minutes. The same problem can occur with other
 calls that increase the link count, such as link() and rename().

 A workaround is to call fsync() on the directory that contained the
 deleted entries. It will then release its hold on the link count and
 allow mkdir or other calls. If fsync() is only called when [EMLINK] is
 returned, the performance impact should not be very bad, although it
 still causes more I/O than necessary.

 The book "The Design and Implementation of the FreeBSD Operating System"
 contains a detailed description of soft updates in section 8.6, "Soft
 Updates". The subsection "File Removal Requirements for Soft Updates"
 appears particularly relevant to this problem.

 A possible solution is to check for the problematic situation
 (i_effnlink < LINK_MAX && i_nlink >= LINK_MAX) and, if so, synchronously
 write one or more deleted directory entries that pointed to the inode
 with the link count problem. After that, i_nlink should be less than
 LINK_MAX and the link count can be checked again (depending on whether
 locks need to be dropped to do the write, it may or may not be possible
 for another thread to use up the last link first).

 For mkdir() and rename(), the directory that contains the deleted
 entries is obvious (the directory that will contain the new directory),
 while for link() it can (in the general case) only be found in soft
 updates data structures. Soft updates must track this because (if the
 link count became 0) it will not clear the inode before all directory
 entries that pointed to it have been written.

 Simply replacing the i_nlink < LINK_MAX check with i_effnlink < LINK_MAX
 is unsafe because it will lead to overflow of the 16-bit signed i_nlink
 field.
 Even if the field were made larger, I do not see what would prevent the
 code from committing a set of changes that leaves an inode on disk with
 more than LINK_MAX links for some time (for example, if a file in the
 new directory is fsynced while the old directory entries are still on
 the disk).

 --
 Jilles Tjoelker