Date:      Tue, 5 Jan 2016 13:07:40 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Benno Rice <benno@FreeBSD.org>
Cc:        freebsd-current <freebsd-current@freebsd.org>
Subject:   Re: Possible bug in or around posix_fadvise after r292326
Message-ID:  <20160105110740.GH3625@kib.kiev.ua>
In-Reply-To: <ACCB7B60-A8BA-418D-BE71-EC4E83693FFA@FreeBSD.org>
References:  <ACCB7B60-A8BA-418D-BE71-EC4E83693FFA@FreeBSD.org>

On Mon, Jan 04, 2016 at 10:05:21PM -0800, Benno Rice wrote:
> Hi Konstantin,
> 
> I recently updated my dev box to r292962. After doing this I attempted to set up PostgreSQL 9.4. When I ran initdb the last phase hung. Using procstat -kk I found it appeared to be stuck in a loop inside a posix_fadvise syscall. I could not ^C or ^Z the initdb process. I could kill it but a subsequent attempt to rm -rf the /usr/local/pgsql/data directory also got stuck and was unkillable by any means. Rebooting allowed me to remove the directory but the initdb process still hung when I re-ran it.
> 
> I tried PostgreSQL 9.3 with similar results.
> 
> Looking at the source code for initdb I found that it calls posix_fadvise like so[1]:
> 
>      /*
>       * We do what pg_flush_data() would do in the backend: prefer to use
>       * sync_file_range, but fall back to posix_fadvise.  We ignore errors
>       * because this is only a hint.
>       */
>  #if defined(HAVE_SYNC_FILE_RANGE)
>      (void) sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);
>  #elif defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
>      (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
>  #else
>  #error PG_FLUSH_DATA_WORKS should not have been defined
>  #endif
> 
> Looking for recent commits involving POSIX_FADV_DONTNEED I found r292326:
> 
> https://svnweb.freebsd.org/changeset/base/292326
> 
> Backing this revision out allowed the initdb process to complete.
> 
> My current theory is that somehow we're getting ENOLCK or EAGAIN from the BUF_TIMELOCK call in bnoreuselist:
> 
> https://svnweb.freebsd.org/base/head/sys/kern/vfs_subr.c?view=annotate#l1676
> 
> Leading to an infinite loop in vop_stdadvise:
> 
> https://svnweb.freebsd.org/base/head/sys/kern/vfs_default.c?annotate=292373#l1083
> 
> I haven't managed to dig any deeper than that yet.
> 
> Is there any other information I could give you to help narrow this down?

I do not see this issue locally.

When the hang in initdb occurs, what is the state of the initdb thread
which performs the advise()?  Is it in the "brlsfl" sleep, or is the thread running?

If the buffer lock is not available, and this is the cause of the ENOLCK/EAGAIN,
then the question is who owns the corresponding buffer lock.
You could get an overview of the state of the system with the 'ps' command in
ddb, and 'show alllocks' would list the owner, unless the buffer was async.

Also, I do not quite understand the behaviour on SIGINT/SIGKILL.  Could
it be that the process was not killed by SIGKILL either?  That would be
consistent with the vnode lock still being owned and blocking access.
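
A standalone test case might help to narrow this down without involving
initdb.  The sketch below is only an assumption about what triggers the
hang: it writes a file and then issues the same posix_fadvise(fd, 0, 0,
POSIX_FADV_DONTNEED) call quoted above, while the buffers may still be
dirty.  The helper name write_and_advise, the path and the size are
hypothetical, not taken from initdb.

```c
#define _XOPEN_SOURCE 600	/* expose posix_fadvise() on strict libcs */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write nbytes of filler data to path, then issue
 * the same advice that initdb issues, without an intervening fsync().
 * Returns 0 on success or an errno-style value on failure.  The file is
 * removed before returning.
 */
int
write_and_advise(const char *path, size_t nbytes)
{
	char buf[4096];
	size_t chunk, done;
	ssize_t n;
	int error, fd;

	fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
	if (fd < 0)
		return (errno);

	memset(buf, 'x', sizeof(buf));
	for (done = 0; done < nbytes; done += (size_t)n) {
		chunk = nbytes - done;
		if (chunk > sizeof(buf))
			chunk = sizeof(buf);
		n = write(fd, buf, chunk);
		if (n < 0) {
			error = errno;
			close(fd);
			unlink(path);
			return (error);
		}
	}

	/* posix_fadvise() returns the error number directly, not via errno. */
	error = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

	close(fd);
	unlink(path);
	return (error);
}
```

Calling write_and_advise() from a small main() on a file system of the same
type as /usr/local/pgsql/data, and then checking the thread with
procstat -kk if the call does not return, would show whether the advice
alone is enough to reproduce the loop.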


