Date:      Thu, 04 Oct 2012 19:14:19 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Nikolay Denev <ndenev@gmail.com>, freebsd-fs <freebsd-fs@FreeBSD.org>, Pawel Jakub Dawidek <pjd@FreeBSD.org>
Subject:   Re: nfs + zfs hangs on RELENG_9
Message-ID:  <506DB5DB.7080302@FreeBSD.org>
In-Reply-To: <506D81A7.8030506@FreeBSD.org>
References:  <906543F2-96BD-4519-B693-FD5AFB646F87@gmail.com> <506BF372.1090208@FreeBSD.org> <CF9C7048-15C1-4C7A-8395-2BAB3AE31322@gmail.com> <506C4049.4040100@FreeBSD.org> <D50A4777-BF2E-438A-B15B-661D4CB3C3B6@gmail.com> <506D81A7.8030506@FreeBSD.org>

on 04/10/2012 15:31 Andriy Gapon said the following:
> 
> [restoring cc to fs@]
> 
> on 04/10/2012 14:32 Nikolay Denev said the following:
>> I have procstat only for the nfsd threads from the moment of the IO hang.
>> And this is the only one with "arc" :
>>
>>      1422 138630 nfsd             nfsd: service    mi_switch+0x186
>>     sleepq_wait+0x42 _sleep+0x390 arc_lowmem+0x77 kmem_malloc+0xc1
>>     uma_large_malloc+0x4a malloc+0xd9 arc_get_data_buf+0xb5 arc_read_nolock+0x1ec
>>     arc_read+0x93 dbuf_read+0x452 dmu_buf_hold_array_by_dnode+0x16b
>>     dmu_buf_hold_array+0x67 dmu_read_uio+0x3f zfs_freebsd_read+0x3e8
>>     nfsvno_read+0x2e5 nfsrvd_read+0x3ff nfsrvd_dorpc+0x3c0
> 
> Oh, very important stack trace.
> 
> Earlier Nikolay Denev said the following:
>>   PID    TID COMM             TDNAME           KSTACK                       
>>     7 100192 zfskern          arc_reclaim_thre mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x428
>> _sx_xlock+0x51 arc_buf_remove_ref+0x8a dbuf_rele_and_unlock+0x132 dbuf_evict+0x11
>> dbuf_do_evict+0x53 arc_do_user_evicts+0xb4 arc_reclaim_thread+0x263 fork_exit+0x11f
>> fork_trampoline+0xe 
> 
> To me this looks like a deadlock caused by a FreeBSD add-on to ZFS: arc_lowmem
> handler.
> I think that this is what happens:
> The nfsd thread does read, arc_read_nolock finds a buffer in a ghost cache and
> calls arc_get_data_buf while holding a hash_lock (one of buffer hash locks).
> arc_get_data_buf needs to allocate some memory and, as luck would have it, there
> is a memory shortage.  Low memory handlers are invoked (directly) and one of them
> is arc_lowmem.  arc_lowmem simply kicks arc_reclaim_thread to do its job and then
> loops sleep-waiting until memory shortage is less severe.  arc_reclaim_thread
> tries to evict some buffers and, as luck would have it again, it attempts to evict
> either the same buffer or, most likely, a different buffer that hashes to the same
> lock.
> So arc_reclaim_thread is blocked on the arc buffer lock.  While the nfsd thread
> holds the lock, but waits in arc_lowmem for arc_reclaim_thread to make progress.
> 
> Eventually the held lock stalls other threads that attempt to grab it, the stall
> propagates to txg_sync_thread threads and all ZFS I/O stops.
> 

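The wait-for relationships in the quoted analysis can be written down as a toy user-space model. This is only a sketch, not kernel code: the enum, NOBODY, is_stuck() and hang_scenario() are hypothetical names, and the txg_sync edge is a simplification of "the stall propagates" (in reality the dependency may go through other threads).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: every thread waits for at most one other thread. */
enum thr { NFSD, ARC_RECLAIM, TXG_SYNC, NTHR };
#define NOBODY (-1)

/*
 * waits_for[t] is the thread that t cannot proceed without, or NOBODY.
 * Follow the chain from 'start'.  If it does not end at NOBODY within
 * NTHR hops, it must have entered a cycle, so 'start' never runs again.
 */
bool
is_stuck(const int waits_for[NTHR], int start)
{
	int t = start;

	assert(start >= 0 && start < NTHR);
	for (int hops = 0; hops < NTHR; hops++) {
		t = waits_for[t];
		if (t == NOBODY)
			return (false);
	}
	return (true);
}

/* The situation described above, reduced to three threads. */
void
hang_scenario(int waits_for[NTHR])
{
	/* nfsd holds hash_lock and sleeps in arc_lowmem until reclaim helps. */
	waits_for[NFSD] = ARC_RECLAIM;
	/* arc_reclaim_thread is blocked on the hash_lock that nfsd holds. */
	waits_for[ARC_RECLAIM] = NFSD;
	/* Simplified: txg_sync_thread ends up needing the same lock. */
	waits_for[TXG_SYNC] = NFSD;
}
```

With hang_scenario() applied, is_stuck() is true for all three threads. Clearing the nfsd entry (nfsd no longer waits for the reclaim thread) makes the whole chain runnable again.
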
BTW, one thing to note here is that the lowmem hook was invoked because of KVA
space shortage, not because of page shortage.

From a practical point of view, this may mean that having sufficient KVA size may
help to avoid this deadlock.

From a programming point of view, I am tempted to let arc_lowmem block only if
curproc == pageproc.  That should both handle the case where blocking is most
needed and prevent the deadlock described above.
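
A minimal user-space sketch of that condition, assuming nothing about the eventual patch: struct proc here is a stand-in for the kernel structure, the caller is passed explicitly instead of using the curproc macro, and arc_lowmem_should_block(), arc_lowmem_model() and struct lowmem_result are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct proc; only the identity matters here. */
struct proc {
	const char *p_comm;
};

struct proc pagedaemon_proc = { "pagedaemon" };
struct proc nfsd_proc = { "nfsd" };

/* Stand-in for the kernel's pageproc (the page daemon's process). */
struct proc *pageproc = &pagedaemon_proc;

/*
 * Hypothetical helper: may the lowmem handler sleep-wait for
 * arc_reclaim_thread, or should it only kick the thread and return?
 */
bool
arc_lowmem_should_block(const struct proc *caller)
{
	assert(caller != NULL);
	return (caller == pageproc);
}

struct lowmem_result {
	bool kicked;	/* arc_reclaim_thread was woken up */
	bool waited;	/* caller slept until the shortage eased */
};

/* Model of the handler: always kick the reclaim thread, wait only if allowed. */
struct lowmem_result
arc_lowmem_model(const struct proc *caller)
{
	struct lowmem_result r = { true, false };

	if (arc_lowmem_should_block(caller))
		r.waited = true;	/* real code would sleep-wait here */
	return (r);
}
```

In this model a caller such as nfsd, which may hold a hash_lock, still wakes the reclaim thread but returns without sleeping, so it cannot end up waiting on a thread that is blocked on its own lock.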

-- 
Andriy Gapon