From: Maxim Sobolev
To: Andriy Gapon
Cc: Konstantin Belousov, freebsd-fs@freebsd.org, Kirk McKusick, stable@freebsd.org, kib@freebsd.org
Date: Mon, 28 Mar 2016 16:50:59 -0700
Subject: Re: Process stuck in "vnread"
In-Reply-To: <56F96792.2010800@FreeBSD.org>

Andriy, this is a file that gets copied from an md(4)-backed UFS to a file
located on a ZFS raidz volume that is mounted via nullfs. The file that backs
the md(4) device is also located on ZFS, so in a sense we have a full cycle,
with the backing block starting on ZFS and ending up on ZFS too. cp(1) calls
write() with a pointer into the mmapped area; the page is not mapped in, so
the write triggers a page-in and waits for the page to arrive.

In any case, it looks like your patch from D5738 might be working. I've put it
on my 10.3-RC3 build box and am giving it a shake to see whether it improves
things. I've also made sure that I have all debug symbols installed, so that I
can poke inside zfs.ko if I need to.

Thanks, guys. Andriy, what keeps you from pushing that patch into the tree?
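For reference, here is a minimal userland sketch of the access pattern described above: the source file is mmap()ed and the mapped address is handed straight to write(), so any page that is not yet resident has to be faulted in by the kernel during the copy into the destination file system. This is not cp(1)'s actual source. The helper name copy_via_mmap() and its error handling are assumptions made for illustration, and the code by itself does not reproduce the hang (that also needs the md(4)-on-ZFS setup).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not cp(1)'s code): copy src to
 * dst by mmap()ing the source and passing the mapping directly to write().
 * Returns 0 on success, -1 on any error.
 */
int
copy_via_mmap(const char *src, const char *dst)
{
	struct stat sb;
	char *p;
	off_t off;
	ssize_t n;
	int in, out, rc = -1;

	assert(src != NULL && dst != NULL);

	if ((in = open(src, O_RDONLY)) == -1)
		return (-1);
	if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) == -1) {
		close(in);
		return (-1);
	}
	if (fstat(in, &sb) == -1)
		goto done;
	if (sb.st_size == 0) {
		/* Nothing to copy; mmap() of length 0 would fail. */
		rc = 0;
		goto done;
	}

	p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_SHARED, in, 0);
	if (p == MAP_FAILED)
		goto done;

	/*
	 * The mapping has not been touched yet, so its pages may not be
	 * resident. The kernel then has to fault them in from the source
	 * file system while it is already inside write() on the
	 * destination file system.
	 */
	for (off = 0; off < sb.st_size; off += n) {
		n = write(out, p + off, (size_t)(sb.st_size - off));
		if (n <= 0)
			break;
	}
	if (off == sb.st_size)
		rc = 0;

	munmap(p, (size_t)sb.st_size);
done:
	close(in);
	close(out);
	return (rc);
}
```

Calling copy_via_mmap("/mnt/md0/file", "/zfs/nullfs/file") (both paths made up) would exercise the same write()-from-mapping path, with the page-in going through the source file system's pager.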

On Mon, Mar 28, 2016 at 10:19 AM, Andriy Gapon wrote:

> On 28/03/2016 19:23, Konstantin Belousov wrote:
> > On Mon, Mar 28, 2016 at 08:52:03AM -0700, Maxim Sobolev wrote:
> >> Done some head scratching, it looks like it's got page fault in the
> >> copyin() (cp(1) AFAIK mmaps source file). There might be some interlock
> >> issue between competing write to the same ZFS, the md0 device is locked
> >> forever waiting for the write operation to complete at the very same time.
> >> I am curious as to whether we are allowed to sleep in the dmu_write_uio_dbuf(),
> >> AFAIK dmu is ZFS's transaction layer, so maybe copyin() should be done
> >> earlier to avoid possible page fault in there?
>
> Maxim,
>
> is this copy from UFS to ZFS?
> It looks like that because the copyin() fault goes to
> vnode_pager_generic_getpages() -> bwait()...
>
> > No idea about ZFS, but if the issue is due to copyin(9) recursing into
> > VM and then VFS while owning file system locks, it is a well-known and
> > long-standing issue. I sometimes call it 'ups deadlock', for some
> > reasons, see tools/test/upsdl/ for the distilled test case.
> >
> > It is handled for UFS and NFS, read the long comment starting with 'The
> > vn_io_fault() is a wrapper' in sys/kern/vfs_vnops.c, which describes the
> > deadlock in detail and explains the mechanism which is used to prevent
> > it. Filesystems must opt-in into it by specifying MNTK_NO_IOPF flag,
> > and then being ready to get an array of pages for io instead of the buffer
> > KVA.
>
> I don't have any idea why the thread would be stuck in bwait() and what locks
> and threads are involved here. But, as Kostik said, there is a general problem
> and I have a patch for ZFS:
> https://reviews.freebsd.org/D2790
>
> --
> Andriy Gapon
>

-- 
Maksym Sobolyev
Sippy Software, Inc.
Internet Telephony (VoIP) Experts
Tel (Canada): +1-778-783-0474
Tel (Toll-Free): +1-855-747-7779
Fax: +1-866-857-6942
Web: http://www.sippysoft.com
MSN: sales@sippysoft.com
Skype: SippySoft