From: Andriy Gapon <avg@FreeBSD.org>
To: Bryan Drewery, freebsd-current@FreeBSD.org
Subject: Re: zfs deadlock on r360452 relating to busy vm page
Date: Wed, 13 May 2020 10:45:24 +0300
Message-ID: <889cb93b-85c7-3ec4-4ccf-5fb56ec38fa5@FreeBSD.org>
In-Reply-To: <0e9cceba-84d0-ec4f-8784-36703452201d@FreeBSD.org>
References: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> <0e9cceba-84d0-ec4f-8784-36703452201d@FreeBSD.org>
List-Id: Discussions about the use of FreeBSD-current

On 13/05/2020 10:35, Andriy Gapon wrote:
> On 13/05/2020 01:47, Bryan Drewery wrote:
>> Trivial repro:
>>
>> dd if=/dev/zero of=blah & tail -F blah
>> ^C
>> load: 0.21 cmd: tail 72381 [prev->lr_read_cv] 2.17r 0.00u 0.01s 0% 2600k
>> #0 0xffffffff80bce615 at mi_switch+0x155
>> #1 0xffffffff80c1cfea at sleepq_switch+0x11a
>> #2 0xffffffff80b57f0a at _cv_wait+0x15a
>> #3 0xffffffff829ddab6 at rangelock_enter+0x306
>> #4 0xffffffff829ecd3f at zfs_freebsd_getpages+0x14f
>> #5 0xffffffff810e3ab9 at VOP_GETPAGES_APV+0x59
>> #6 0xffffffff80f349e7 at vnode_pager_getpages+0x37
>> #7 0xffffffff80f2a93f at vm_pager_get_pages+0x4f
>> #8 0xffffffff80f054b0 at vm_fault+0x780
>> #9 0xffffffff80f04bde at vm_fault_trap+0x6e
>> #10 0xffffffff8106544e at trap_pfault+0x1ee
>> #11 0xffffffff81064a9c at trap+0x44c
>> #12 0xffffffff8103a978 at calltrap+0x8
>
> In r329363 I re-worked zfs_getpages and introduced range locking to it.
> At the time I believed that it was safe and maybe it was, please see the commit
> message.
> There, indeed, have been many performance / concurrency improvements to the VM
> system and r358443 is one of them.

Thinking more about it, it could be r352176.
I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked) are not
equivalent to the code that they replaced.
The original code would check the valid field before any locking and it would
attempt any locking / busying only if a page is invalid. The object was required
to be locked though.
The new code tries to busy the page in any case.

> I am not sure how to resolve the problem best. Maybe someone who knows the
> latest VM code better than me can comment on my assumptions stated in the commit
> message.
>
> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking in
> this corner of the code because of a similar deadlock a long time ago.
>
>> On 5/12/2020 3:13 PM, Bryan Drewery wrote:
>>>> panic: deadlres_td_sleep_q: possible deadlock detected for 0xfffffe25eefa2e00 (find), blocked for 1802392 ticks
> ...
>>>> (kgdb) backtrace
>>>> #0 sched_switch (td=0xfffffe255eac0000, flags=) at /usr/src/sys/kern/sched_ule.c:2147
>>>> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542
>>>> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff810fb57dd48, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:625
>>>> #3 0xffffffff80b57f0a in _cv_wait (cvp=0xfffff810fb57dd48, lock=0xfffff80049a99040) at /usr/src/sys/kern/kern_condvar.c:146
>>>> #4 0xffffffff82434ab6 in rangelock_enter_reader (rl=0xfffff80049a99018, new=0xfffff8022cadb100) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:429
>>>> #5 rangelock_enter (rl=0xfffff80049a99018, off=, len=, type=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:477
>>>> #6 0xffffffff82443d3f in zfs_getpages (vp=, ma=0xfffffe259f204b18, count=, rbehind=0xfffffe259f204ac4, rahead=0xfffffe259f204ad0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4500
>>>> #7 zfs_freebsd_getpages (ap=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4567
>>>> #8 0xffffffff810e3ab9 in VOP_GETPAGES_APV (vop=0xffffffff8250a1e0 , a=0xfffffe259f2049f0) at vnode_if.c:2644
>>>> #9 0xffffffff80f349e7 in VOP_GETPAGES (vp=, m=, count=, rbehind=, rahead=) at ./vnode_if.h:1171
>>>> #10 vnode_pager_getpages (object=, m=, count=, rbehind=, rahead=) at /usr/src/sys/vm/vnode_pager.c:743
>>>> #11 0xffffffff80f2a93f in vm_pager_get_pages (object=0xfffff806cb637c60, m=0xfffffe259f204b18, count=1, rbehind=, rahead=) at /usr/src/sys/vm/vm_pager.c:305
>>>> #12 0xffffffff80f054b0 in vm_fault_getpages (fs=, nera=0, behindp=, aheadp=) at /usr/src/sys/vm/vm_fault.c:1163
>>>> #13 vm_fault (map=, vaddr=, fault_type=, fault_flags=, m_hold=) at /usr/src/sys/vm/vm_fault.c:1394
>>>> #14 0xffffffff80f04bde in vm_fault_trap (map=0xfffffe25653949e8, vaddr=, fault_type=, fault_flags=0, signo=0xfffffe259f204d04, ucode=0xfffffe259f204d00) at /usr/src/sys/vm/vm_fault.c:589
>>>> #15 0xffffffff8106544e in trap_pfault (frame=0xfffffe259f204d40, usermode=, signo=, ucode=) at /usr/src/sys/amd64/amd64/trap.c:821
>>>> #16 0xffffffff81064a9c in trap (frame=0xfffffe259f204d40) at /usr/src/sys/amd64/amd64/trap.c:340
>>>> #17
>>>> #18 0x00000000002034fc in ?? ()
> ...
>>>> (kgdb) thread
>>>> [Current thread is 8 (Thread 101255)]
>>>> (kgdb) backtrace
>>>> #0 sched_switch (td=0xfffffe25c8e9bc00, flags=) at /usr/src/sys/kern/sched_ule.c:2147
>>>> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542
>>>> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffffe001cbca850, pri=84) at /usr/src/sys/kern/subr_sleepqueue.c:625
>>>> #3 0xffffffff80f1de50 in _vm_page_busy_sleep (obj=, m=0xfffffe001cbca850, pindex=, wmesg=, allocflags=21504, locked=false) at /usr/src/sys/vm/vm_page.c:1094
>>>> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=0xfffff806cb637c60, m=, pindex=, wmesg=, allocflags=21504, locked=) at /usr/src/sys/vm/vm_page.c:4326
>>>> #5 vm_page_acquire_unlocked (object=0xfffff806cb637c60, pindex=1098, prev=, mp=0xfffffe2717fc6730, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4469
>>>> #6 0xffffffff80f24c61 in vm_page_grab_valid_unlocked (mp=0xfffffe2717fc6730, object=0xfffff806cb637c60, pindex=1098, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4645
>>>> #7 0xffffffff82440246 in page_busy (vp=0xfffff80571f29500, start=4497408, off=, nbytes=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:414
>>>> #8 update_pages (vp=0xfffff80571f29500, start=4497408, len=32, os=0xfffff8096a277400, oid=2209520, segflg=, tx=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:482
>>>> #9 zfs_write (vp=, uio=, ioflag=0, cr=, ct=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1071
>>>> #10 zfs_freebsd_write (ap=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4838
>>>> #11 0xffffffff810e0eaf in VOP_WRITE_APV (vop=0xffffffff8250a1e0 , a=0xfffffe2717fc68c8) at vnode_if.c:925
>>>> #12 0xffffffff80cb574c in VOP_WRITE (vp=0xfffff80571f29500, uio=0xfffffe2717fc6bb0, ioflag=8323073, cred=) at ./vnode_if.h:413
>>>> #13 vn_write (fp=0xfffff8048195e8c0, uio=, active_cred=, flags=, td=) at /usr/src/sys/kern/vfs_vnops.c:894
>>>> #14 0xffffffff80cb50c3 in vn_io_fault_doio (args=0xfffffe2717fc6af0, uio=0xfffffe2717fc6bb0, td=0xfffffe25c8e9bc00) at /usr/src/sys/kern/vfs_vnops.c:959
>>>> #15 0xffffffff80cb1c8c in vn_io_fault1 (vp=, uio=0xfffffe2717fc6bb0, args=0xfffffe2717fc6af0, td=0xfffffe25c8e9bc00) at /usr/src/sys/kern/vfs_vnops.c:1077
>>>> #16 0xffffffff80cafa32 in vn_io_fault (fp=0xfffff8048195e8c0, uio=0xfffffe2717fc6bb0, active_cred=0xfffff80f2cc12708, flags=0, td=) at /usr/src/sys/kern/vfs_vnops.c:1181
>>>> #17 0xffffffff80c34331 in fo_write (fp=0xfffff8048195e8c0, uio=0xfffffe2717fc6bb0, active_cred=, flags=, td=0xfffffe25c8e9bc00) at /usr/src/sys/sys/file.h:326
>>>> #18 dofilewrite (td=0xfffffe25c8e9bc00, fd=2, fp=0xfffff8048195e8c0, auio=0xfffffe2717fc6bb0, offset=, flags=) at /usr/src/sys/kern/sys_generic.c:564
>>>> #19 0xffffffff80c33eb0 in kern_writev (td=0xfffffe25c8e9bc00, fd=2, auio=) at /usr/src/sys/kern/sys_generic.c:491
>>>> #20 sys_write (td=0xfffffe25c8e9bc00, uap=) at /usr/src/sys/kern/sys_generic.c:406
>>>> #21 0xffffffff8106623d in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:150
>>>> #22 amd64_syscall (td=0xfffffe25c8e9bc00, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161
>>>> #23
>>>> #24 0x000000080043d53a in ?? ()
>>>
>>> Maybe r358443 is related?
>>>
>>>
>>>> (kgdb) frame 4
>>>> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=0xfffff806cb637c60, m=, pindex=, wmesg=, allocflags=21504, locked=) at /usr/src/sys/vm/vm_page.c:4326
>>>> 4326 if (_vm_page_busy_sleep(object, m, m->pindex, wmesg, allocflags,
>>>> (kgdb) p *object
>>>> $8 = {lock = {lock_object = {lo_name = 0xffffffff8114fa30 "vm object", lo_flags = 627245056, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, object_list = {tqe_next = 0xfffff806cb637d68, tqe_prev = 0xfffff806cb637b78}, shadow_head = {lh_first = 0x0}, shadow_list = {le_next = 0xffffffffffffffff,
>>>> le_prev = 0xffffffffffffffff}, memq = {tqh_first = 0xfffffe001cbca850, tqh_last = 0xfffffe001cbca860}, rtree = {rt_root = 18446741875168421969}, size = 1099, domain = {dr_policy = 0x0, dr_iter = 0}, generation = 1, cleangeneration = 1, ref_count = 2, shadow_count = 0, memattr = 6 '\006', type = 2 '\002',
>>>> flags = 4096, pg_color = 0, paging_in_progress = {__count = 2}, busy = {__count = 0}, resident_page_count = 1, backing_object = 0x0, backing_object_offset = 0, pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0x0}, handle = 0xfffff80571f29500, un_pager = {vnp = {vnp_size = 4499568,
>>>> writemappings = 0}, devp = {devp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x44a870, swp_blks = {pt_root = 0}, writemappings = 0}}, cred = 0x0, charge = 0, umtx_data = 0x0}
>>>> (kgdb) frame 5
>>>> #5 vm_page_acquire_unlocked (object=0xfffff806cb637c60, pindex=1098, prev=, mp=0xfffffe2717fc6730, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4469
>>>> 4469 if (!vm_page_grab_sleep(object, m, pindex, "pgnslp",
>>>> (kgdb) p *m
>>>> $9 = {plinks = {q = {tqe_next = 0xffffffffffffffff, tqe_prev = 0xffffffffffffffff}, s = {ss = {sle_next = 0xffffffffffffffff}}, memguard = {p = 18446744073709551615, v = 18446744073709551615}, uma = {slab = 0xffffffffffffffff, zone = 0xffffffffffffffff}}, listq = {tqe_next = 0x0, tqe_prev = 0xfffff806cb637ca8},
>>>> object = 0xfffff806cb637c60, pindex = 1098, phys_addr = 18988408832, md = {pv_list = {tqh_first = 0x0, tqh_last = 0xfffffe001cbca888}, pv_gen = 44682, pat_mode = 6}, ref_count = 2147483648, busy_lock = 1588330502, a = {{flags = 0, queue = 255 '\377', act_count = 0 '\000'}, _bits = 16711680}, order = 13 '\r',
>>>> pool = 0 '\000', flags = 1 '\001', oflags = 0 '\000', psind = 0 '\000', segind = 6 '\006', valid = 0 '\000', dirty = 0 '\000'}
>>>
>>> Pretty sure this thread is holding the rangelock from zfs_write() that
>>> tail is waiting on. So what is this thread (101255) waiting on exactly
>>> for? I'm not sure the way to track down what is using vm object
>>> 0xfffff806cb637c60. If the tail thread busied the page then they are
>>> waiting on each other I guess. If that's true then r358443 removing the
>>> write lock on the object in update_pages() could be a problem.
>>>
>>>
>>> Not sure the rest is interesting. I think they are just waiting on the
>>> locked vnode but I give it here in case I missed something.
>
>

-- 
Andriy Gapon
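
The cycle suspected in the quoted analysis (the writer holds the range lock and waits for the busy page, while the faulting thread holds the busy page and waits for the range lock) can be illustrated in user space. What follows is only a minimal sketch of that lock-order inversion, not kernel code: `simulate_inversion`, `range_lock`, `page_busy` and the two roles are hypothetical stand-ins, and plain mutexes replace the real ZFS range lock and VM page busy state. Non-blocking acquires are used so the sketch reports the cycle instead of hanging.

```python
import threading


def simulate_inversion():
    """Report, per role, whether the second acquisition would have blocked."""
    range_lock = threading.Lock()  # stand-in for the ZFS range lock
    page_busy = threading.Lock()   # stand-in for the page's busy state
    holding_first = threading.Barrier(2)
    tried_second = threading.Barrier(2)
    blocked = {}

    def run(role, first, second):
        with first:
            # Wait until both threads hold their first lock.
            holding_first.wait()
            # A blocking acquire here would sleep forever; try instead.
            got = second.acquire(blocking=False)
            blocked[role] = not got
            # Keep the first lock held until both threads have tried.
            tried_second.wait()
            if got:
                second.release()

    # "writer" mimics the write path: range lock first, then busy the page.
    writer = threading.Thread(
        target=run, args=("writer", range_lock, page_busy))
    # "faulter" mimics the page-fault path: page busied first, then range lock.
    faulter = threading.Thread(
        target=run, args=("faulter", page_busy, range_lock))
    writer.start()
    faulter.start()
    writer.join()
    faulter.join()
    return blocked


if __name__ == "__main__":
    print(simulate_inversion())
```

Both roles are reported as blocked, because each thread's second lock is the other thread's first. Whether the real page was in fact busied by the faulting thread is the open question in the thread above; the sketch only shows why that ordering would deadlock.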