Date: Tue, 5 Sep 2017 19:11:48 +0100
Message-Id: <201709051811.v85IBmbO005440@higson.cam.lispworks.com>
From: Martin Simmons
To: freebsd-fs@freebsd.org
In-reply-to: <87k21dzdrp.fsf@thinkpad.rath.org> (message from Nikolaus Rath on Tue, 05 Sep 2017 11:38:18 +0200)
Subject: Re: umount() taking minutes for FUSE filesystems

>>>>> On Tue, 05 Sep 2017 11:38:18 +0200, Nikolaus Rath said:
>
> On Sep 05 2017, Stefan Esser wrote:
> > On 04.09.17 at 23:14, Ben RUBSON wrote:
> >> I managed to reproduce the issue.
> >> unmount takes exactly 60 seconds, as if a timeout was running.
> >>
> >> # procstat -kk $!
> >> COMM TDNAME KSTACK
> >> printcap - mi_switch+0xd2 sleepq_catch_signals+0xb7
> >> sleepq_timedwait_sig+0x10 _sleep+0x26f fdisp_wait_answ+0x171
> >> fuse_vfsop_unmount+0xf5 dounmount+0x9b6 sys_unmount+0x41b
> >> amd64_syscall+0x4ce Xfast_syscall+0xfb
> >>
> >> # uname -sr
> >> FreeBSD 11.0-RELEASE-p9
> >
> > I have given the exact position of this 60 second msleep() in multiple
> > mails before. It is in fuse_ipc.c, the particular msleep with "fu_ans"
> > (line 333 in -CURRENT).
> >
> > I did not try to diagnose, why this particular umount() takes so long,
> > while others are fast, but it is obvious that the kernel module does
> > wait for a signal at the end of some IPC and the signal is either lost
> > or never sent. There is a check for a dead connection, just before the
> > msleep() and the connection is considered alive at that point (and
> > should be, to support the umount() result being reported).
> >
> > I did not have time to look into this during the previous week and
> > won't during this week, but it should not be too hard to see, what's
> > going on. A starting point could be to compare this test with those
> > that perform the unmount without delay.
>
> Probably the crucial difference is that the test that takes long exits
> its main loop on its own and then informs the FUSE kernel module about
> that, while the other tests terminate the main loop because the kernel
> module tells them to do so.

What exactly does the test do to "inform the FUSE kernel module about that"?

__Martin
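
For anyone following the thread, the symptom in the stack trace is what a timed sleep looks like when the wakeup never arrives: the waiter returns only once the timeout expires. The following is a minimal userland sketch of that pattern in Python, not the kernel code from fuse_ipc.c. The names AnswerTicket, post_answer, wait_answer and unmount are hypothetical, and the timeout is shortened from 60 seconds to half a second so the sketch runs quickly.

```python
import threading
import time


class AnswerTicket:
    """Hypothetical stand-in for a pending request awaiting a reply.

    This is an illustration only; it does not mirror the real kernel
    data structures.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._answered = False

    def post_answer(self):
        """Mark the request as answered and wake the sleeping waiter."""
        with self._cond:
            self._answered = True
            self._cond.notify_all()

    def wait_answer(self, timeout):
        """Sleep until answered or until `timeout` seconds have passed.

        Returns (answered, elapsed_seconds).
        """
        start = time.monotonic()
        with self._cond:
            answered = self._cond.wait_for(
                lambda: self._answered, timeout=timeout
            )
        return answered, time.monotonic() - start


def unmount(daemon_replies, timeout=0.5, reply_delay=0.05):
    """Simulate an unmount that waits for the daemon's final answer.

    If `daemon_replies` is False, nobody ever posts the answer, so the
    wait ends only when the timeout expires.
    """
    ticket = AnswerTicket()
    if daemon_replies:
        # Assumed behaviour: the daemon answers shortly after the request.
        threading.Timer(reply_delay, ticket.post_answer).start()
    return ticket.wait_answer(timeout)


if __name__ == "__main__":
    print("daemon replies:  answered=%s after %.2fs" % unmount(True))
    print("wakeup missing:  answered=%s after %.2fs" % unmount(False))
```

In the second call the waiter comes back unanswered after the full timeout, which would match an unmount that takes "exactly 60 seconds" if the real wakeup is lost or never sent. Whether that is actually what happens here still has to be confirmed against the kernel module.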