From: Andreas Longwitz
Date: Fri, 08 Sep 2017 15:42:13 +0200
To: freebsd-fs@freebsd.org
Subject: fsync: giving up on dirty on ufs partitions running vfs_write_suspend()
Message-ID: <59B29E35.9000506@incore.de>

I will try to describe the cause of the "fsync: giving up on dirty" problem described in
https://lists.freebsd.org/pipermail/freebsd-fs/2012-February/013804.html
or
https://lists.freebsd.org/pipermail/freebsd-fs/2013-August/018163.html

I now run FreeBSD 10.3-STABLE r317936 and sometimes see messages like

dssbkp4 kernel: fsync: giving up on dirty
dssbkp4 kernel: 0xfffff80040d6c938: tag devfs, type VCHR
dssbkp4 kernel: usecount 1, writecount 0, refcount 47 mountedhere 0xfffff8004083a200
dssbkp4 kernel: flags (VI_ACTIVE)
dssbkp4 kernel: v_object 0xfffff800409b3500 ref 0 pages 1138 cleanbuf 42 dirtybuf 4
dssbkp4 kernel: lock type devfs: EXCL by thread 0xfffff800403a8a00 (pid 26, g_journal switcher, tid 100181)
dssbkp4 kernel: dev mirror/gmbkp4p5.journal
dssbkp4 kernel: GEOM_JOURNAL: Cannot suspend file system /home (error=35).

on all of my servers running gjournal. Similar messages can be seen when a
snapshot is taken (e.g. dump -L) on an arbitrary UFS partition. In all these
cases the function vfs_write_suspend() was called and returned EAGAIN (the
error=35 in the log above). This error code is set in vop_stdfsync() when the
above messages are printed.

First, I was confused about the "mountedhere" address, because the given
address does not point to a "struct mount" but (as type VCHR indicates) to a
"struct cdev". Therefore I suggest the following patch to improve the output
of vn_printf(), using the text strings from the defines in /sys/sys/vnode.h:

--- vfs_subr.c.orig 2017-05-08 14:17:38.000000000 +0200
+++ vfs_subr.c 2017-08-30 10:45:47.549740000 +0200
@@ -3003,6 +3003,8 @@
 static char *typename[] =
 {"VNON", "VREG", "VDIR", "VBLK", "VCHR", "VLNK", "VSOCK", "VFIFO", "VBAD",
  "VMARKER"};
+static char *typetext[] =
+{"", "", "mountedhere", "", "rdev", "", "socket", "fifoinfo", "", ""};
 
 void
 vn_printf(struct vnode *vp, const char *fmt, ...)
@@ -3016,8 +3018,9 @@
 	va_end(ap);
 	printf("%p: ", (void *)vp);
 	printf("tag %s, type %s\n", vp->v_tag, typename[vp->v_type]);
-	printf(" usecount %d, writecount %d, refcount %d mountedhere %p\n",
-	    vp->v_usecount, vp->v_writecount, vp->v_holdcnt, vp->v_mountedhere);
+	printf(" usecount %d, writecount %d, refcount %d %s %p\n",
+	    vp->v_usecount, vp->v_writecount, vp->v_holdcnt, typetext[vp->v_type],
+	    vp->v_mountedhere);
 	buf[0] = '\0';
 	buf[1] = '\0';
 	if (vp->v_vflag & VV_ROOT)

Second, I found that the "dirty" situation during vfs_write_suspend() only
occurs when a big file (more than 10G on a partition of 116G) is removed. If
vfs_write_suspend() is called immediately after "rm bigfile", then
vop_stdfsync() makes up to 1000 tries (maxretry) while waiting for the
"rm bigfile" to complete. Because a lot of bitmap writes must be done, the
value 1000 is not sufficient on my servers. I have increased maxretry, and in
the worst case I saw 8650 tries needed to complete without "dirty". In this
case the time spent in vop_stdfsync() was about 0.5 seconds.

The following patch solves the "dirty" problem for me:

--- vfs_default.c.orig 2016-10-24 12:26:57.000000000 +0200
+++ vfs_default.c 2017-09-08 12:49:18.059970000 +0200
@@ -644,7 +644,7 @@
 	struct bufobj *bo;
 	struct buf *nbp;
 	int error = 0;
-	int maxretry = 1000; /* large, arbitrarily chosen */
+	int maxretry = 100000; /* large, arbitrarily chosen */
 
 	bo = &vp->v_bufobj;
 	BO_LOCK(bo);

---
Andreas Longwitz