From: Christos Chatzaras
Subject: Re: Filesystem operations slower in 13.0 than 12.2
Date: Sat, 6 Mar 2021 08:37:24 +0200
To: FreeBSD-STABLE Mailing List

Hello Konstantin,

> On 6 Mar 2021, at 01:12, Konstantin
> Belousov wrote:
>
> There were (are) bugs in FreeBSD UFS SU < 13
> - some LoR existed in SU code, where it needed to lock a containing
>   directory to provide posix guarantees for fsync(), while owning the
>   vnode lock. I do not believe it is observable in real-world use

If you are talking about these changes:

https://svnweb.freebsd.org/base?view=revision&revision=367672

then I could only trigger "processes hanging in ufs state" while doing
Prestashop translations: after clicking on "Save", Prestashop removes and
recreates its cache in the /var/cache/prod directory. I have used FreeBSD
since 6.x and this was the first time I could trigger it (maybe it's
related to the specific Prestashop version too).

> - in some situations UFS SU in < 13 did not perform the necessary
>   fsync() of the directory, related to the previous item
> The end result was that after a successful fsync() followed by a system
> failure, e.g. power or panic, the parent directory for the synced
> vnode would not be synced and the vnode's dirent is not written to the
> permanent store. This violates the posix requirement that after fsync,
> the data can be read, since you plain cannot open the file.
>
> During the development of the patch to fix both the LoR and the related
> omission of fsync, a mistake was made resulting in much more aggressive
> syncing of directories. It was not exactly that, but approximately, on
> most of the metadata operations that created or removed a directory
> entry, the directory was fully synced. This resulted in a significant
> slowdown, which was eliminated around BETA4..RC1. I.e. most of the fixes
> came in BETA4, but minor parts were only discovered later and were ready
> for RC1.

I ask these questions to better understand how a FreeBSD developer works
(and more specifically when a bug is not reported).

1) How did you discover this LoR / fsync omission bug? Did someone else
find and report it (I couldn't find a PR for this)? Was it discovered by
a test suite? Did you find it while doing other work in this part of the
code?

2) When I reported the slowdown with BETA2 a few weeks ago, you replied
that this was a known bug and that it would be fixed in BETA3 or BETA4.
After the initial patches that made the syncing of directories more
aggressive, how did you discover the slowdown?

> There are still more fsync(dir) in 13RC1 than there are in any 12, by
> the nature of the bug and its fix, but the current belief is that all
> fsync calls left in the flow are required for correctness.

Thank you for explaining these changes.