From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 203891] Consider supporting linux' sync_file_range()
Date: Tue, 20 Oct 2015 10:41:34 +0000
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203891

            Bug ID: 203891
           Summary: Consider supporting linux' sync_file_range()
           Product: Base System
           Version: 11.0-CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: andres@anarazel.de
                CC: imp@FreeBSD.org, koobs@FreeBSD.org

Hi,

PostgreSQL is about to use sync_file_range(SYNC_FILE_RANGE_WRITE) to control
writeback more explicitly. It'd be cool if more OSs than just Linux could
benefit.

Some background: Postgres regularly 'checkpoints' its in-memory data to disk,
to be able to remove older journalling/write-ahead log data. In a database with
a write-heavy workload that can imply a lot of writes. At the end of the
checkpoint Postgres then fsync()s all the files. This unfortunately often
causes latency spikes because
a) the fsyncs at the end might have to write back a lot of data, unnecessarily
   stalling other IO
b) before the fsync a lot of dirty data might accumulate kernel-side, which
   then can also trigger latency spikes. Often this also leads to irregular IO
   with periods of no IO.

What Postgres is going to do on Linux is to issue
sync_file_range(SYNC_FILE_RANGE_WRITE) every few (32 seems to work well) blocks
during the checkpoint. That makes it rather likely that there's little dirty
data remaining when the fsync()s at the end are executed, making them fast. It
also prevents large amounts of dirty buffers from accumulating.

We've considered some alternative approaches to this for other operating
systems. For one there's posix_fadvise(POSIX_FADV_DONTNEED), but that does more
than just write out dirty data. I've also tried
mmap();msync(MS_ASYNC);munmap();, but at least on Linux that doesn't do
anything. Using MS_SYNC flushes to disk on Linux, but it's synchronous, which
isn't what we want here.
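To make the request concrete, the periodic writeback hint described above can be sketched roughly as follows. This is only an illustration, not Postgres' actual code: the names hint_writeback() and checkpoint_write() and the 8192-byte BLOCK_SIZE are assumptions made up for this example. On systems where SYNC_FILE_RANGE_WRITE is not available the hint compiles to a no-op, which is the gap this report is about.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed block size, for illustration only */

/*
 * Hypothetical helper: ask the kernel to start writing back the dirty pages
 * in [offset, offset + nbytes) without waiting for completion.
 * Returns 0 on success, or when no such hint exists on this platform.
 */
int
hint_writeback(int fd, off_t offset, off_t nbytes)
{
#ifdef SYNC_FILE_RANGE_WRITE
    return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
#else
    (void) fd;
    (void) offset;
    (void) nbytes;
    return 0;                   /* no equivalent available: skip the hint */
#endif
}

/*
 * Hypothetical checkpoint loop: write nblocks copies of 'block', issue a
 * writeback hint every flush_every blocks, then fsync() at the end.
 * Returns 0 on success, -1 on a failed or short write or a failed fsync().
 */
int
checkpoint_write(int fd, const char *block, int nblocks, int flush_every)
{
    off_t offset = 0;           /* next byte to write */
    off_t pending_start = 0;    /* start of the range not yet hinted */

    assert(flush_every > 0);

    for (int i = 0; i < nblocks; i++) {
        if (pwrite(fd, block, BLOCK_SIZE, offset) != BLOCK_SIZE)
            return -1;
        offset += BLOCK_SIZE;

        if ((i + 1) % flush_every == 0) {
            /* The hint is advisory, so a failure here is ignored. */
            (void) hint_writeback(fd, pending_start, offset - pending_start);
            pending_start = offset;
        }
    }

    /* Durability still comes from fsync(); ideally little is left to write. */
    return fsync(fd);
}
```

Calling checkpoint_write(fd, block, 100, 32) would issue the hint after blocks 32, 64 and 96 and leave the last four blocks to the final fsync().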

I find the sync_file_range() API to be rather useful, so I think it'd make
sense to implement it. But barring that, could you possibly clarify somewhere
public whether msync(MS_ASYNC) does what we'd need it to do on FreeBSD, i.e.
initiate writeback without blocking?

Regards,

Andres Freund