From: Rick Macklem
To: Alan Somers
CC: "freebsd-fs@freebsd.org", Sean Fagan
Subject: Re: RFC: What should a copy_file_range(2) syscall do by default?
Date: Sun, 30 Jun 2019 00:45:58 +0000

Well, asomers@ prefers the current patch, despite its complexity.

sef@ expressed concerns w.r.t. the complexity and it being done in the kernel.
However he did not state a preference for any other specific variant, such as
the ones I suggested.

As such, I am sticking with the current patch unless I hear otherwise from
others.

Thanks for your comments, rick
ps: I felt a top post was reasonable here, as it summarized separate posts in
the thread.

________________________________________
From: Alan Somers
Sent: Saturday, June 22, 2019 1:28:03 PM
To: Rick Macklem
Cc: freebsd-fs@freebsd.org; Sean Fagan
Subject: Re: RFC: What should a copy_file_range(2) syscall do by default?

On Sat, Jun 22, 2019 at 10:02 AM Rick Macklem wrote:
>
> Hi,
>
> sef@ made this comment on phabricator. I don't believe phabricator is the correct
> place for "big picture" discussions, so I'm posting it here (I'm assuming sef@ doesn't
> mind, since the phabricator comments are public).
> sef@ wrote:
> >This much work in the kernel for what //should// be user-space makes me twitchy...
> >but there is lots of precedent for it, so I obviously have to get with the times.
> >
> > I've done a quick review of the code; it seems most of the complexity is in the
> > hole-detection. I'm also annoyed that linux used size_t for the amount to copy, when
> > off_t would have been more appropriate. But not much to do about that now.
> >
> > Having a default implementation means that user-space can't fall back if it's not
> > supported, and do it better (e.g., parallel I/O). Should we also have a pathconf for
> > the feature?
> >
> > WRT your question on -fs, I have no objections to this working cross-filesystem,
> > although I think I might ask to have a flag to fail in that case.
>
> Well, all I am interested in is a system call/VOP call so the NFSv4.2 client can do
> a file copy locally on the NFS server instead of doing Reads/Writes across the wire.
> The current code has gotten fairly complex, so I'll try and ask "how complex" this
> syscall/VOP call should be?
>
> The range of variants I can think of are:
> 0) - Don't do it at all.
> 1) - The syscall could just do a VOP_COPY_FILE_RANGE() and return whatever error
>      it returns.
>      --> This implies an error return for all file systems for now, with support for
>          NFSv4.2 mounts being added later (FreeBSD13 hopefully).

This option would require applications or the C library to fall back
to a copy loop. While doable, nothing in userland would be able to
range-lock the file, making the copy loop non-atomic. So the
in-kernel copy is superior.

> 2) - The syscall could fall back on a simple copy loop, but not try to deal with holes.
>      --> The Linux man page mentions using copy_file_range(2) in a loop with
>          lseek(SEEK_DATA)/lseek(SEEK_HOLE) for sparse files. This suggests that
>          the Linux fallback code doesn't try to handle holes.

Same problem as 1. Or if you do the copy loop in-kernel it would
waste CPU time and expand sparse files, which isn't good either.

> 3) - The current patch which tries to handle holes and copy the entire byte range
>      in one call.

Definitely the best option, despite its complexity. I would argue
that the complexity calls for a robust test suite, rather than
abandoning the feature.

-Alan
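
For reference, the kind of userland fallback that variants 1 and 2 would push
onto applications can be sketched roughly as below. This is only an
illustration of the lseek(SEEK_DATA)/lseek(SEEK_HOLE) loop mentioned above, not
the code in the patch under discussion. The names sparse_copy() and
copy_extent() are hypothetical, the sketch assumes the source filesystem
implements SEEK_DATA/SEEK_HOLE, and it assumes nobody else writes to the source
during the copy, since (as noted above) userland cannot range-lock the file.

```c
#define _GNU_SOURCE		/* Linux: exposes SEEK_DATA / SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy len bytes at offset off from infd to the same offset in outfd. */
static int
copy_extent(int infd, int outfd, off_t off, off_t len)
{
	char buf[65536];

	while (len > 0) {
		size_t want = len < (off_t)sizeof(buf) ? (size_t)len : sizeof(buf);
		ssize_t n = pread(infd, buf, want, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0)		/* source shrank underneath us */
			break;
		for (ssize_t done = 0; done < n; ) {
			ssize_t w = pwrite(outfd, buf + done, (size_t)(n - done),
			    off + done);

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return (-1);
			}
			done += w;
		}
		off += n;
		len -= n;
	}
	return (0);
}

/*
 * Hypothetical helper: copy src to dst, skipping holes so that a sparse
 * source stays sparse.  Not atomic with respect to concurrent writers.
 */
int
sparse_copy(const char *src, const char *dst)
{
	struct stat st;
	off_t pos = 0;
	int rv = -1;
	int infd, outfd;

	if ((infd = open(src, O_RDONLY)) < 0)
		return (-1);
	if ((outfd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
		close(infd);
		return (-1);
	}
	if (fstat(infd, &st) != 0)
		goto out;
	while (pos < st.st_size) {
		off_t data = lseek(infd, pos, SEEK_DATA);
		off_t hole;

		if (data < 0) {
			if (errno == ENXIO)	/* only a trailing hole is left */
				break;
			goto out;
		}
		if ((hole = lseek(infd, data, SEEK_HOLE)) < 0)
			goto out;
		assert(hole > data);	/* guarantees forward progress */
		if (copy_extent(infd, outfd, data, hole - data) != 0)
			goto out;
		pos = hole;
	}
	/* Set the final length so that a trailing hole is preserved. */
	if (ftruncate(outfd, st.st_size) == 0)
		rv = 0;
out:
	close(infd);
	if (close(outfd) != 0)
		rv = -1;
	return (rv);
}
```

Even this simplified version needs two seeks per data extent plus careful
handling of the trailing hole, which gives some feel for where the complexity
in the hole detection comes from.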