From: Rick Macklem
To: Alan Somers
CC: FreeBSD Hackers, Konstantin Belousov
Subject: Re: RFC: copy_file_range(3)
Date: Wed, 23 Sep 2020 01:24:24 +0000
List-Id: Technical Discussions relating to FreeBSD

Oh, and I
set
vfs.nfs.maxcopyrange=134217728
on the server.

The current default is only 10Mbytes, but I think 128Mbytes
is a more reasonable setting.

rick
ps: The server and client are only somewhat old Dell Latitude 6420
laptops, so the tests were not done on server grade hardware.


________________________________________
From: owner-freebsd-hackers@freebsd.org on behalf of Rick Macklem
Sent: Tuesday, September 22, 2020 9:18 PM
To: Alan Somers
Cc: FreeBSD Hackers; Konstantin Belousov
Subject: Re: RFC: copy_file_range(3)

Alan Somers wrote:
[lots of stuff snipped]
> 1) In order to quickly respond to a signal, a program must use a modest len with
> copy_file_range
For the programs you have mentioned, I think the only signal handling would
be termination (Ctrl-C or SIGTERM if you prefer).
I'm not sure what a reasonable response time for this is.
I'd like to hear comments from others:
- 1sec, less than 1sec, a few seconds, ...

> 2) If a hole is larger than len, that will cause vn_generic_copy_file_range to
> truncate the output file to the middle of the hole. Then, in the next invocation,
> truncate it again to a larger size.
> 3) The result is a file that is not as sparse as the original.
Yes. So, the trick is to use the largest "len" you can live with, given how long you
are willing to wait for signal processing.

> For example, on UFS:
> $ truncate -s 1g sparsefile
Not a very interesting sparse file. I wrote a little program to create one.
> $ cp sparsefile sparsefile2
> $ du -sh sparsefile*
> 96K sparsefile
> 32M sparsefile2
>
> My idea for a userland wrapper would solve this problem by using
> SEEK_HOLE/SEEK_DATA to copy holes in their entirety, and use copy_file_range for
> everything else with a modest len.
> Alternatively, we could eliminate the need for
> the wrapper by enabling copy_file_range for every file system, and making
> vn_generic_copy_file_range interruptible, so copy_file_range can be called with
> large len without penalizing signal handling performance.

Well, I ran some quick benchmarks using the attached programs, plus "cp" both
before and with your copy_file_range() patch.
copya - Does what I think your plan is above, with a limit of 2Mbytes for "len".
copyb - Just uses copy_file_range() with 128Mbytes for "len".

I first created the sparse file with createsparse.c. It is admittedly a worst case,
creating alternating holes and data blocks of the minimum size supported by
the file system. (I ran it on a UFS file system created with defaults, so the minimum
hole size is 32Kbytes.)
The file is 1Gbyte in size with an Allocation size of 524576 ("ls -ls").

I then ran copya, copyb, old-cp and new-cp. For NFS, I redid the mount before
each copy to avoid data caching in the client.
Here's what I got:

          Elapsed time   #RPCs                    Allocation size ("ls -ls" on server)
NFSv4.2
copya     39.7sec        16384copy+32768seek      524576
copyb     10.2sec        104copy                  524576
old-cp    21.9sec        16384read+16384write     1048864
new-cp    10.5sec        1024copy                 524576

NFSv4.1
copya     21.8sec        16384read+16384write     1048864
copyb     21.0sec        16384read+16384write     1048864
old-cp    21.8sec        16384read+16384write     1048864
new-cp    21.4sec        16384read+16384write     1048864

Local on the UFS file system
copya     9.2sec         n/a                      524576
copyb     8.0sec         n/a                      524576
old-cp    15.9sec        n/a                      1048864
new-cp    7.9sec         n/a                      524576

So, for an NFSv4.2 mount, using SEEK_DATA/SEEK_HOLE is definitely
a performance hit, due to all the RPC RTTs.
Your patched "cp" does fine, although a larger "len" reduces the
RPC count against the server.
All variants using copy_file_range() retain the
holes.

For NFSv4.1, it (not surprisingly) doesn't matter, since only NFSv4.2
supports SEEK_DATA/SEEK_HOLE and VOP_COPY_FILE_RANGE().

For UFS, everything using copy_file_range() works pretty well and
retains the holes.
Although "copya" is guaranteed to retain the holes, it does run noticeably
slower than the others. Not sure why. Do the extra SEEK_DATA/SEEK_HOLE
syscalls cost that much?

The limitation of not using SEEK_DATA/SEEK_HOLE is that you will not
retain holes that straddle the byte ranges copied by two consecutive
copy_file_range(2) calls.
--> This can be minimized by using a large "len", but that large "len"
    results in slower response to signal handling.

I've attached the little programs, so you can play with them.
(Maybe try different sparse schemes/sizes? It might be fun to
make the holes/blocks some random multiple of hole size up
to a limit?)

rick
ps: In case he isn't reading hackers these days, I've added kib@
as a cc. He might know why UFS is 15% slower when SEEK_HOLE/
SEEK_DATA is used.


-Alan
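
For anyone reading this without the attachments, the "copya" approach
described above (SEEK_DATA/SEEK_HOLE to skip the holes, copy_file_range()
with a modest "len" for the data in between) might look roughly like the
sketch below. This is not the attached copya.c. The function name
copy_sparse() and the CHUNK_LEN constant are made up for illustration, and
the _GNU_SOURCE define is only there on the assumption that someone may
also want to build it on Linux.

```c
#define _GNU_SOURCE     /* assumption: exposes copy_file_range/SEEK_DATA on Linux */
#include <sys/types.h>
#include <sys/stat.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical "modest len": 2Mbytes, the limit mentioned for copya. */
#define CHUNK_LEN ((off_t)2 * 1024 * 1024)

/*
 * Copy the data regions of infd to the same offsets in outfd, leaving
 * the holes unwritten.  Returns 0 on success, -1 with errno set.
 */
static int
copy_sparse(int infd, int outfd)
{
    struct stat sb;
    off_t data, hole, inoff, outoff, want;
    ssize_t n;

    assert(infd >= 0 && outfd >= 0);
    if (fstat(infd, &sb) == -1)
        return (-1);
    hole = 0;
    while (hole < sb.st_size) {
        /* Start of the next data region at or after "hole". */
        data = lseek(infd, hole, SEEK_DATA);
        if (data == -1) {
            if (errno == ENXIO)
                break;          /* nothing but a trailing hole is left */
            return (-1);
        }
        /* End of that data region (EOF counts as a hole). */
        hole = lseek(infd, data, SEEK_HOLE);
        if (hole == -1)
            return (-1);
        inoff = outoff = data;
        while (inoff < hole) {
            want = hole - inoff;
            if (want > CHUNK_LEN)
                want = CHUNK_LEN;
            /* Explicit offsets are advanced by the number of bytes copied. */
            n = copy_file_range(infd, &inoff, outfd, &outoff,
                (size_t)want, 0);
            if (n == -1 && errno == EINTR)
                continue;       /* a real program would check a signal flag */
            if (n == -1)
                return (-1);
            if (n == 0)
                goto done;      /* input shrank underneath us */
        }
    }
done:
    /* Set the final size so that a trailing hole is kept as a hole. */
    return (ftruncate(outfd, sb.st_size));
}
```

Each lseek() in the outer loop becomes an RPC on an NFSv4.2 mount, which
is where the extra "seek" RPCs in the copya row of the table come from.
Raising CHUNK_LEN trades fewer copy_file_range() calls for slower response
to a signal.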