Date: Sat, 2 Jan 2021 15:08:56 -0700
From: Alan Somers <asomers@freebsd.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: Matthias Apitz <guru@unixarea.de>, FreeBSD CURRENT <freebsd-current@freebsd.org>, Konstantin Belousov <kib@freebsd.org>, Kirk McKusick <mckusick@mckusick.com>
Subject: Re: cp(1) of large files is causing 100% CPU utilization and poor transfer
Message-ID: <CAOtMX2jrT1urLf12ugk-FTpr7aZ0Q_a%2BuT2gd4g1D1M8uVLV4w@mail.gmail.com>
In-Reply-To: <YQXPR0101MB096840F865CBD05DFC931A8BDDD40@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
References: <X/CKQFbpbWDdLXvw@c720-r368166.fritz.box>
 <CAOtMX2gd6vaBF=6Z6stefGRN8A7S4Gtf4drO-YgAbd=KXPwKNg@mail.gmail.com>
 <X/CY/kuKUJHUVEbB@c720-r368166.fritz.box>
 <CAOtMX2hFupzf-MD84eo_-n9OzfYX6b6tWRsHPECZGKaq5QCUVw@mail.gmail.com>
 <X/CbUu4tVQG81ItJ@c720-r368166.fritz.box>
 <CAOtMX2iRS6XVLkABSMQdcDCUHRXhEHEjyzuOqkMHudK=he33GA@mail.gmail.com>
 <YQXPR0101MB0968022D574AA673BB80DEC6DDD40@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
 <X/C7RQmjHISfRMA/@c720-r368166.fritz.box>
 <CAOtMX2hVZHFLOJ5fOJJqT18uc_g3wcFnM1h3vp_e_O-_h8PzBQ@mail.gmail.com>
 <X/DSHeynpeIxjs3w@c720-r368166.fritz.box>
 <YQXPR0101MB096840F865CBD05DFC931A8BDDD40@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM>
LGTM! This patch also fixes another problem: the previous version of cp, when
copying a large sparse file on UFS, would create some UFS indirect blocks
(because it kept truncating the file to larger sizes). The output file
would still be sparse, but it would take up more space than the original.
IIRC about 0.2% of the empty space would get used by UFS indirect blocks.
Your patch fixes that.

What I said earlier about needing to modify vn_generic_copy_file_range wasn't
quite correct. I confused len with xfer when I was reading the code. The
change I proposed to vn_generic_copy_file_range would only make a difference
if the process receives many interrupts.

And here's some background for other people reading the thread: the initial
copy_file_range implementation in cp only used a 2 MB block size because
vn_generic_copy_file_range wasn't always interruptible, and we didn't want
cp to block for minutes or even hours during a long transfer. Subsequently
rmacklem made vn_generic_copy_file_range interruptible, but we never raised
the block size in cp.

-Alan

On Sat, Jan 2, 2021 at 2:42 PM Rick Macklem <rmacklem@uoguelph.ca> wrote:

> The attached small patch seems to fix the problem.
> My hunch is that, for a large non-sparse file, SEEK_DATA/SEEK_HOLE
> takes a fairly long time.
> These are done for each copy_file_range(2) syscall.
>
> cp was doing lots of them because of the small len argument.
> Bumping the len up to SSIZE_MAX results in far fewer syscalls
> and, therefore, SEEK_DATAs and SEEK_HOLEs.
>
> Without the patch, cp took 6 times as long as dd.
> With the patch, cp takes less time than dd.
>
> I'll put the patch on the bug report. Matthias, can you test
> the patch?
>
> Thanks for reporting this, rick
> ps: All my test programs use SSIZE_MAX unless they were
>     not supposed to copy to eof, which explains why I
>     missed this.
>     My bad, for the testing. ;-)
>
> ________________________________________
> From: owner-freebsd-current@freebsd.org <owner-freebsd-current@freebsd.org>
> on behalf of Matthias Apitz <guru@unixarea.de>
> Sent: Saturday, January 2, 2021 3:05 PM
> To: Alan Somers
> Cc: Rick Macklem; FreeBSD CURRENT; Konstantin Belousov; Kirk McKusick
> Subject: Re: cp(1) of large files is causing 100% CPU utilization and poor
> transfer
>
> On Saturday, January 02, 2021 at 11:29:36 a.m. -0700, Alan Somers wrote:
>
> > > On Saturday, January 02, 2021 at 05:06:05 p.m. +0000, Rick Macklem wrote:
> > >
> > > > Just fyi, I've reproduced the problem.
> > > > All I did was create a 20Gbyte file
> > > > on UFS on a slow (4Gbyte of RAM,
> > > > slow spinning disk) laptop.
> > > > (The UFS file system is just what the installer creates these days.)
> > > >
> > > > cp still hasn't finished and is definitely
> > > > taking a looott longer than dd did.
> > > >
> > > > I'll start drilling down later to-day.
> > > >
> > > > I'll admit doing lots of testing of copy_file_range(2)
> > > > with large sparse files, but I may have missed testing
> > > > a large non-sparse file.
> > > >
> > > > rick
> > > > ps: I've added Kostik and Kirk to the cc.
> > >
> > > As the problem seems to be clear now, should I still file a PR?
> > > I'm happy to do so.
> >
> > Yes please. That will help ensure that we don't lose track of it.
>
> Here we go: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252358
>
> Thanks
>
>         matthias
>
> --
> Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
> Public GnuPG key: http://www.unixarea.de/key.pub
> ¡Con Cuba no te metas! «» Don't mess with Cuba! «» Leg Dich nicht mit Kuba an!
> http://www.cubadebate.cu/noticias/2020/12/25/en-video-con-cuba-no-te-metas/
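For anyone who wants to see the effect of the len argument discussed above, here is a rough C sketch. It is not the patch attached to the bug report, just a minimal illustration: the helper name copy_all and its ncalls counter are made up for this example, and it assumes a platform that provides copy_file_range(2) (FreeBSD 13 or later, or Linux with a reasonably recent kernel and glibc). It copies from one descriptor to another until EOF and counts how many syscalls that took for a given len.

```c
#define _GNU_SOURCE	/* needed for copy_file_range(2) on Linux/glibc;
			   FreeBSD declares it in <unistd.h> without it */
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Copy from infd to outfd, starting at the current file offsets, asking
 * the kernel for at most `chunk` bytes per copy_file_range(2) call.
 * Returns the total number of bytes copied, or -1 on error (errno set).
 * *ncalls receives the number of copy_file_range(2) calls issued.
 */
static off_t
copy_all(int infd, int outfd, size_t chunk, unsigned long *ncalls)
{
	off_t total = 0;
	ssize_t n;

	assert(chunk > 0 && chunk <= (size_t)SSIZE_MAX);
	*ncalls = 0;
	for (;;) {
		/* NULL offset pointers mean "use and update the fd offsets". */
		n = copy_file_range(infd, NULL, outfd, NULL, chunk, 0);
		(*ncalls)++;
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, just retry */
			return (-1);
		}
		if (n == 0)
			break;			/* reached end of input */
		total += n;			/* short copies are fine */
	}
	return (total);
}
```

With chunk set to 2 MB, a 20 GB file needs at least about 10,000 calls, and per Rick's hunch each one repeats the SEEK_DATA/SEEK_HOLE work. With chunk set to SSIZE_MAX the kernel decides how much to copy per call, so the count depends on the kernel rather than on cp.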