Date: Fri, 5 Jul 2019 19:30:54 +0200
From: Jilles Tjoelker <jilles@stack.nl>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: "freebsd-current@FreeBSD.org" <freebsd-current@FreeBSD.org>, "kib@freebsd.org" <kib@FreeBSD.org>, Alan Somers <asomers@freebsd.org>
Subject: Re: should a copy_file_range(2) syscall be interrupted via a signal
Message-ID: <20190705173054.GA30404@stack.nl>
In-Reply-To: <YTXPR01MB0285E79DFAAE250FD7A7A181DDF50@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM>
References: <YTXPR01MB0285E79DFAAE250FD7A7A181DDF50@YTXPR01MB0285.CANPRD01.PROD.OUTLOOK.COM>

On Fri, Jul 05, 2019 at 12:28:51AM +0000, Rick Macklem wrote:
> I have been working on a Linux compatible copy_file_range(2) syscall
> (the current code can be found at https://reviews.freebsd.org/D20584).
> One outstanding issue is how it should deal with signals. Right now, I
> have vn_start_write() without PCATCH, so that it won't be interrupted
> by a signal, but I notice that vn_write() {ie. write syscall } does
> have PCATCH on vn_start_write() and so does vn_rdwr() when it is
> called without IO_NODELOCKED.

A regular write() is only interruptible when writing to a terminal,
pseudo-terminal master, pipe, socket, or, under certain conditions, a
file on an NFS intr mount. Therefore, applications may not have the code
to resume interrupted writes to regular files gracefully.

> I am thinking that copy_file_range(2) should do this also.
> However, if it returns an error, it is impossible for the caller to
> know how much of the data range got copied.

A regular write() returns partial success if interrupted by a signal
when it has already written something. Therefore, the application can
resume the operation by adjusting pointers and counts. Something similar
applies to "deterministic" errors like [EFBIG] where the first call will
write as far as possible (if this is not nothing) successfully and the
next attempt will return the error.

> What do you think the copy_file_range(2) code should do?

I'm not sure it should actually be done, but the need for adjusting
pointers and counts could be avoided with a little extra kernel and libc
code. The system call would receive an additional argument pointing to
an off_t that indicates how many bytes previous calls have already
written. A libc wrapper would initialize this to 0. With this, the
system call can be restarted automatically after a signal.

In any case, [EINTR] and the internal ERESTART must not be returned
unless it is safe to repeat the call with the same (direct) arguments.

-- 
Jilles Tjoelker
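
To illustrate the "adjusting pointers and counts" pattern mentioned above
for a regular write(), here is a minimal userland sketch. The helper name
write_all() is hypothetical and not part of any libc. It assumes that
[EINTR] is only returned when nothing was written, so repeating the call
with the same arguments is safe.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all() is a hypothetical helper for illustration only.
 * It repeats write(2) until all len bytes are written or a real
 * error occurs. Returns 0 on success, -1 on error with errno set
 * by write(2).
 */
static int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			/* Interrupted before anything was written: retry as-is. */
			if (errno == EINTR)
				continue;
			/* A deterministic error such as EFBIG shows up here,
			 * on the attempt after the last partial success. */
			return (-1);
		}
		/* Partial success: adjust pointer and count, then resume. */
		p += n;
		len -= (size_t)n;
	}
	return (0);
}
```

A caller would use it as write_all(fd, buf, buflen) and check for -1. On
failure this helper does not report how much was written, which is the
kind of information the additional off_t argument suggested above would
carry for copy_file_range(2).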