Date: Sun, 05 Dec 2021 13:38:52 -0600 From: Mike Karels <mike@karels.net> To: Jessica Clarke <jrtc27@freebsd.org> Cc: Konstantin Belousov <kostikbel@gmail.com>, src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Subject: Re: git: a4e4132fa3bf - main - swapoff(2): replace special device name argument with a structure Message-ID: <9FA0F081-7F17-479B-96D6-F0754A19029B@karels.net> In-Reply-To: <6AA150D7-483E-4F11-B35A-23D6F28ECABB@freebsd.org> References: <202112042221.1B4ML7Ov002151@gitrepo.freebsd.org> <EE06FFF1-7587-4F6E-8649-63454155F2C8@freebsd.org> <Yay8/x8lTm59vTlo@kib.kiev.ua> <CCBD810D-80DB-43ED-9957-4F9A9CB950E5@freebsd.org> <Ya0KTUavns2cN/0z@kib.kiev.ua> <6AA150D7-483E-4F11-B35A-23D6F28ECABB@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On 5 Dec 2021, at 12:56, Jessica Clarke wrote: > On 5 Dec 2021, at 18:51, Konstantin Belousov <kostikbel@gmail.com> wrot= e: >> On Sun, Dec 05, 2021 at 05:14:54PM +0000, Jessica Clarke wrote: >>> On 5 Dec 2021, at 13:22, Konstantin Belousov <kostikbel@gmail.com> wr= ote: >>>> >>>> On Sun, Dec 05, 2021 at 03:03:26AM +0000, Jessica Clarke wrote: >>>>> On 4 Dec 2021, at 22:21, Konstantin Belousov <kib@FreeBSD.org> wrot= e: >>>>>> >>>>>> The branch main has been updated by kib: >>>>>> >>>>>> URL: https://cgit.FreeBSD.org/src/commit/?id=3Da4e4132fa3bfadb6047= fc0fa5f399f4640460300 >>>>>> >>>>>> commit a4e4132fa3bfadb6047fc0fa5f399f4640460300 >>>>>> Author: Konstantin Belousov <kib@FreeBSD.org> >>>>>> AuthorDate: 2021-11-29 16:26:31 +0000 >>>>>> Commit: Konstantin Belousov <kib@FreeBSD.org> >>>>>> CommitDate: 2021-12-04 22:20:58 +0000 >>>>>> >>>>>> swapoff(2): replace special device name argument with a structure= >>>>>> >>>>>> For compatibility, add a placeholder pointer to the start of the >>>>>> added struct swapoff_new_args, and use it to distinguish old vs. = new >>>>>> style of syscall invocation. >>>>>> >>>>>> Reviewed by: markj >>>>>> Discussed with: alc >>>>>> Sponsored by: The FreeBSD Foundation >>>>>> MFC after: 1 week >>>>>> Differential revision: https://reviews.freebsd.org/D33165 >>>>>> --- >>>>>> sys/vm/swap_pager.c | 27 +++++++++++++++++++++++++-- >>>>>> sys/vm/swap_pager.h | 8 ++++++++ >>>>>> 2 files changed, 33 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c >>>>>> index 165373d1b527..dc1df79f4fcd 100644 >>>>>> --- a/sys/vm/swap_pager.c >>>>>> +++ b/sys/vm/swap_pager.c >>>>>> @@ -2491,15 +2491,38 @@ sys_swapoff(struct thread *td, struct swap= off_args *uap) >>>>>> struct vnode *vp; >>>>>> struct nameidata nd; >>>>>> struct swdevt *sp; >>>>>> - int error; >>>>>> + struct swapoff_new_args sa; >>>>>> + int error, probe_byte; >>>>>> >>>>>> error =3D priv_check(td, PRIV_SWAPOFF); >>>>>> if (error) >>>>>> return (error); >>>>>> >>>>>> + /* >>>>>> + * Detect old vs. new-style swapoff(2) syscall. The first >>>>>> + * pointer in the memory pointed to by uap->name is NULL for >>>>>> + * the new variant. >>>>>> + */ >>>>>> + probe_byte =3D fubyte(uap->name); >>>>>> + switch (probe_byte) { >>>>>> + case -1: >>>>>> + return (EFAULT); >>>>>> + case 0: >>>>>> + error =3D copyin(uap->name, &sa, sizeof(sa)); >>>>>> + if (error !=3D 0) >>>>>> + return (error); >>>>>> + if (sa.flags !=3D 0) >>>>>> + return (EINVAL); >>>>>> + break; >>>>>> + default: >>>>>> + bzero(&sa, sizeof(sa)); >>>>>> + sa.name =3D uap->name; >>>>>> + break; >>>>>> + } >>>>> >>>>> Doesn=E2=80=99t this change the semantics of swapoff("")? >>>>> >>>>> Previously it would fail deterministically, presumably with ENOENT = or >>>>> something, but now it reinterprets whatever follows that string in >>>>> memory as the new argument structure. It probably doesn=E2=80=99t m= atter, but >>>>> this approach is ugly. Can we not just define a new syscall rather = than >>>>> this kind of bodge? >>>> >>>> Having two swapoff() syscalls is worse, and having them only differ = in >>>> semantic by single flag is kind of crime. >>>> >>>> I do not see swapoff("") as problematic, we are changing a minor sem= antic of >>>> the management syscall. I only wanted to avoid flag day for swapoff= binaries. >>>> >>>> BTW, I considered requiring proper alignment for uap->name, and then= checking >>>> the whole uap->name_old_syscall for NULL, but then decided that this= is >>>> overkill. If you think that swapoff("") that important, I can add t= hat >>>> additional verification. >>> >>> Why=E2=80=99s it worse? It=E2=80=99s just a syscall number, you depre= cate the old one >>> and move on, we do that for things relatively regularly. This is real= ly >>> not a good solution; harder to use as a caller since the prototype is= >>> wrong, impossible to ensure you preserve the semantics for the existi= ng >>> interface in all cases, and ugly to implement. You don=E2=80=99t need= a flag >>> day for a new syscall, either, you can continue to only use the new >>> method for -f for a release and then switch over to the new syscall >>> entirely. Or switch over to the new syscall entirely now and fall bac= k >>> on the old syscall if -f isn=E2=80=99t passed. Defining a new syscall= also lets >>> you not need the name_old_syscall member in the struct, and gives you= a >>> clean, fully-extensible syscall to which future features can be added= >>> in a backwards-compatible way, rather than forever keeping around thi= s >>> legacy mess. >> >> I disagree, it is not just a syscall number, it is whole user/kernel >> interface that bloats, which means cognitive efforts from anybody usin= g >> this interfaces, and for which we must maintain ABI compatibility. > > Which is just as true of this approach; you have the same two > interfaces here, just smashed together into a single harder-to-use > syscall rather than two separate syscalls. Having a separate syscall at= > least allows the old one to return ENOSYS in the future, whereas if you= > ever want to deprecate the old interface with this method then you=E2=80= =99ll > need some other weird error response that=E2=80=99s harder to interpret= as > meaning =E2=80=9Cthat variant of this syscall doesn=E2=80=99t exist any= more=E2=80=9D. > >> New syscall allocation should be done only as a last resort, when exis= ting >> interfaces cannot be adopted for new functionality. > > Which this can=E2=80=99t without breaking the existing well-defined sem= antics, > as I=E2=80=99ve stated. > >> Good (or rather, bad) example of the uglyness that is backed by the at= titude >> that syscalls are free, is whole *at() mess, or specific stat*() mess = (old, >> other bsds, pre-ino64, ino64, at, then stat vs fstat, then Linux statx= which >> probably fixes the interface ultimately). > > It=E2=80=99s better than this approach. I have less resistance to adding new syscalls, and I agree with Jess that= it is the right thing to do in this case. Adding a syscall means the kernel= supports either old or new interface, so there is no flag day. And it is= easier to clean up; maintaining two syscalls should only be needed for a while on -current, and not in any release. Mike > Jess
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?9FA0F081-7F17-479B-96D6-F0754A19029B>