Date: Mon, 25 Aug 2014 20:27:55 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Mateusz Guzik <mjguzik@gmail.com>, freebsd-hackers@freebsd.org Subject: Re: atomic_load_acq_int in sequential_heuristic Message-ID: <20140825172755.GD2737@kib.kiev.ua> In-Reply-To: <20140825130433.GD14344@dft-labs.eu> References: <20140824115729.GC2045@dft-labs.eu> <20140824162331.GW2737@kib.kiev.ua> <20140824164236.GX2737@kib.kiev.ua> <20140825005659.GA14344@dft-labs.eu> <20140825073404.GZ2737@kib.kiev.ua> <20140825081526.GB14344@dft-labs.eu> <20140825083539.GB2737@kib.kiev.ua> <20140825091056.GC14344@dft-labs.eu> <20140825111000.GC2737@kib.kiev.ua> <20140825130433.GD14344@dft-labs.eu>
next in thread | previous in thread | raw e-mail | index | archive | help
--Y3cq2BYpkEb43po+ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Aug 25, 2014 at 03:04:33PM +0200, Mateusz Guzik wrote: > On Mon, Aug 25, 2014 at 02:10:01PM +0300, Konstantin Belousov wrote: > > On Mon, Aug 25, 2014 at 11:10:56AM +0200, Mateusz Guzik wrote: > > > On Mon, Aug 25, 2014 at 11:35:39AM +0300, Konstantin Belousov wrote: > > > > > + atomic_set_int(&fp->f_flag, FHASLOCK); > > > > You misspelled FRDAHEAD as FHASLOCK, below as well. > > > > Was this tested ? > > > >=20 > > >=20 > > > Oops, damn copy-pasto. Sorry. > > >=20 > > > > > + VOP_UNLOCK(vp, 0); > > > > > } else { > > > > > - do { > > > > > - new =3D old =3D fp->f_flag; > > > > > - new &=3D ~FRDAHEAD; > > > > > - } while (!atomic_cmpset_rel_int(&fp->f_flag, old, new)); > > > > > + atomic_clear_int(&fp->f_flag, FHASLOCK); > > > > So what about extending the vnode lock to cover the flag reset ? > > > >=20 > > >=20 > > > Sure. > > >=20 > > > So this time I tested it properly and found out it is impossible to > > > disable the hint. The test is: > > >=20 > > > -1 is truncated and then read from intptr_t which yields a big positi= ve > > > number instead. Other users in the function use int tmp to work around > > > this issue. > > Could you provide me with the test case which demonstrates the problem ? > >=20 >=20 > Nothing special: > https://people.freebsd.org/~mjg/patches/F_READAHEAD.c And how did you verified that fcntl(F_READAHEAD, -1) did not worked ? I ended up with adding printf() to kern_fcntl() to see arg value. >=20 > > The fcntl(2) prototype in sys/fcntl.h is variadic, so int arg argument > > is not promoted. On the other hand, syscalls.master declares arg as lo= ng. > > Did you tried to pass -1L as third argument to disable ? > >=20 >=20 > Yes, -1L deals with the problem. I would still argue that using 'tmp' > like the rest of the function would not hurt as a cheap solution. This would deliberately break the current ABI (which takes the argument as long for F_READAHEAD), which is not acceptable. I do think that there is bug in the "-1" stuff, but it is in compat32 shims. The compat/freebsd32/syscalls.master does not provide the compat for fcntl(2), which means that 32bit fcntl(2) does not work when either signed extension is needed (the F_READAHEAD case), or on the big-endian machines. On i386, it did not practically matter before F_READAHEAD, since x86 is little-endian and flags passed as arg did not touch the sign bit. Note that fcntl(2) man page is wrong, it claims that optional argument arg is int. It cannot be true since pointer on LP64 platform cannot fit into int. The SUSv4 is explicit in describing which command takes which type; our man page must be fixed, but this is for later. See the patch at the end of the reply for the fix. It needs sysent regen for actual build. >=20 > > >=20 > > > diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c > > > index 7abdca0..774f647 100644 > > > --- a/sys/kern/kern_descrip.c > > > +++ b/sys/kern/kern_descrip.c > > > @@ -760,7 +760,8 @@ kern_fcntl(struct thread *td, int fd, int cmd, in= tptr_t arg) > > > error =3D EBADF; > > > break; > > > } > > > - if (arg >=3D 0) { > > > + tmp =3D arg; > > > + if (tmp >=3D 0) { > > > vp =3D fp->f_vnode; > > > error =3D vn_lock(vp, LK_SHARED); > > > if (error !=3D 0) { > > > @@ -769,7 +770,7 @@ kern_fcntl(struct thread *td, int fd, int cmd, in= tptr_t arg) > > > } > > > bsize =3D fp->f_vnode->v_mount->mnt_stat.f_iosize; > > > VOP_UNLOCK(vp, 0); > > > - fp->f_seqcount =3D (arg + bsize - 1) / bsize; > > > + fp->f_seqcount =3D (tmp + bsize - 1) / bsize; > > > do { > > > new =3D old =3D fp->f_flag; > > > new |=3D FRDAHEAD; > > >=20 > > > Then the patch in question: > > >=20 > > > diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c > > > index 774f647..4efadb0 100644 > > > --- a/sys/kern/kern_descrip.c > > > +++ b/sys/kern/kern_descrip.c > > > @@ -476,7 +476,6 @@ kern_fcntl(struct thread *td, int fd, int cmd, in= tptr_t arg) > > > struct vnode *vp; > > > cap_rights_t rights; > > > int error, flg, tmp; > > > - u_int old, new; > > > uint64_t bsize; > > > off_t foffset; > > > =20 > > > @@ -760,27 +759,25 @@ kern_fcntl(struct thread *td, int fd, int cmd, = intptr_t arg) > > > error =3D EBADF; > > > break; > > > } > > > + vp =3D fp->f_vnode; > > > + /* > > > + * Exclusive lock synchronizes against > > > + * sequential_heuristic(). > > I would also add a sentence that we care about f_seqcount update in > > seq_heur(). > >=20 >=20 > /* > * Exclusive lock synchronizes against f_seqcount reads and writes in > * sequential_heuristic(). > */ >=20 > > Another place to add the locking annotation is the struct file in > > sys/file.h. Now f_seqcount is 'protected' by the vnode lock. > > I am not sure how to express the locking model shortly. > >=20 >=20 > /* > * (a) f_vnode lock required (shared allows both reads and writes) > */ Ok. diff --git a/sys/compat/freebsd32/freebsd32_misc.c b/sys/compat/freebsd32/f= reebsd32_misc.c index 815a9b7..fb8736c 100644 --- a/sys/compat/freebsd32/freebsd32_misc.c +++ b/sys/compat/freebsd32/freebsd32_misc.c @@ -2980,3 +2980,28 @@ freebsd32_procctl(struct thread *td, struct freebsd3= 2_procctl_args *uap) return (kern_procctl(td, uap->idtype, PAIR32TO64(id_t, uap->id), uap->com, data)); } + +int +freebsd32_fcntl(struct thread *td, struct freebsd32_fcntl_args *uap) +{ + intptr_t tmp; + + switch (uap->cmd) { + /* + * Do unsigned conversion for arg when operation + * interprets it as flags or pointer. + */ + case F_SETLK_REMOTE: + case F_SETLKW: + case F_SETLK: + case F_GETLK: + case F_SETFD: + case F_SETFL: + tmp =3D (unsigned int)(uap->arg); + break; + default: + tmp =3D uap->arg; + break; + } + return (kern_fcntl(td, uap->fd, uap->cmd, tmp)); +} diff --git a/sys/compat/freebsd32/syscalls.master b/sys/compat/freebsd32/sy= scalls.master index 3339690..161f69d 100644 --- a/sys/compat/freebsd32/syscalls.master +++ b/sys/compat/freebsd32/syscalls.master @@ -200,7 +200,8 @@ 89 AUE_GETDTABLESIZE NOPROTO { int getdtablesize(void); } 90 AUE_DUP2 NOPROTO { int dup2(u_int from, u_int to); } 91 AUE_NULL UNIMPL getdopt -92 AUE_FCNTL NOPROTO { int fcntl(int fd, int cmd, long arg); } +92 AUE_FCNTL STD { int freebsd32_fcntl(int fd, int cmd, \ + int arg); } 93 AUE_SELECT STD { int freebsd32_select(int nd, fd_set *in, \ fd_set *ou, fd_set *ex, \ struct timeval32 *tv); } --Y3cq2BYpkEb43po+ Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAEBAgAGBQJT+3IbAAoJEJDCuSvBvK1BdzQP/jEbCtmdwVzlWQBLJtWRkAhn C7idTvbhGdRWi1a8G1yEa4fNnxm7z5KGl+WRW+mO7npj8B7w/taX0pCt97J/TCIA 9iJI6qNuzylKXezuiGBrTx78pysmQ/6Y2ozQa3rGwIN4c03802Z+v5bz7uFbKFCP 9THL4Jzi7l8sNicGoG7upd0/gYEOYHg9sNuEgS7204Mn3NzRd5CQqouStm77kAea byx4IWoJrSB66bVSqb8bZcgsUn0xdM0iUcHLQQrF5lEyL6B52K7t+BC5Y5abVA78 XbUjDV1jlzjxxXLidx2ZyABjriK8FGFDBpjXquFjS1YCoz+wxb009pikmRgQP2Wf BBkibiZI/kkSS5rnqKJL2c75CJOOf7ONd/Ye0zvPVPtZbFgF9vmwGrOUwQCKhn+Q OWuQCIvMKDShL9GncrSjHP4cBHzHmxZOozUb8ysO30gWV+4tihPHKakMw9ssKnbv OqAP+qK1oPznhWlIXiciD3oRw/2SpIHErWDJUwXtMhFZvAuWwLZzntUYE5kUjTkD plVRi8gAzAazDVZYDhEa7ycP27P/9PvEzw5URpuxT6ABdzfK9vKpFEFoqiAqcXuE WdBG6cJQTzROrV0cDwfgowmW23FYhdijiHZdv/DccTbKIu5oMrS53bTXZne/ODhC k6nfxbtxS8DHD/KMJcIO =fHHK -----END PGP SIGNATURE----- --Y3cq2BYpkEb43po+--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140825172755.GD2737>