Date: Mon, 25 Aug 2014 22:35:31 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Mateusz Guzik <mjguzik@gmail.com>, freebsd-hackers@freebsd.org Subject: Re: atomic_load_acq_int in sequential_heuristic Message-ID: <20140825193531.GE2737@kib.kiev.ua> In-Reply-To: <20140825180417.GB23088@dft-labs.eu> References: <20140824164236.GX2737@kib.kiev.ua> <20140825005659.GA14344@dft-labs.eu> <20140825073404.GZ2737@kib.kiev.ua> <20140825081526.GB14344@dft-labs.eu> <20140825083539.GB2737@kib.kiev.ua> <20140825091056.GC14344@dft-labs.eu> <20140825111000.GC2737@kib.kiev.ua> <20140825130433.GD14344@dft-labs.eu> <20140825172755.GD2737@kib.kiev.ua> <20140825180417.GB23088@dft-labs.eu>
next in thread | previous in thread | raw e-mail | index | archive | help
--znFrcFTOc9PDN8kJ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Aug 25, 2014 at 08:04:17PM +0200, Mateusz Guzik wrote: > On Mon, Aug 25, 2014 at 08:27:55PM +0300, Konstantin Belousov wrote: > > On Mon, Aug 25, 2014 at 03:04:33PM +0200, Mateusz Guzik wrote: > > > On Mon, Aug 25, 2014 at 02:10:01PM +0300, Konstantin Belousov wrote: > > > > On Mon, Aug 25, 2014 at 11:10:56AM +0200, Mateusz Guzik wrote: > > > > > On Mon, Aug 25, 2014 at 11:35:39AM +0300, Konstantin Belousov wro= te: > > > > > > > + atomic_set_int(&fp->f_flag, FHASLOCK); > > > > > > You misspelled FRDAHEAD as FHASLOCK, below as well. > > > > > > Was this tested ? > > > > > >=20 > > > > >=20 > > > > > Oops, damn copy-pasto. Sorry. > > > > >=20 > > > > > > > + VOP_UNLOCK(vp, 0); > > > > > > > } else { > > > > > > > - do { > > > > > > > - new =3D old =3D fp->f_flag; > > > > > > > - new &=3D ~FRDAHEAD; > > > > > > > - } while (!atomic_cmpset_rel_int(&fp->f_flag, old, new)); > > > > > > > + atomic_clear_int(&fp->f_flag, FHASLOCK); > > > > > > So what about extending the vnode lock to cover the flag reset ? > > > > > >=20 > > > > >=20 > > > > > Sure. > > > > >=20 > > > > > So this time I tested it properly and found out it is impossible = to > > > > > disable the hint. The test is: > > > > >=20 > > > > > -1 is truncated and then read from intptr_t which yields a big po= sitive > > > > > number instead. Other users in the function use int tmp to work a= round > > > > > this issue. > > > > Could you provide me with the test case which demonstrates the prob= lem ? > > > >=20 > > >=20 > > > Nothing special: > > > https://people.freebsd.org/~mjg/patches/F_READAHEAD.c > > And how did you verified that fcntl(F_READAHEAD, -1) did not worked ? > > I ended up with adding printf() to kern_fcntl() to see arg value. > >=20 >=20 > 3 uprintfs. one with the value, and then one in each if branch. >=20 > > >=20 > > > > The fcntl(2) prototype in sys/fcntl.h is variadic, so int arg argum= ent > > > > is not promoted. On the other hand, syscalls.master declares arg a= s long. > > > > Did you tried to pass -1L as third argument to disable ? > > > >=20 > > >=20 > > > Yes, -1L deals with the problem. I would still argue that using 'tmp' > > > like the rest of the function would not hurt as a cheap solution. > > This would deliberately break the current ABI (which takes the argument > > as long for F_READAHEAD), which is not acceptable. > >=20 >=20 > Ok. >=20 > > I do think that there is bug in the "-1" stuff, but it is in compat32 > > shims. The compat/freebsd32/syscalls.master does not provide the compat > > for fcntl(2), which means that 32bit fcntl(2) does not work when either > > signed extension is needed (the F_READAHEAD case), or on the big-endian > > machines. On i386, it did not practically matter before F_READAHEAD, > > since x86 is little-endian and flags passed as arg did not touch the > > sign bit. > >=20 > > Note that fcntl(2) man page is wrong, it claims that optional argument > > arg is int. It cannot be true since pointer on LP64 platform cannot > > fit into int. The SUSv4 is explicit in describing which command > > takes which type; our man page must be fixed, but this is for later. > >=20 > > See the patch at the end of the reply for the fix. It needs sysent > > regen for actual build. > >=20 >=20 > I tested the patch and it fixes the problem. Which patch ? Your's or mine ? >=20 > > > /* > > > * Exclusive lock synchronizes against f_seqcount reads and writes in > > > * sequential_heuristic(). > > > */ > > >=20 > > > > Another place to add the locking annotation is the struct file in > > > > sys/file.h. Now f_seqcount is 'protected' by the vnode lock. > > > > I am not sure how to express the locking model shortly. > > > >=20 > > >=20 > > > /* > > > * (a) f_vnode lock required (shared allows both reads and writes) > > > */ > > Ok. > >=20 >=20 > diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c > index 7abdca0..52fc01a 100644 > --- a/sys/kern/kern_descrip.c > +++ b/sys/kern/kern_descrip.c > @@ -476,7 +476,6 @@ kern_fcntl(struct thread *td, int fd, int cmd, intptr= _t arg) > struct vnode *vp; > cap_rights_t rights; > int error, flg, tmp; > - u_int old, new; > uint64_t bsize; > off_t foffset; > =20 > @@ -760,26 +759,24 @@ kern_fcntl(struct thread *td, int fd, int cmd, intp= tr_t arg) > error =3D EBADF; > break; > } > + vp =3D fp->f_vnode; > + /* > + * Exclusive lock synchronizes against f_seqcount reads and > + * writes in sequential_heuristic(). > + */ > + error =3D vn_lock(vp, LK_EXCLUSIVE); > + if (error !=3D 0) { > + fdrop(fp, td); > + break; > + } > if (arg >=3D 0) { > - vp =3D fp->f_vnode; > - error =3D vn_lock(vp, LK_SHARED); > - if (error !=3D 0) { > - fdrop(fp, td); > - break; > - } > bsize =3D fp->f_vnode->v_mount->mnt_stat.f_iosize; > - VOP_UNLOCK(vp, 0); > fp->f_seqcount =3D (arg + bsize - 1) / bsize; > - do { > - new =3D old =3D fp->f_flag; > - new |=3D FRDAHEAD; > - } while (!atomic_cmpset_rel_int(&fp->f_flag, old, new)); > + atomic_set_int(&fp->f_flag, FRDAHEAD); > } else { > - do { > - new =3D old =3D fp->f_flag; > - new &=3D ~FRDAHEAD; > - } while (!atomic_cmpset_rel_int(&fp->f_flag, old, new)); > + atomic_clear_int(&fp->f_flag, FRDAHEAD); > } > + VOP_UNLOCK(vp, 0); > fdrop(fp, td); > break; > =20 > diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c > index f1d19ac..98823f3 100644 > --- a/sys/kern/vfs_vnops.c > +++ b/sys/kern/vfs_vnops.c > @@ -438,7 +438,8 @@ static int > sequential_heuristic(struct uio *uio, struct file *fp) > { > =20 > - if (atomic_load_acq_int(&(fp->f_flag)) & FRDAHEAD) > + ASSERT_VOP_LOCKED(fp->f_vnode, __func__); > + if (fp->f_flag & FRDAHEAD) > return (fp->f_seqcount << IO_SEQSHIFT); > =20 > /* > diff --git a/sys/sys/file.h b/sys/sys/file.h > index b7d358b..856f799 100644 > --- a/sys/sys/file.h > +++ b/sys/sys/file.h > @@ -143,6 +143,7 @@ struct fileops { > * > * Below is the list of locks that protects members in struct file. > * > + * (a) f_vnode lock required (shared allows both reads and writes) > * (f) protected with mtx_lock(mtx_pool_find(fp)) > * (d) cdevpriv_mtx > * none not locked > @@ -168,7 +169,7 @@ struct file { > /* > * DTYPE_VNODE specific fields. > */ > - int f_seqcount; /* Count of sequential accesses. */ > + int f_seqcount; /* (a) Count of sequential accesses. */ > off_t f_nextoff; /* next expected read/write offset. */ > union { > struct cdev_privdata *fvn_cdevpriv; >=20 I think this patch is fine. --znFrcFTOc9PDN8kJ Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIbBAEBAgAGBQJT+5ACAAoJEJDCuSvBvK1BxfEP+IdeVV1PuLP9N8fnGXAVTnxq h3TsT1CZeQzR0J+DK07FFspWsDzDtPIOletJZa63G7bcV9mMYxfeXdFNaxmjy3sZ 7eD7+VEUfp45HwFSeuQN8I0Nq5qLRv1XvWnU1h9Z9pYqSirXD9N/gMTkb/Mtc3AG XCgu+DmHjoo8mf8p9cB+C1t68I6s8Lc3s6eMI/D9yVv/SFLJj1gVtTO/XMtJfpPV LZbo1bVUzgCIxek5VSfrxAS3tCdplZb2Wjszk+3JLtX+RXr959QZjYvj1jrWU1XK +7KnuiJFc+8WwtGoYKy+PE5H5ioTZippp6weEjsC0W4OpZ0LUuNdAeb0ofzfnpP8 Y0XZl02X2NqsROxW9+kOqmn/ddYA3zBzLNZMF9ouCtxYUMYxQK8ZBDM5FNC+P+Fk u1T8p0kogpTjsVd43XalljNAGChdh4eqHAi64/NGHft0qqcChFSixaao9ZPFeHhK 11rfV8HMMaop0YHLdM+vIz51avC7ejpsX3YC8n7Rfi7ard94sMhWnNkJp+Ub6Y+9 VfglO0PkftkYkVTcnvTrd2eodRiglhFoFRU2GcrPALrAC+8x+O5cDW84mIEKm/ZR Q13KO3OZj/V/NEqB+/0ZQtAOCAHrjyceW21X7cz/LOB6Ns111jx1ozPalvA9aVN7 kpFo3v0WQmOXdV91MDU= =3iw4 -----END PGP SIGNATURE----- --znFrcFTOc9PDN8kJ--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140825193531.GE2737>