Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 25 Aug 2014 22:35:31 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Mateusz Guzik <mjguzik@gmail.com>, freebsd-hackers@freebsd.org
Subject:   Re: atomic_load_acq_int in sequential_heuristic
Message-ID:  <20140825193531.GE2737@kib.kiev.ua>
In-Reply-To: <20140825180417.GB23088@dft-labs.eu>
References:  <20140824164236.GX2737@kib.kiev.ua> <20140825005659.GA14344@dft-labs.eu> <20140825073404.GZ2737@kib.kiev.ua> <20140825081526.GB14344@dft-labs.eu> <20140825083539.GB2737@kib.kiev.ua> <20140825091056.GC14344@dft-labs.eu> <20140825111000.GC2737@kib.kiev.ua> <20140825130433.GD14344@dft-labs.eu> <20140825172755.GD2737@kib.kiev.ua> <20140825180417.GB23088@dft-labs.eu>

next in thread | previous in thread | raw e-mail | index | archive | help

--znFrcFTOc9PDN8kJ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Aug 25, 2014 at 08:04:17PM +0200, Mateusz Guzik wrote:
> On Mon, Aug 25, 2014 at 08:27:55PM +0300, Konstantin Belousov wrote:
> > On Mon, Aug 25, 2014 at 03:04:33PM +0200, Mateusz Guzik wrote:
> > > On Mon, Aug 25, 2014 at 02:10:01PM +0300, Konstantin Belousov wrote:
> > > > On Mon, Aug 25, 2014 at 11:10:56AM +0200, Mateusz Guzik wrote:
> > > > > On Mon, Aug 25, 2014 at 11:35:39AM +0300, Konstantin Belousov wro=
te:
> > > > > > > +			atomic_set_int(&fp->f_flag, FHASLOCK);
> > > > > > You misspelled FRDAHEAD as FHASLOCK, below as well.
> > > > > > Was this tested ?
> > > > > >=20
> > > > >=20
> > > > > Oops, damn copy-pasto. Sorry.
> > > > >=20
> > > > > > > +			VOP_UNLOCK(vp, 0);
> > > > > > >  		} else {
> > > > > > > -			do {
> > > > > > > -				new =3D old =3D fp->f_flag;
> > > > > > > -				new &=3D ~FRDAHEAD;
> > > > > > > -			} while (!atomic_cmpset_rel_int(&fp->f_flag, old, new));
> > > > > > > +			atomic_clear_int(&fp->f_flag, FHASLOCK);
> > > > > > So what about extending the vnode lock to cover the flag reset ?
> > > > > >=20
> > > > >=20
> > > > > Sure.
> > > > >=20
> > > > > So this time I tested it properly and found out it is impossible =
to
> > > > > disable the hint. The test is:
> > > > >=20
> > > > > -1 is truncated and then read from intptr_t which yields a big po=
sitive
> > > > > number instead. Other users in the function use int tmp to work a=
round
> > > > > this issue.
> > > > Could you provide me with the test case which demonstrates the prob=
lem ?
> > > >=20
> > >=20
> > > Nothing special:
> > > https://people.freebsd.org/~mjg/patches/F_READAHEAD.c
> > And how did you verified that fcntl(F_READAHEAD, -1) did not worked ?
> > I ended up with adding printf() to kern_fcntl() to see arg value.
> >=20
>=20
> 3 uprintfs. one with the value, and then one in each if branch.
>=20
> > >=20
> > > > The fcntl(2) prototype in sys/fcntl.h is variadic, so int arg argum=
ent
> > > > is not promoted.  On the other hand, syscalls.master declares arg a=
s long.
> > > > Did you tried to pass -1L as third argument to disable ?
> > > >=20
> > >=20
> > > Yes, -1L deals with the problem. I would still argue that using 'tmp'
> > > like the rest of the function would not hurt as a cheap solution.
> > This would deliberately break the current ABI (which takes the argument
> > as long for F_READAHEAD), which is not acceptable.
> >=20
>=20
> Ok.
>=20
> > I do think that there is bug in the "-1" stuff, but it is in compat32
> > shims. The compat/freebsd32/syscalls.master does not provide the compat
> > for fcntl(2), which means that 32bit fcntl(2) does not work when either
> > signed extension is needed (the F_READAHEAD case), or on the big-endian
> > machines.  On i386, it did not practically matter before F_READAHEAD,
> > since x86 is little-endian and flags passed as arg did not touch the
> > sign bit.
> >=20
> > Note that fcntl(2) man page is wrong, it claims that optional argument
> > arg is int.  It cannot be true since pointer on LP64 platform cannot
> > fit into int.  The SUSv4 is explicit in describing which command
> > takes which type; our man page must be fixed, but this is for later.
> >=20
> > See the patch at the end of the reply for the fix.  It needs sysent
> > regen for actual build.
> >=20
>=20
> I tested the patch and it fixes the problem.
Which patch ? Your's or mine ?

>=20
> > > /*
> > >  * Exclusive lock synchronizes against f_seqcount reads and writes in
> > >  * sequential_heuristic().
> > >  */
> > >=20
> > > > Another place to add the locking annotation is the struct file in
> > > > sys/file.h.  Now f_seqcount is 'protected' by the vnode lock.
> > > > I am not sure how to express the locking model shortly.
> > > >=20
> > >=20
> > > /*
> > >  * (a) f_vnode lock required (shared allows both reads and writes)
> > >  */
> > Ok.
> >=20
>=20
> diff --git a/sys/kern/kern_descrip.c b/sys/kern/kern_descrip.c
> index 7abdca0..52fc01a 100644
> --- a/sys/kern/kern_descrip.c
> +++ b/sys/kern/kern_descrip.c
> @@ -476,7 +476,6 @@ kern_fcntl(struct thread *td, int fd, int cmd, intptr=
_t arg)
>  	struct vnode *vp;
>  	cap_rights_t rights;
>  	int error, flg, tmp;
> -	u_int old, new;
>  	uint64_t bsize;
>  	off_t foffset;
> =20
> @@ -760,26 +759,24 @@ kern_fcntl(struct thread *td, int fd, int cmd, intp=
tr_t arg)
>  			error =3D EBADF;
>  			break;
>  		}
> +		vp =3D fp->f_vnode;
> +		/*
> +		 * Exclusive lock synchronizes against f_seqcount reads and
> +		 * writes in sequential_heuristic().
> +		 */
> +		error =3D vn_lock(vp, LK_EXCLUSIVE);
> +		if (error !=3D 0) {
> +			fdrop(fp, td);
> +			break;
> +		}
>  		if (arg >=3D 0) {
> -			vp =3D fp->f_vnode;
> -			error =3D vn_lock(vp, LK_SHARED);
> -			if (error !=3D 0) {
> -				fdrop(fp, td);
> -				break;
> -			}
>  			bsize =3D fp->f_vnode->v_mount->mnt_stat.f_iosize;
> -			VOP_UNLOCK(vp, 0);
>  			fp->f_seqcount =3D (arg + bsize - 1) / bsize;
> -			do {
> -				new =3D old =3D fp->f_flag;
> -				new |=3D FRDAHEAD;
> -			} while (!atomic_cmpset_rel_int(&fp->f_flag, old, new));
> +			atomic_set_int(&fp->f_flag, FRDAHEAD);
>  		} else {
> -			do {
> -				new =3D old =3D fp->f_flag;
> -				new &=3D ~FRDAHEAD;
> -			} while (!atomic_cmpset_rel_int(&fp->f_flag, old, new));
> +			atomic_clear_int(&fp->f_flag, FRDAHEAD);
>  		}
> +		VOP_UNLOCK(vp, 0);
>  		fdrop(fp, td);
>  		break;
> =20
> diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c
> index f1d19ac..98823f3 100644
> --- a/sys/kern/vfs_vnops.c
> +++ b/sys/kern/vfs_vnops.c
> @@ -438,7 +438,8 @@ static int
>  sequential_heuristic(struct uio *uio, struct file *fp)
>  {
> =20
> -	if (atomic_load_acq_int(&(fp->f_flag)) & FRDAHEAD)
> +	ASSERT_VOP_LOCKED(fp->f_vnode, __func__);
> +	if (fp->f_flag & FRDAHEAD)
>  		return (fp->f_seqcount << IO_SEQSHIFT);
> =20
>  	/*
> diff --git a/sys/sys/file.h b/sys/sys/file.h
> index b7d358b..856f799 100644
> --- a/sys/sys/file.h
> +++ b/sys/sys/file.h
> @@ -143,6 +143,7 @@ struct fileops {
>   *
>   * Below is the list of locks that protects members in struct file.
>   *
> + * (a) f_vnode lock required (shared allows both reads and writes)
>   * (f) protected with mtx_lock(mtx_pool_find(fp))
>   * (d) cdevpriv_mtx
>   * none	not locked
> @@ -168,7 +169,7 @@ struct file {
>  	/*
>  	 *  DTYPE_VNODE specific fields.
>  	 */
> -	int		f_seqcount;	/* Count of sequential accesses. */
> +	int		f_seqcount;	/* (a) Count of sequential accesses. */
>  	off_t		f_nextoff;	/* next expected read/write offset. */
>  	union {
>  		struct cdev_privdata *fvn_cdevpriv;
>=20
I think this patch is fine.

--znFrcFTOc9PDN8kJ
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIbBAEBAgAGBQJT+5ACAAoJEJDCuSvBvK1BxfEP+IdeVV1PuLP9N8fnGXAVTnxq
h3TsT1CZeQzR0J+DK07FFspWsDzDtPIOletJZa63G7bcV9mMYxfeXdFNaxmjy3sZ
7eD7+VEUfp45HwFSeuQN8I0Nq5qLRv1XvWnU1h9Z9pYqSirXD9N/gMTkb/Mtc3AG
XCgu+DmHjoo8mf8p9cB+C1t68I6s8Lc3s6eMI/D9yVv/SFLJj1gVtTO/XMtJfpPV
LZbo1bVUzgCIxek5VSfrxAS3tCdplZb2Wjszk+3JLtX+RXr959QZjYvj1jrWU1XK
+7KnuiJFc+8WwtGoYKy+PE5H5ioTZippp6weEjsC0W4OpZ0LUuNdAeb0ofzfnpP8
Y0XZl02X2NqsROxW9+kOqmn/ddYA3zBzLNZMF9ouCtxYUMYxQK8ZBDM5FNC+P+Fk
u1T8p0kogpTjsVd43XalljNAGChdh4eqHAi64/NGHft0qqcChFSixaao9ZPFeHhK
11rfV8HMMaop0YHLdM+vIz51avC7ejpsX3YC8n7Rfi7ard94sMhWnNkJp+Ub6Y+9
VfglO0PkftkYkVTcnvTrd2eodRiglhFoFRU2GcrPALrAC+8x+O5cDW84mIEKm/ZR
Q13KO3OZj/V/NEqB+/0ZQtAOCAHrjyceW21X7cz/LOB6Ns111jx1ozPalvA9aVN7
kpFo3v0WQmOXdV91MDU=
=3iw4
-----END PGP SIGNATURE-----

--znFrcFTOc9PDN8kJ--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140825193531.GE2737>