Date: Thu, 28 Mar 2013 09:52:09 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Maksim Yevmenkin <maksim.yevmenkin@gmail.com>
Cc: current@freebsd.org
Subject: Re: [RFC] vfs.read_min proposal
Message-ID: <20130328075209.GL3794@kib.kiev.ua>
In-Reply-To: <CAFPOs6rNDZTqWJZ3hK=px5RX5G44Z3hfzCLQcfceQ2n_7oU3GA@mail.gmail.com>
References: <CAFPOs6rNDZTqWJZ3hK=px5RX5G44Z3hfzCLQcfceQ2n_7oU3GA@mail.gmail.com>

On Wed, Mar 27, 2013 at 01:43:32PM -0700, Maksim Yevmenkin wrote:
> Hello,
>
> i would like to get some reviews, opinions and/or comments on the patch below.
>
> a little bit background, as far as i understand, cluster_read() can
> initiate two disk i/o's: one for exact amount of data being requested
> (rounded up to a filesystem block size) and another for a configurable
> read ahead. read ahead data are always extra and do not super set data
> being requested. also, read ahead can be controlled via f_seqcount (on
> per descriptor basis) and/or vfs.read_max (global knob).
>
> in some cases and/or on some work loads it can be beneficial to bundle
> original data and read ahead data in one i/o request. in other words,
> read more than caller has requested, but only perform one larger i/o,
> i.e. super set data being requested and read ahead.

The totread argument to cluster_read() is supplied by the filesystem
to indicate how much data is specified in the current request. Always
overriding this information means two things:
- you fill the buffer and page cache with potentially unused data.
  For some situations, like partial reads, it would be really bad.
- you increase the latency by forcing the reader to wait for the whole
  cluster, which was not asked for.

So it looks like a very single- and special-purpose hack. Besides, the
global knob is obscure and probably would not have any use except in
your special situation. Would a file flag be acceptable for you?

What is the difference in the numbers you see, and what numbers?
Is it targeted at read(2) optimizations, or are you also concerned
with the read-ahead done at fault time?

>
> ===
>
> Index: trunk/cache/src/sys/kern/vfs_cluster.c
> ===================================================================
> diff -u -N -r515 -r1888
> --- trunk/cache/src/sys/kern/vfs_cluster.c (.../vfs_cluster.c) (revision 515)
> +++ trunk/cache/src/sys/kern/vfs_cluster.c (.../vfs_cluster.c) (revision 1888)
> @@ -75,6 +75,10 @@
>  SYSCTL_INT(_vfs, OID_AUTO, read_max, CTLFLAG_RW, &read_max, 0,
>      "Cluster read-ahead max block count");
>
> +static int read_min = 1;
> +SYSCTL_INT(_vfs, OID_AUTO, read_min, CTLFLAG_RW, &read_min, 0,
> +     "Cluster read min block count");
> +
>  /* Page expended to mark partially backed buffers */
>  extern vm_page_t bogus_page;
>
> @@ -169,13 +173,21 @@
>  	} else {
>  		off_t firstread = bp->b_offset;
>  		int nblks;
> +		long minread;
>
>  		KASSERT(bp->b_offset != NOOFFSET,
>  		    ("cluster_read: no buffer offset"));
>
>  		ncontig = 0;
>
>  		/*
> +		 * Adjust totread if needed
> +		 */
> +		minread = read_min * size;
> +		if (minread > totread)
> +			totread = minread;
> +
> +		/*
>  		 * Compute the total number of blocks that we should read
>  		 * synchronously.
>  		 */
>
> ===
>
> thanks,
> max
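
To make the objection about partial reads concrete, the adjustment in the quoted patch can be modelled outside the kernel. This is only a sketch under stated assumptions: adjust_totread() is a hypothetical userland helper, not a function in vfs_cluster.c, and it reproduces just the three added lines that raise totread to read_min * size.

```c
#include <assert.h>

/*
 * Hypothetical userland model of the adjustment in the quoted patch.
 * adjust_totread() is an illustrative name, not a kernel function.
 *   totread  - bytes the filesystem says the current request covers
 *   size     - filesystem block size in bytes
 *   read_min - value of the proposed vfs.read_min knob, in blocks
 */
static long
adjust_totread(long totread, long size, int read_min)
{
	long minread;

	/* Sanity checks for the model only; the patch has no such checks. */
	assert(size > 0 && read_min >= 0 && totread >= 0);

	/* Same arithmetic as the patch: never read less than read_min blocks. */
	minread = (long)read_min * size;
	if (minread > totread)
		totread = minread;
	return (totread);
}
```

With an assumed 16384-byte block size and read_min set to 8 (both values chosen only for illustration), a 512-byte partial read is raised to 131072 bytes of synchronous I/O, while a 200000-byte request is left unchanged. The first case is the extra cache fill and added latency described in the reply.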