Date: Thu, 17 Sep 2009 14:15:26 +0400
From: Igor Sysoev <is@rambler-co.ru>
To: freebsd-hackers@freebsd.org
Subject: fcntl(F_RDAHEAD)
Message-ID: <20090917101526.GF57619@rambler-co.ru>

--VS++wcV0S1rZb1Fb
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline

Hi,

nginx-0.8.15 can use a completely non-blocking sendfile() by passing the
SF_NODISKIO flag. When sendfile() returns EBUSY, nginx calls aio_read()
to read a single byte. The first aio_read() preloads the first 128K of
the file into the VM cache; however, all subsequent aio_read()s preload
only 16K of the file each. This makes non-blocking sendfile() ineffective
for files larger than 128K.

I've created a small patch that adds a Darwin-compatible F_RDAHEAD fcntl:

   fcntl(fd, F_RDAHEAD, preload_size)

There is a small incompatibility: Darwin's fcntl only allows enabling or
disabling read-ahead, while the proposed patch allows setting an exact
preload size.

Currently the preload size affects only the vn_read() code path and does
not affect the sendfile() code path. However, it can easily be extended to
the sendfile() path too. The preload size is still limited by the
vfs.read_max sysctl.

The patch is against FreeBSD 7.2 and was tested on FreeBSD 7.2-STABLE only.

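As a rough illustration of how a caller might use the proposed fcntl(fd, F_RDAHEAD, preload_size) call, here is a minimal userland sketch. The helper name set_read_ahead() is hypothetical and not part of the patch. The #ifdef fallback is an assumption, added so the code also builds on systems whose <fcntl.h> has no F_RDAHEAD.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): request a read-ahead of
 * preload_size bytes on fd; a size of 0 turns the hint off again.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
set_read_ahead(int fd, size_t preload_size)
{
	/* Caller contract for this sketch: the size must fit in an int. */
	assert(preload_size <= INT_MAX);
#ifdef F_RDAHEAD
	/* On Darwin the argument is only treated as on/off. */
	return (fcntl(fd, F_RDAHEAD, (int)preload_size) == -1 ? -1 : 0);
#else
	/* No F_RDAHEAD in <fcntl.h>: report "not supported", change nothing. */
	(void)fd;
	(void)preload_size;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

A server would presumably call something like set_read_ahead(fd, 512 * 1024) once after open(), before entering its sendfile()/aio_read() loop, and treat a failure as non-fatal since the call is only a hint. The 512K figure is an arbitrary example value.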
-- 
Igor Sysoev
http://sysoev.ru/en/

--VS++wcV0S1rZb1Fb
Content-Type: text/plain; charset=koi8-r
Content-Disposition: attachment; filename="patch.rdahead"

--- sys/sys/fcntl.h	2009-06-02 19:05:17.000000000 +0400
+++ sys/sys/fcntl.h	2009-09-12 20:29:34.000000000 +0400
@@ -118,6 +118,10 @@
 #if __BSD_VISIBLE
 /* Attempt to bypass buffer cache */
 #define O_DIRECT	0x00010000
+#ifdef _KERNEL
+/* Read ahead */
+#define O_RDAHEAD	0x00020000
+#endif
 #endif
 
 /*
@@ -187,6 +191,7 @@
 #define	F_SETLK		12		/* set record locking information */
 #define	F_SETLKW	13		/* F_SETLK; wait if blocked */
 #define	F_SETLK_REMOTE	14		/* debugging support for remote locks */
+#define	F_RDAHEAD	15		/* read ahead */
 
 /* file descriptor flags (F_GETFD, F_SETFD) */
 #define	FD_CLOEXEC	1		/* close-on-exec flag */
--- sys/kern/vfs_vnops.c	2009-06-02 19:05:00.000000000 +0400
+++ sys/kern/vfs_vnops.c	2009-09-12 20:24:00.000000000 +0400
@@ -305,6 +305,9 @@
 sequential_heuristic(struct uio *uio, struct file *fp)
 {
 
+	if (fp->f_flag & O_RDAHEAD)
+		return(fp->f_seqcount << IO_SEQSHIFT);
+
 	if ((uio->uio_offset == 0 && fp->f_seqcount > 0) ||
 	    uio->uio_offset == fp->f_nextoff) {
 		/*
--- sys/kern/kern_descrip.c	2009-08-28 18:50:11.000000000 +0400
+++ sys/kern/kern_descrip.c	2009-09-12 20:23:36.000000000 +0400
@@ -411,6 +411,7 @@
 	u_int newmin;
 	int error, flg, tmp;
 	int vfslocked;
+	uint64_t bsize;
 
 	vfslocked = 0;
 	error = 0;
@@ -694,6 +695,31 @@
 		vfslocked = 0;
 		fdrop(fp, td);
 		break;
+
+	case F_RDAHEAD:
+		FILEDESC_SLOCK(fdp);
+		if ((fp = fdtofp(fd, fdp)) == NULL) {
+			FILEDESC_SUNLOCK(fdp);
+			error = EBADF;
+			break;
+		}
+		if (fp->f_type != DTYPE_VNODE) {
+			FILEDESC_SUNLOCK(fdp);
+			error = EBADF;
+			break;
+		}
+		FILE_LOCK(fp);
+		if (arg) {
+			bsize = fp->f_vnode->v_mount->mnt_stat.f_iosize;
+			fp->f_seqcount = (arg + bsize - 1) / bsize;
+			fp->f_flag |= O_RDAHEAD;
+		} else {
+			fp->f_flag &= ~O_RDAHEAD;
+		}
+		FILE_UNLOCK(fp);
+		FILEDESC_SUNLOCK(fdp);
+		break;
+
 	default:
 		error = EINVAL;
 		break;

--VS++wcV0S1rZb1Fb--
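
The F_RDAHEAD case in the patch converts the requested byte count into a number of filesystem I/O blocks, rounding up, and stores it in f_seqcount. The following is a small userland model of just that arithmetic. The function name rdahead_blocks() is hypothetical, and the 16384-byte block size in the usage note is an illustrative value, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userland model of the patch's rounding: the number of bsize-sized
 * blocks needed to cover preload_size bytes, rounded up.
 */
static uint64_t
rdahead_blocks(uint64_t preload_size, uint64_t bsize)
{
	/* In the patch, bsize comes from the mount's mnt_stat.f_iosize. */
	assert(bsize > 0);
	return ((preload_size + bsize - 1) / bsize);
}
```

With a 16384-byte block size, rdahead_blocks(131072, 16384) gives 8 and rdahead_blocks(16385, 16384) gives 2, so a partial block always counts as a whole one. Note that the patch never stores a count of 0: an argument of 0 clears O_RDAHEAD instead, which hands control back to the regular sequential heuristic.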