From owner-freebsd-hackers@FreeBSD.ORG Thu Sep 17 10:33:33 2009
Delivered-To: freebsd-hackers@freebsd.org
Date: Thu, 17 Sep 2009 14:15:26 +0400
From: Igor Sysoev
To: freebsd-hackers@freebsd.org
Message-ID: <20090917101526.GF57619@rambler-co.ru>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="VS++wcV0S1rZb1Fb"
Content-Disposition: inline
User-Agent: Mutt/1.5.13 (2006-08-11)
Subject: fcntl(F_RDAHEAD)
List-Id: Technical Discussions relating to FreeBSD

--VS++wcV0S1rZb1Fb
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline

Hi,

nginx-0.8.15 can use completely
non-blocking sendfile() using the SF_NODISKIO flag. When sendfile()
returns EBUSY, nginx calls aio_read() to read a single byte. The first
aio_read() preloads the first 128K of the file into the VM cache;
however, all subsequent aio_read()s preload only 16K of the file each.
This makes non-blocking sendfile() ineffective for files larger than
128K.

I've created a small patch that adds a Darwin-compatible F_RDAHEAD fcntl:

    fcntl(fd, F_RDAHEAD, preload_size)

There is a small incompatibility: Darwin's fcntl only allows read-ahead
to be enabled or disabled, while the proposed patch allows an exact
preload size to be set.

Currently the preload size affects only the vn_read() code path and does
not affect the sendfile() code path. However, it can easily be extended
to the sendfile() path too. The preload size is still limited by
sysctl vfs.read_max.

The patch is against FreeBSD 7.2 and was tested on FreeBSD 7.2-STABLE only.

-- 
Igor Sysoev
http://sysoev.ru/en/

--VS++wcV0S1rZb1Fb
Content-Type: text/plain; charset=koi8-r
Content-Disposition: attachment; filename="patch.rdahead"

--- sys/sys/fcntl.h	2009-06-02 19:05:17.000000000 +0400
+++ sys/sys/fcntl.h	2009-09-12 20:29:34.000000000 +0400
@@ -118,6 +118,10 @@
 #if __BSD_VISIBLE
 /* Attempt to bypass buffer cache */
 #define O_DIRECT 0x00010000
+#ifdef _KERNEL
+/* Read ahead */
+#define O_RDAHEAD 0x00020000
+#endif
 #endif
 
 /*
@@ -187,6 +191,7 @@
 #define F_SETLK 12 /* set record locking information */
 #define F_SETLKW 13 /* F_SETLK; wait if blocked */
 #define F_SETLK_REMOTE 14 /* debugging support for remote locks */
+#define F_RDAHEAD 15 /* read ahead */
 
 /* file descriptor flags (F_GETFD, F_SETFD) */
 #define FD_CLOEXEC 1 /* close-on-exec flag */
--- sys/kern/vfs_vnops.c	2009-06-02 19:05:00.000000000 +0400
+++ sys/kern/vfs_vnops.c	2009-09-12 20:24:00.000000000 +0400
@@ -305,6 +305,9 @@
 sequential_heuristic(struct uio *uio, struct file *fp)
 {
 
+	if (fp->f_flag & O_RDAHEAD)
+		return(fp->f_seqcount << IO_SEQSHIFT);
+
 	if ((uio->uio_offset == 0 && fp->f_seqcount > 0) ||
 	    uio->uio_offset == fp->f_nextoff) {
 		/*
--- sys/kern/kern_descrip.c	2009-08-28 18:50:11.000000000 +0400
+++ sys/kern/kern_descrip.c	2009-09-12 20:23:36.000000000 +0400
@@ -411,6 +411,7 @@
 	u_int newmin;
 	int error, flg, tmp;
 	int vfslocked;
+	uint64_t bsize;
 
 	vfslocked = 0;
 	error = 0;
@@ -694,6 +695,31 @@
 		vfslocked = 0;
 		fdrop(fp, td);
 		break;
+
+	case F_RDAHEAD:
+		FILEDESC_SLOCK(fdp);
+		if ((fp = fdtofp(fd, fdp)) == NULL) {
+			FILEDESC_SUNLOCK(fdp);
+			error = EBADF;
+			break;
+		}
+		if (fp->f_type != DTYPE_VNODE) {
+			FILEDESC_SUNLOCK(fdp);
+			error = EBADF;
+			break;
+		}
+		FILE_LOCK(fp);
+		if (arg) {
+			bsize = fp->f_vnode->v_mount->mnt_stat.f_iosize;
+			fp->f_seqcount = (arg + bsize - 1) / bsize;
+			fp->f_flag |= O_RDAHEAD;
+		} else {
+			fp->f_flag &= ~O_RDAHEAD;
+		}
+		FILE_UNLOCK(fp);
+		FILEDESC_SUNLOCK(fdp);
+		break;
+
 	default:
 		error = EINVAL;
 		break;
--VS++wcV0S1rZb1Fb--
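
For readers who want to see how the preload size maps to f_seqcount, here
is a rough userland sketch. It is not part of the patch. The names
rdahead_blocks() and set_read_ahead() are hypothetical, the check for a
zero block size is added for this sketch only, and the wrapper falls back
to a stub that fails with ENOTSUP on systems whose <fcntl.h> does not
define F_RDAHEAD.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch.  It mirrors the patch's
 * arithmetic, fp->f_seqcount = (arg + bsize - 1) / bsize, which rounds
 * the requested byte count up to whole filesystem I/O blocks.
 */
uint64_t
rdahead_blocks(uint64_t preload_size, uint64_t bsize)
{
	if (bsize == 0)
		return (0);	/* guard added for this sketch only */
	return ((preload_size + bsize - 1) / bsize);
}

/*
 * Hypothetical wrapper, not part of the patch.  With the patch applied,
 * a non-zero preload_size requests that many bytes of read-ahead and
 * zero clears the O_RDAHEAD flag again.  Where F_RDAHEAD is not defined
 * this is a stub that fails with ENOTSUP.
 */
int
set_read_ahead(int fd, int preload_size)
{
#ifdef F_RDAHEAD
	return (fcntl(fd, F_RDAHEAD, preload_size));
#else
	(void)fd;
	(void)preload_size;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

Assuming a 16K f_iosize (an illustrative value, the real one depends on
the mounted filesystem), rdahead_blocks(131072, 16384) gives 8 blocks and
rdahead_blocks(131073, 16384) gives 9, since partial blocks round up. A
caller such as nginx would presumably invoke set_read_ahead(fd, 131072)
once after open() and before the first aio_read().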