From owner-freebsd-hackers@FreeBSD.ORG Mon Nov 12 13:48:12 2007
Message-ID: <47384B75.5090806@maxlor.com>
Date: Mon, 12 Nov 2007 13:47:49 +0100
From: Benjamin Lutz
To: Randall Hyde
Cc: freebsd-hackers@freebsd.org
Subject: Re: Some FreeBSD performance Issues
References: <000701c82253$b3a8c030$6302a8c0@pentiv>
In-Reply-To: <000701c82253$b3a8c030$6302a8c0@pentiv>
List-Id: Technical Discussions relating to FreeBSD

Randall Hyde wrote:
> Hi All,
>
> I recently ported my HLA (High Level Assembler) compiler to FreeBSD and,
> along with it, the HLA Standard Library. I have a performance-related
> question concerning file I/O.
>
> It appears that character-at-a-time file I/O is *exceptionally* slow.
> Yes, I realize that when processing large files I really ought to be doing
> block/buffered I/O to get the best performance, but for certain library
> routines I've written it's been far more convenient to do
> character-at-a-time I/O rather than deal with all the buffering issues. In
> the past, while slower, this character-at-a-time paradigm has provided
> reasonable, though not stellar, performance under Windows and Linux.
> However, with the port to FreeBSD I'm seeing a three-orders-of-magnitude
> performance loss. Here's my little test program:
>
> program t;
> #include( "stdlib.hhf" )
> //#include( "bsd.hhf" )
>
> static
>     f       :dword;
>     buffer  :char[64*1024];
>
> begin t;
>
>     fileio.open( "socket.h", fileio.r );
>     mov( eax, f );
> #if( false )
>
>     // Windows: 0.25 seconds
>     // BSD: 5.2 seconds
>
>     while( !fileio.eof( f )) do
>
>         fileio.getc( f );
>         //stdout.put( (type char al ));
>
>     endwhile;
>
> #elseif( false )
>
>     // Windows: 0.0 seconds (below 1ms threshold)
>     // BSD: 5.2 seconds
>
>     forever
>
>         fileio.read( f, buffer, 1 );
>         breakif( eax <> 1 );
>         //stdout.putc( buffer[0] );
>
>     endfor;
>
> #elseif( false )
>
>     // BSD: 5.1 seconds
>
>     forever
>
>         bsd.read( f, buffer, 1 );
>         breakif( @c );
>         breakif( eax <> 1 );
>         //stdout.putc( buffer[0] );
>
>     endfor;
>
> #else
>
>     // BSD: 0.016 seconds
>
>     bsd.read( f, buffer, 64*1024 );
>     //stdout.write( buffer, eax );
>
> #endif
>
>     fileio.close( f );
>
> end t;
>
> (I selectively set one of the conditionals to true to run a different test;
> yeah, this is HLA assembly code, but I suspect that most people who can read
> C can *mostly* figure out what's going on here).
>
> The "fileio.open" call is basically a bsd.open( "socket.h", bsd.O_RDONLY );
> API call. The socket.h file is about 19K long (it's from the FreeBSD
> include file set). In particular, I would draw your attention to the first
> two tests that do character-at-a-time I/O.
> The difference in performance
> between Windows and FreeBSD is dramatic (note: Linux numbers are comparable
> to Windows). Just to make sure that the library code wasn't doing something
> incredibly stupid, the third test makes a direct FreeBSD API call to read
> the data a byte at a time -- the results are comparable to the first two
> tests. Finally, I read the whole file at once, just to make sure the problem
> was character-at-a-time I/O (which obviously is the problem). Naturally, at
> one point I'd uncommented all the output statements to verify that I was
> reading the entire file -- no problem there.
>
> Is this really the performance I can expect from FreeBSD when doing
> character I/O this way? Is there some tuning parameter I can set to
> change internal buffering or something? From these numbers, if I had to
> guess, I'd suspect that FreeBSD was re-reading the entire 4K (or whatever)
> block from the file cache every time I read a single character. Can anyone
> explain what's going on here? I'm loath to change my fileio module to add
> buffering as that will create some subtle semantic differences that could
> break existing code (I do have an object-oriented file I/O class that I'm
> going to use to implement buffered I/O, I would prefer to leave the fileio
> module unbuffered, if possible).
>
> And a more general question: if this is the way FreeBSD works, should
> something be done about it?
> Thanks,
> Randy Hyde

Hello Randy,

First, let me out myself as a fan of yours. It was your book that got me
started on ASM and taught me a lot about computers and logic, plus it
provided some entertainment and mental sustenance in pretty boring times, so
thanks!

Now, as for your problem: I think I have to agree with the others in this
thread when they say that the problem likely isn't in FreeBSD.
The following C program, which uses the read(2) call to read socket.h
byte-by-byte, runs quickly (0.05 secs on my 2.1GHz system, measured with
time(1)):

#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char** argv) {
    int f;
    char c;
    ssize_t result;

    f = open("/usr/include/sys/socket.h", O_RDONLY);
    if (f < 0) {
        perror("open");
        exit(1);
    }

    /* one read(2) system call per byte, until EOF (0) or error (< 0) */
    do {
        result = read(f, &c, 1);
        if (result < 0) {
            perror("read");
            exit(1);
        }
        //printf("%c", c);
    } while (result >= 1);

    return 0;
}

This should be quite equivalent to your second and third code fragments; it
does one read system call per byte, no buffering involved. This leads me to
believe that the slowdown occurs in your fileio.read wrapper, or maybe in
process setup/teardown.

Cheers
Benjamin
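
To tell the two suspects apart (per-call overhead in the read loop versus process setup/teardown), one option would be to time the loop from inside the program instead of relying on time(1) alone. The following is only a sketch under assumptions: it presumes POSIX clock_gettime(2) with CLOCK_MONOTONIC is available, and the helper name time_bytewise_read is made up for illustration; it is not part of the original program or of any library.

```c
#define _POSIX_C_SOURCE 200809L  /* request clock_gettime() on strict compilers */

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original mail): reads `path` one byte
 * per read(2) call and stores the wall-clock seconds spent in the read loop
 * alone in *elapsed. Returns the number of bytes read, or -1 on error.
 */
long time_bytewise_read(const char *path, double *elapsed)
{
    struct timespec start, end;
    ssize_t result;
    long count = 0;
    char c;
    int fd;

    assert(path != NULL && elapsed != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* Only the loop is timed, so process startup and exit are excluded. */
    clock_gettime(CLOCK_MONOTONIC, &start);
    while ((result = read(fd, &c, 1)) == 1)
        count++;
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (result < 0) {
        perror("read");
        close(fd);
        return -1;
    }
    close(fd);

    *elapsed = (double)(end.tv_sec - start.tv_sec)
             + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return count;
}
```

Calling time_bytewise_read("/usr/include/sys/socket.h", &secs) from main and comparing secs with the total reported by time(1) would show how much of the run time falls outside the read loop. If the loop itself is fast, the wrapper or the startup/teardown path is the more likely place to look.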