Date: Mon, 8 Nov 2010 00:13:53 -0800 (PST)
From: DJ <fusionfoto@yahoo.com>
To: freebsd-questions@freebsd.org
Subject: zfs performance issues with iscsi (istgt)
Message-ID: <340170.48770.qm@web113309.mail.gq1.yahoo.com>
After scratching my head for a few weeks, I've decided to ask for some help.

First, I've got two machines connected by gigabit ethernet. Network performance is not a problem, as I am able to substantially saturate the wire when not using iSCSI (say, with iperf or ftp). Both systems are 8.1-RELENG. They are both multi-core with 8G of RAM.

Symptoms: When doing writes (size relatively independent) from a client to a server via iSCSI, I seem to be hitting a wall at 18-26MB/s of write throughput. This can be repeated continuously, whether doing a newfs on a 2TB iSCSI volume or doing a dd from /dev/zero to the iSCSI target. I haven't compared read performance. What originally put me on to this was watching the newfs *fly* across the screen, then hang for several seconds, then *fly* again, and then pause.

This looked like a write-delay problem, so I tweaked the txgwrite values and/or the synctime values. This showed some improvement (iostat showed something closer to continuous write performance to the server), but there was still a delay whether the write_limit was 384MB or all the way up to 4GB. This tells me the spindles weren't holding the throughput back. The rate iostat reported was never much beyond 20-26MB/s; peaks were frequently two to three times that, but then it would drop to 1MB/s for a few seconds, which would bring us back to this average. CPU and network load were never the limiting factor, nor did the spindles ever get above 20-30% busy.

So I added two USB keys that write at around 30-40MB/s and mirrored them as a ZIL log. iostat verifies they are being used, but not continuously; it seems that the txgwrite value applies to writing to the ZIL. I also tried turning off the ZIL and saw no particular performance increase (or decrease). With newfs (which jumps around a lot more than dd), the throughput does not change much at all.
Even at 26K-40K pps, interrupt loads and such are not problematic, and turning on polling does not change the performance appreciably.

The "server" is a RAIDZ2 of 15 drives @ 2TB each. So *write* throughput should be pretty fast sequentially (i.e. the dd case), but the results are identical. This server does nothing much but istgt. I tried NCQ values from 255 down to 32, to no improvement.

Even though network performance was not showing a particular limit, I *did* get from 18MB/s to 26MB/s by tweaking tcp sendbuf* and tcp send* values way beyond reason, even though TCP throughput hadn't been a problem in non-iSCSI operations.

So whatever I'm doing is not addressing the particular problem. The drives have plenty of available I/O, but instead of using it, or the RAM in the system, or the ZIL in the system, the server sits largely idle, then pegs itself with continuous (but not max-speed) writes, halts the network transfers, and then continues on its way.

Even if it's a threading issue (i.e. we are single threading), there should be some way to make this behave like a normal system, considering how much RAM, SSD, and other resources I'm trying to throw at this thing. For example, after the buffer starts to empty, additional writes from the client should be accepted, and NCQ should help reorder them so they are processed in an efficient fashion, etc.

istgt settings:

istgt version 0.3
istgt extra version 20100707

    MaxSessions              32
    MaxConnections           32
    FirstBurstLength         65536
    MaxBurstLength           262144
    MaxRecvDataSegmentLength 262144

Local benchmarks like

    dd if=/dev/zero of=/tank/dump bs=1M count=12000

return around 200MB/s: 12582912000 bytes transferred in 61.140903 secs (205801867 bytes/sec), and show continuous (as expected) writes to the spindles.
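
As a quick sanity check on the dd figure above, here is a minimal Python sketch that recomputes the local rate from the reported byte count and elapsed time and compares it with the 18-26MB/s seen over iSCSI. Python is not part of the setup described here, the helper name is purely illustrative, and the sketch assumes "MB/s" means decimal megabytes per second.

```python
# Recompute the local dd throughput from the figures quoted in this message.
# The function name and variable names are illustrative, not from any tool.

def throughput_bytes_per_sec(total_bytes: int, seconds: float) -> float:
    """Average throughput over the whole transfer."""
    return total_bytes / seconds

# dd was run with bs=1M count=12000, i.e. 12000 blocks of 1 MiB each.
total_bytes = 12000 * 1024 * 1024   # 12582912000 bytes, as dd reported
elapsed = 61.140903                 # seconds, as dd reported

local_rate = throughput_bytes_per_sec(total_bytes, elapsed)
local_mb_s = local_rate / 1e6       # assumption: decimal megabytes per second

# iSCSI write range reported earlier in this message, in MB/s.
iscsi_low, iscsi_high = 18.0, 26.0

print(f"local dd: {local_mb_s:.1f} MB/s")
print(f"local is {local_mb_s / iscsi_high:.1f}x to "
      f"{local_mb_s / iscsi_low:.1f}x the iSCSI write rate")
```

The recomputed rate agrees with the bytes/sec value dd printed, which puts the local pool at several times the rate observed through istgt.
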
(200MB/s is pretty close to the max I/O speed we can expect given the port the controller is in, RAID overhead, etc. with 7200 RPM drives; at 5900 RPM the number is about 80MB/s.)

If this is an istgt problem, is there a way to get reasonable performance out of it?

I know I'm not losing my mind here, so if someone has tackled this particular problem (or its sort), please chime in and let me know what tunable I'm missing. :)

Thanks very much, in advance,

DJ
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?340170.48770.qm>