Date: Tue, 13 Oct 2009 15:33:56 +0200
From: Ivan Voras <ivoras@freebsd.org>
To: Robert Watson <rwatson@freebsd.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: Extreme console latency during disk IO (8.0-RC1, previous releases also affected according to others)
Message-ID: <9bbcef730910130633w150571a0k461fb4e67a51fb1d@mail.gmail.com>
In-Reply-To: <alpine.BSF.2.00.0910131406340.26071@fledge.watson.org>
References: <E316139E-FFCF-432F-8DCE-62B120C38E55@exscape.org> <CC16B639-7A75-4016-A8A8-5C59E9CD5E95@exscape.org> <hb1qs0$qjd$1@ger.gmane.org> <alpine.BSF.2.00.0910131406340.26071@fledge.watson.org>

2009/10/13 Robert Watson <rwatson@freebsd.org>:
>
> On Tue, 13 Oct 2009, Ivan Voras wrote:
>
>> Thomas Backman wrote:
>>>
>>> I'm copying this over from the freebsd-performance list, as I'm looking
>>> for a few more opinions - not on the problems *I* am having, but rather to
>>> check whether the problem is universal or not, and if not, find a possible
>>> common factor. In other words: I want to hear about your experiences, *good
>>> or bad*!
>>>
>>> Here's the original thread (not from the beginning, though):
>>> http://lists.freebsd.org/pipermail/freebsd-performance/2009-October/003843.html
>>>
>>> Long story short, my version: when the disk is stressed hard enough,
>>> console IO becomes COMPLETELY unbearable. 10+ seconds to switch between
>>> windows in screen(1), running (or even typing) simple commands, etc. This
>>> happens both via SSH and the serial console.
>>
>> Hmm, this looks familiar - I've noticed it before on the physical (VGA)
>> console and I notice it all the time under VMWare. It sort of looks like
>> disk IO really blocks network IO in this case - I use the VMs over ssh.
>
> Real hardware and virtual hardware have vastly different performance
> properties, so I'd be careful not to assume that the issue described by the
> original reporter and the issue you're experiencing are the same. In our
> kernel, low level network protocols will essentially always take precedence
> over disk I/O activity. So on face value "disk IO really blocks network IO"
> is highly unlikely.

Yes, I agree for both reasons and that is why I wasn't complaining
until encountering this thread.

> There are two much more likely possibilities: (1) poor VM implementation
> causes the virtual CPU to be suspended behind synchronous host OS I/O or (2)
> the network stack is running fine but the interactive user application is
> getting I/O or locks scheduled behind a bulk process.
>
> A useful diagnostic here is to compare the behavior of three kinds of
> network latency tests:
>
> (1) ping from the host OS to the guest OS
> (2) netperf TCP_RR from the host OS to the guest OS
> (3) ssh interactive latency
>
> If (1) is highly variable during I/O, it's almost certainly a property of
> the VM technology you're using, and there's nought to be done about it in
> the guest OS.

Here's an example of a ping session with 0.1s resolution during a
stall of a few seconds in ssh:

64 bytes from 161.53.72.188: icmp_seq=1576 ttl=64 time=0.383 ms
64 bytes from 161.53.72.188: icmp_seq=1577 ttl=64 time=0.405 ms
64 bytes from 161.53.72.188: icmp_seq=1578 ttl=64 time=0.360 ms
64 bytes from 161.53.72.188: icmp_seq=2304 ttl=64 time=4.194 ms
64 bytes from 161.53.72.188: icmp_seq=2305 ttl=64 time=0.454 ms
64 bytes from 161.53.72.188: icmp_seq=2306 ttl=64 time=0.376 ms

Note the huge packet loss: icmp_seq jumps from 1578 to 2304. It looks
like it's the VM's fault or something like it.
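
For anyone who wants to repeat test (1) and spot stalls without reading the output by eye, the following is a minimal sketch of a gap detector for captured ping output. It is not part of the original exchange. The names `find_seq_gaps` and `SAMPLE` are made up for illustration, and the script assumes each reply line carries an `icmp_seq=N` field as in the output above. It ignores sequence-number wraparound.

```python
import re

# Matches the icmp_seq field that ping prints on each reply line.
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def find_seq_gaps(lines):
    """Return (last_seen, next_seen, missing) for every jump in icmp_seq.

    Assumption: sequence numbers only increase; wraparound is not handled.
    """
    gaps = []
    prev = None
    for line in lines:
        match = SEQ_RE.search(line)
        if match is None:
            continue  # skip banners, summaries, and blank lines
        seq = int(match.group(1))
        if prev is not None and seq > prev + 1:
            gaps.append((prev, seq, seq - prev - 1))
        prev = seq
    return gaps


# The six reply lines quoted in the message above.
SAMPLE = """\
64 bytes from 161.53.72.188: icmp_seq=1576 ttl=64 time=0.383 ms
64 bytes from 161.53.72.188: icmp_seq=1577 ttl=64 time=0.405 ms
64 bytes from 161.53.72.188: icmp_seq=1578 ttl=64 time=0.360 ms
64 bytes from 161.53.72.188: icmp_seq=2304 ttl=64 time=4.194 ms
64 bytes from 161.53.72.188: icmp_seq=2305 ttl=64 time=0.454 ms
64 bytes from 161.53.72.188: icmp_seq=2306 ttl=64 time=0.376 ms
""".splitlines()

if __name__ == "__main__":
    for last_seen, next_seen, missing in find_seq_gaps(SAMPLE):
        print(f"seq {last_seen} -> {next_seen}: {missing} replies missing")
```

On the sample above it reports a single gap between sequence numbers 1578 and 2304. Running the same check on a ping log taken while the guest is idle would give a baseline to compare against.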