Date: Fri, 24 Jul 2009 08:55:46 -0700 (PDT)
From: Richard Mahlerwein <mahlerrd@yahoo.com>
To: freebsd-questions@freebsd.org
Subject: Re: VMWare ESX and FBSD 7.2 AMD64 guest
Message-ID: <571888.63015.qm@web51010.mail.re2.yahoo.com>
> From: John Nielsen <lists@jnielsen.net>
> Subject: Re: VMWare ESX and FBSD 7.2 AMD64 guest
> To: freebsd-questions@freebsd.org
> Cc: "Steve Bertrand" <steve@ibctech.ca>
> Date: Friday, July 24, 2009, 10:22 AM
>
> On Thursday 23 July 2009 19:44:15 Steve Bertrand wrote:
> > This message has a foot that has nearly touched down over the OT
> > borderline.
> >
> > We received an HP Proliant DL360G5 collocation box yesterday that has
> > two processors, and 8GB of memory.
> >
> > All the client wants to use this box for is a single instance of Windows
> > web hosting. Knowing the sites the client wants to aggregate into IIS, I
> > know that the box is far over-rated.
> >
> > Making a long story short, they have agreed to allow us to put their
> > Windows server inside of a virtualized container, so we can use the
> > unused horsepower for other VMs (test servers etc).
> >
> > My problem is performance.
> > I'm only willing to make this box virtual if
> > I can keep the abstraction performance loss to <25% (my ultimate goal
> > would be 15%).
> >
> > The following is what I have, followed by my benchmark findings:
> >
> > # 7.2-RELEASE AMD64
> >
> > FreeBSD 7.2-RELEASE #0: Fri May  1 07:18:07 UTC 2009
> >     root@driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC
> >
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > CPU: Intel(R) Xeon(R) CPU 5150 @ 2.66GHz (2666.78-MHz K8-class CPU)
> >   Origin = "GenuineIntel"  Id = 0x6f6  Stepping = 6
> >
> > usable memory = 8575160320 (8177 MB)
> > avail memory  = 8273620992 (7890 MB)
> >
> > FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> >  cpu0 (BSP): APIC ID: 0
> >  cpu1 (AP): APIC ID: 1
> >  cpu2 (AP): APIC ID: 6
> >  cpu3 (AP): APIC ID: 7
>
> Did you give the VM 4 virtual processors as well? How much RAM did it
> have? What type of storage does the server have? Did the VM just get a
> .vmdk on VMFS? What version of ESX?
>
> > Benchmarks:
> >
> > # time make -j4 buildworld (under vmware)
> >
> > 5503.038u 3049.500s 1:15:46.25 188.1%  5877+1961k 3298+586716io 2407pf+0w
> >
> > # time make -j4 buildworld (native)
> >
> > 4777.568u 992.422s 33:02.12 291.1%  6533+2099k 25722+586485io 3487pf+0w
>
> Note that the "user" time is within your 15% margin (if you round to the
> nearest percent). The system time is what's running away. My guess is
> that that is largely due to disk I/O and virtualization of same. What you
> can do to address this depends on what hardware you have. Giving the VM a
> raw slice/LUN/disk instead of a .vmdk file may improve matters somewhat.
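[To make the comparison concrete, here is a quick sketch of my own (plain /bin/sh plus awk, figures copied from the time(1) fields quoted above) that recomputes the overhead percentages:]

```shell
# Slowdown of the VM run relative to native, from the time(1) output above.
vm_user=5503.038;  native_user=4777.568     # user CPU seconds
vm_sys=3049.500;   native_sys=992.422       # system CPU seconds

# Percent slowdown of the first argument (VM) relative to the second (native).
pct() { awk -v v="$1" -v n="$2" 'BEGIN { printf "%.1f", 100 * (v - n) / n }'; }

echo "user overhead: $(pct "$vm_user" "$native_user")%"   # ~15.2%, at the goal
echo "sys overhead:  $(pct "$vm_sys"  "$native_sys")%"    # ~207%, the runaway
```

[The user-time overhead lands right around the stated 15% goal, while system time is roughly three times native, which supports the disk-I/O-virtualization theory.]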
> If you do use a disk file, be sure that it lives on a stripe (or whatever
> unit is relevant) boundary of the underlying storage. Ways to do that (if
> any) depend on the storage. Improving the RAID performance, etc. of the
> storage will improve your benchmark overall, and may or may not narrow
> the divide.
>
> The (virtual) storage driver (mpt IIRC) might have some parameters you
> could tweak, but I don't know about that off the top of my head.
>
> > ...both builds were from the exact same sources, and both runs were
> > running with the exact same environment. I was extremely careful to
> > ensure that the environments were exactly the same.
> >
> > I'd appreciate any feedback on tweaks that I can make (either to
> > VMWare, or FreeBSD itself) to make the virtualized environment much
> > more efficient.
>
> See above about storage. Similar questions come up periodically;
> searching the archives if you haven't already may prove fruitful. You may
> want to try running with different kernel HZ settings, for instance.
>
> I would also try to isolate the performance of different components and
> evaluate their importance for your actual intended load. CPU and RAM
> probably perform like you expect out of the box. Disk and network I/O
> won't be as close to native speed, but the difference and the impact are
> variable depending on your hardware and load.
>
> A lightly-loaded Windows server is the poster child of virtualization
> candidates. If your decision is to dedicate the box to Winders or to
> virtualize and use the excess capacity for something else, I would say
> it's a no-brainer if the cost of ESX isn't a factor (or if ESXi gives you
> similar performance).
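[As a rough illustration of the stripe-boundary point above, here is a sketch with numbers of my own choosing, not from the thread: a 64 KiB per-disk stripe and the classic 63-sector MBR first-partition offset.]

```shell
# Does a partition/disk-file offset land on a stripe boundary?
stripe=$((64 * 1024))      # per-disk stripe (chunk) size in bytes; array-dependent
offset=$((63 * 512))       # classic MBR first-partition offset: 63 x 512 B sectors

if [ $((offset % stripe)) -eq 0 ]; then
    echo "offset $offset bytes: stripe-aligned"
else
    echo "offset $offset bytes: misaligned (stripe-crossing I/O touches two disks)"
fi
```

[With a 63-sector offset the check comes out misaligned; starting the partition at, say, 2048 sectors (1 MiB) aligns it for any power-of-two stripe size up to 1 MiB.]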
> If that's already a given and your decision is between running a specific
> FreeBSD instance on the ESX host or on its own hardware, then you're wise
> to spec out the performance differences.
>
> HTH,
>
> JN

If I recall correctly from ESX (well, VI) training*, there may be a minor scheduling issue affecting things here. If you set up the VM with 4 processors, ESX schedules time on the CPU only when there are 4 things to execute (well, there's another time period it also uses, so even a single thread will get run eventually, but anyway...). The physical instance will run one thread immediately even if there's nothing else waiting, whereas the VM will NOT necessarily execute a single thread immediately. I would retry using perhaps -j8 or even -j12 to make sure the 4 CPUs see plenty of work to do, and see if the numbers don't slide closer to one another.

For what it's worth, if a raw LUN were made available to the VM, the disk performance of that LUN should very nearly match native performance, because it IS native performance. VMWare (if I understood right in the first place and remember correctly as well; I suppose I should * this too. :) ) doesn't add anything to slow that down. Plugging a USB drive into the host and making it available to the guest would also run at native USB/drive speeds, assuming you can do that (I've never tried to use USB drives on our blade center!).

-Rich

*Since I'm recalling it, the standard caveats about my bad memory apply. In this case, there are also caveats about the VI instructor's bad memory, too. :)
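[A minimal sketch of the retest suggested above, assuming the sources live in /usr/src; by default it only prints the plan (set RUN=1 on the actual box), so nothing heavy runs by accident:]

```shell
# Sweep make -j concurrency so the 4-vCPU guest always has runnable threads
# for ESX's co-scheduler; compare the resulting time(1) figures afterwards.
RUN=${RUN:-0}              # RUN=1 to actually build; default is a dry run
plan=""
for j in 4 8 12; do
    if [ "$RUN" = "1" ]; then
        # Log paths are an assumption; adjust to taste.
        ( cd /usr/src && time make -j"$j" buildworld ) > "/tmp/bw-j$j.log" 2>&1
    fi
    plan="$plan time make -j$j buildworld;"
done
echo "$plan"
```

[If the co-scheduling theory holds, the -j8 and -j12 runs should close some of the gap against the native -j4 numbers; if they don't, the bottleneck is more likely the virtualized disk path.]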