From: Kevin Oberman
To: Steve Tuts
Cc: freebsd-emulation@freebsd.org
Date: Wed, 6 Jun 2012 16:07:52 -0700
Subject: Re: one virtualbox vm disrupts all vms and entire network

On Wed, Jun 6, 2012
at 3:46 PM, Steve Tuts wrote:
> On Wed, Jun 6, 2012 at 3:50 AM, Bernhard Froehlich wrote:
>
>> On 05.06.2012 20:16, Bernhard Froehlich wrote:
>>
>>> On 05.06.2012 19:05, Steve Tuts wrote:
>>>
>>>> On Mon, Jun 4, 2012 at 4:11 PM, Rusty Nejdl wrote:
>>>>
>>>>> On 2012-06-02 12:16, Steve Tuts wrote:
>>>>>
>>>>>> Hi, we have a Dell PowerEdge server with a dozen interfaces. It hosts
>>>>>> a few guests of web app and email servers with VirtualBox-4.0.14. The
>>>>>> host and all guests are FreeBSD 9.0 64bit. Each guest is bridged to a
>>>>>> distinct interface. The host and all guests are set to 10.0.0.0
>>>>>> network NAT'ed to a Cisco router.
>>>>>>
>>>>>> This runs well for a couple months, until we added a new guest
>>>>>> recently. Every few hours, none of the guests can be connected. We can
>>>>>> only connect to the host from outside the router. We can also go to
>>>>>> the console of the guests (except the new guest), but from there we
>>>>>> can't ping the gateway 10.0.0.1 any more. The new guest just froze.
>>>>>>
>>>>>> Furthermore, on the host we can see a vboxheadless process for each
>>>>>> guest, including the new guest. But we can not kill it, not even with
>>>>>> "kill -9". We looked around the web and someone suggested we should
>>>>>> use "kill -SIGCONT" first since the "ps" output has the "T" flag for
>>>>>> that vboxheadless process for that new guest, but that doesn't help.
>>>>>> We also tried all the VBoxManage commands to poweroff/reset etc that
>>>>>> new guest, but they all failed complaining that vm is in Aborted
>>>>>> state. We also tried VBoxManage commands to disconnect the network
>>>>>> cable for that new guest, it didn't complain, but there was no effect.
>>>>>>
>>>>>> For a couple times, on the host we disabled the interface bridging
>>>>>> that new guest, then that vboxheadless process for that new guest
>>>>>> disappeared (we attempted to kill it before that). And immediately all
>>>>>> other vms regained connection back to normal.
>>>>>>
>>>>>> But there is one time even the above didn't help - the vboxheadless
>>>>>> process for that new guest stubbornly remains, and we had to reboot
>>>>>> the host.
>>>>>>
>>>>>> This is already a production server, so we can't upgrade virtualbox to
>>>>>> the latest version until we obtain a test server.
>>>>>>
>>>>>> Would you advise:
>>>>>>
>>>>>> 1. is there any other way to kill that new guest instead of rebooting?
>>>>>> 2. what might cause the problem?
>>>>>> 3. what setting and test I can do to analyze this problem?
>>>>>
>>>>> I haven't seen any comments on this and don't want you to think you are
>>>>> being ignored but I haven't seen this but also, the 4.0 branch was
>>>>> buggier for me than the 4.1 releases so yeah, upgrading is probably
>>>>> what you are looking at.
>>>>>
>>>>> Rusty Nejdl
>>>>
>>>> sorry, just realize my reply yesterday didn't go to the list, so am
>>>> re-sending with some updates.
>>>>
>>>> Yes, we upgraded all ports and fortunately everything went back and
>>>> especially all vms has run peacefully for two days now. So upgrading to
>>>> the latest virtualbox 4.1.16 solved that problem.
>>>>
>>>> But now we got a new problem with this new version of virtualbox:
>>>> whenever we try to vnc to any vm, that vm will go to Aborted state
>>>> immediately. Actually, merely telnet from within the host to the vnc
>>>> port of that vm will immediately Abort that vm. This prevents us from
>>>> adding new vms.
>>>> Also, when starting vm with vnc port, we got this message:
>>>>
>>>> rfbListenOnTCP6Port: error in bind IPv6 socket: Address already in use
>>>>
>>>> , which we found someone else provided a patch at
>>>> http://permalink.gmane.org/gmane.os.freebsd.devel.emulation/10237
>>>>
>>>> So looks like when there are multiple vms on a ipv6 system (we have
>>>> 64bit FreeBSD 9.0) will get this problem.
>>>
>>> Glad to hear that 4.1.16 helps for the networking problem. The VNC
>>> problem is also a known one but the mentioned patch does not work at
>>> least for a few people. It seems the bug is somewhere in libvncserver so
>>> downgrading net/libvncserver to an earlier version (and rebuilding
>>> virtualbox) should help until we come up with a proper fix.
>>
>> You are right about the "Address already in use" problem and the patch
>> for it so I will commit the fix in a few moments.
>>
>> I have also tried to reproduce the VNC crash but I couldn't. Probably
>> because my system is IPv6 enabled. flo@ has seen the same crash and has no
>> IPv6 in his kernel which led him to find this commit in libvncserver:
>>
>> commit 66282f58000c8863e104666c30cb67b1d5cbdee3
>> Author: Kyle J. McKay
>> Date:   Fri May 18 00:30:11 2012 -0700
>>
>>     libvncserver/sockets.c: do not segfault when listenSock/listen6Sock == -1
>>
>> http://libvncserver.git.sourceforge.net/git/gitweb.cgi?p=libvncserver/libvncserver;a=commit;h=66282f5
>>
>> It looks promising so please test this patch if you can reproduce the
>> crash.
>>
>> --
>> Bernhard Froehlich
>> http://www.bluelife.at/
>
> Sorry, I tried to try this patch, but couldn't figure out how to do that.
> I use ports to compile everything, and can see the file is at
> /usr/ports/net/libvncserver/work/LibVNCServer-0.9.9/libvncserver/sockets.c.
> However, if I edit this file and do make clean, this patch is wiped out
> before I can do "make" out of it. How to apply this patch in the ports?

To apply patches to ports:
# make clean
# make patch
# make
# make deinstall
# make reinstall

Note that the final two steps assume a version of the port is already
installed. If not: 'make install'

If you use portmaster, after applying the patch: 'portmaster -C net/libvncserver'

-- 
R. Kevin Oberman, Network Engineer
E-mail: kob6558@gmail.com
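
For reference, the sequence above can be sketched as a small dry-run helper that only prints the commands in order and runs nothing. The function name port_patch_plan, the "yes"/"no" installed flag, and the /usr/ports default for PORTSDIR are illustrative assumptions, not part of the ports tree. Placing the manual edit between "make patch" and "make" is likewise an assumption, drawn from the question (the edit must survive "make clean") rather than stated in the reply.

```shell
#!/bin/sh
# Dry-run sketch: print the ports commands from the reply, in order.
# port_patch_plan and its arguments are hypothetical names for illustration.

port_patch_plan() {
    port="$1"        # port origin, e.g. net/libvncserver
    installed="$2"   # "yes" if a version of the port is already installed
    portdir="${PORTSDIR:-/usr/ports}/${port}"   # assumed ports tree location

    echo "cd ${portdir}"
    echo "make clean"
    echo "make patch"
    # Assumption: hand edits go here, after the sources are extracted
    # and the port's own patches are applied, and before the build.
    echo "# edit the extracted sources under ${portdir}/work/ here"
    echo "make"
    if [ "${installed}" = "yes" ]; then
        echo "make deinstall"
        echo "make reinstall"
    else
        echo "make install"
    fi
}

# Example: the port is already installed, so deinstall/reinstall are used.
port_patch_plan net/libvncserver yes
```

Running the printed commands by hand (as root, inside the port directory) keeps each step visible. The key point is not to run "make clean" again after editing, since that removes the work directory along with the change.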