Date:      Wed, 6 Jun 2012 16:07:52 -0700
From:      Kevin Oberman <kob6558@gmail.com>
To:        Steve Tuts <yiz5hwi@gmail.com>
Cc:        freebsd-emulation@freebsd.org
Subject:   Re: one virtualbox vm disrupts all vms and entire network
Message-ID:  <CAN6yY1sVaqw3NPAraW452Rg7807q8zi1O=g2D_QZ93q4UE_Gqw@mail.gmail.com>
In-Reply-To: <CAEXKtDqR0F7btne62C%2Bw90qnWfP3kkU-E851b8WOf26yKeGBhg@mail.gmail.com>
References:  <CAEXKtDreCQ0O4NAi5opGm_KnR4As=dDvc-zP5Z0z5g84GQQuyg@mail.gmail.com> <assp.050217e07a.6a54445e6fa4183cff3692d9deed5635@ringofsaturn.com> <CAEXKtDrG%2Byj%2B4vOhhKrQcC6h9mEeFOtzHtJaV-UgPMrdn3xisQ@mail.gmail.com> <e1037b202b887d93142a4e693784f874@bluelife.at> <b8f2877663fb73d56222180f8b74cc81@bluelife.at> <CAEXKtDqR0F7btne62C%2Bw90qnWfP3kkU-E851b8WOf26yKeGBhg@mail.gmail.com>

On Wed, Jun 6, 2012 at 3:46 PM, Steve Tuts <yiz5hwi@gmail.com> wrote:
> On Wed, Jun 6, 2012 at 3:50 AM, Bernhard Froehlich <decke@freebsd.org> wrote:
>
>> On 05.06.2012 20:16, Bernhard Froehlich wrote:
>>
>>> On 05.06.2012 19:05, Steve Tuts wrote:
>>>
>>>> On Mon, Jun 4, 2012 at 4:11 PM, Rusty Nejdl <rnejdl@ringofsaturn.com>
>>>> wrote:
>>>>
>>>> On 2012-06-02 12:16, Steve Tuts wrote:
>>>>>
>>>>>> Hi, we have a Dell PowerEdge server with a dozen interfaces. It hosts a
>>>>>> few guests of web app and email servers with VirtualBox-4.0.14. The host
>>>>>> and all guests are FreeBSD 9.0 64bit. Each guest is bridged to a distinct
>>>>>> interface. The host and all guests are set to 10.0.0.0 network NAT'ed to
>>>>>> a Cisco router.
>>>>>>
>>>>>> This runs well for a couple months, until we added a new guest recently.
>>>>>> Every few hours, none of the guests can be connected. We can only connect
>>>>>> to the host from outside the router. We can also go to the console of the
>>>>>> guests (except the new guest), but from there we can't ping the gateway
>>>>>> 10.0.0.1 any more. The new guest just froze.
>>>>>>
>>>>>> Furthermore, on the host we can see a vboxheadless process for each guest,
>>>>>> including the new guest. But we can not kill it, not even with "kill -9".
>>>>>> We looked around the web and someone suggested we should use "kill
>>>>>> -SIGCONT" first since the "ps" output has the "T" flag for that
>>>>>> vboxheadless process for that new guest, but that doesn't help. We also
>>>>>> tried all the VBoxManage commands to poweroff/reset etc that new guest,
>>>>>> but they all failed complaining that vm is in Aborted state. We also tried
>>>>>> VBoxManage commands to disconnect the network cable for that new guest,
>>>>>> it didn't complain, but there was no effect.
>>>>>>
>>>>>> For a couple times, on the host we disabled the interface bridging that new
>>>>>> guest, then that vboxheadless process for that new guest disappeared (we
>>>>>> attempted to kill it before that). And immediately all other vms regained
>>>>>> connection back to normal.
>>>>>>
>>>>>> But there is one time even the above didn't help - the vboxheadless process
>>>>>> for that new guest stubbornly remains, and we had to reboot the host.
>>>>>>
>>>>>> This is already a production server, so we can't upgrade virtualbox to the
>>>>>> latest version until we obtain a test server.
>>>>>>
>>>>>> Would you advise:
>>>>>>
>>>>>> 1. is there any other way to kill that new guest instead of rebooting?
>>>>>> 2. what might cause the problem?
>>>>>> 3. what setting and test I can do to analyze this problem?
>>>>>> _______________________________________________
>>>>>>
>>>>>>
>>>>> I haven't seen any comments on this and don't want you to think you are
>>>>> being ignored but I haven't seen this but also, the 4.0 branch was buggier
>>>>> for me than the 4.1 releases so yeah, upgrading is probably what you are
>>>>> looking at.
>>>>>
>>>>> Rusty Nejdl
>>>>> _______________________________________________
>>>>>
>>>>>
>>>> sorry, just realize my reply yesterday didn't go to the list, so am
>>>> re-sending with some updates.
>>>>
>>>> Yes, we upgraded all ports and fortunately everything went back and
>>>> especially all vms has run peacefully for two days now. So upgrading to
>>>> the latest virtualbox 4.1.16 solved that problem.
>>>>
>>>> But now we got a new problem with this new version of virtualbox: whenever
>>>> we try to vnc to any vm, that vm will go to Aborted state immediately.
>>>> Actually, merely telnet from within the host to the vnc port of that vm
>>>> will immediately Abort that vm. This prevents us from adding new vms.
>>>> Also, when starting vm with vnc port, we got this message:
>>>>
>>>> rfbListenOnTCP6Port: error in bind IPv6 socket: Address already in use
>>>>
>>>> , which we found someone else provided a patch at
>>>> http://permalink.gmane.org/gmane.os.freebsd.devel.emulation/10237
>>>>
>>>> So looks like when there are multiple vms on a ipv6 system (we have 64bit
>>>> FreeBSD 9.0) will get this problem.
>>>>
>>>
>>> Glad to hear that 4.1.16 helps for the networking problem. The VNC problem
>>> is also a known one but the mentioned patch does not work at least for a
>>> few people. It seems the bug is somewhere in libvncserver so downgrading
>>> net/libvncserver to an earlier version (and rebuilding virtualbox) should
>>> help until we come up with a proper fix.
>>>
>>
>> You are right about the "Address already in use" problem and the patch for
>> it so I will commit the fix in a few moments.
>>
>> I have also tried to reproduce the VNC crash but I couldn't. Probably because
>> my system is IPv6 enabled. flo@ has seen the same crash and has no IPv6 in
>> his kernel which led him to find this commit in libvncserver:
>>
>>
>> commit 66282f58000c8863e104666c30cb67b1d5cbdee3
>> Author: Kyle J. McKay <mackyle@gmail.com>
>> Date:   Fri May 18 00:30:11 2012 -0700
>>     libvncserver/sockets.c: do not segfault when listenSock/listen6Sock == -1
>>
>> http://libvncserver.git.sourceforge.net/git/gitweb.cgi?p=libvncserver/libvncserver;a=commit;h=66282f5
>>
>>
>> It looks promising so please test this patch if you can reproduce the
>> crash.
>>
>>
>> --
>> Bernhard Froehlich
>> http://www.bluelife.at/
>>
>
> Sorry, I tried to try this patch, but couldn't figure out how to do that.
> I use ports to compile everything, and can see the file is at
> /usr/ports/net/libvncserver/work/LibVNCServer-0.9.9/libvncserver/sockets.c
> . However, if I edit this file and do make clean, this patch is wiped out
> before I can do "make" out of it. How to apply this patch in the ports?

To apply patches to ports:
# make clean
# make patch
<Apply patch>
# make
# make deinstall
# make reinstall

Note that the final two steps assume a version of the port is already
installed. If not: 'make install'
If you use portmaster, after applying the patch: 'portmaster -C net/libvncserver'
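
The sequence above could be wrapped in a small script, roughly as follows. This is only a sketch under some assumptions: the function names (run, apply_port_patch) and the DRY_RUN switch are made up for illustration, the patch is assumed to be a git-style diff that applies with -p1 from the top of the extracted source tree, and the patch file should be given as an absolute path because patch changes into the -d directory first. The port is assumed to be installed already, as in the steps above.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are hypothetical, not part of ports.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# apply_port_patch PORTDIR WRKSRC PATCHFILE
#   PORTDIR    port directory, e.g. /usr/ports/net/libvncserver
#   WRKSRC     extracted source tree under PORTDIR/work
#   PATCHFILE  absolute path to the diff (assumed to need -p1)
apply_port_patch() {
    portdir=$1
    wrksrc=$2
    patchfile=$3

    run make -C "$portdir" clean &&                 # start from a clean work dir
    run make -C "$portdir" patch &&                 # fetch, extract, apply port patches
    run patch -p1 -d "$wrksrc" -i "$patchfile" &&   # apply the extra patch by hand
    run make -C "$portdir" &&                       # build the patched source
    run make -C "$portdir" deinstall &&             # remove the installed version
    run make -C "$portdir" reinstall                # install the patched build
}
```

With DRY_RUN=1 the function only prints the six commands, which is a cheap way to check the paths before touching an installed port. The key point is the order: the extra patch goes in after 'make patch' and before 'make', so 'make clean' never gets a chance to wipe it.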
-- 
R. Kevin Oberman, Network Engineer
E-mail: kob6558@gmail.com


