Date: Thu, 30 Jun 2005 23:00:56 +0200
From: Eirik Øverby <eirik@unicore.no>
To: Brian Fundakowski Feldman <green@freebsd.org>
Cc: stable@freebsd.org
Subject: Re: Jails that won't die...
Message-ID: <67DA5F6F-62D2-4371-8707-CFB06B16E269@unicore.no>
In-Reply-To: <20050630205629.GG1074@green.homeunix.org>
References: <92135CB3-5540-4D06-A991-708C8AAD6AC7@unicore.no> <20050628145859.GC1074@green.homeunix.org> <CA38D1F9-3976-4DE9-BED1-DB8935EDD1D4@unicore.no> <20050629185803.GE1074@green.homeunix.org> <23ED6035-A1AE-4F38-853F-D0D42D42E934@unicore.no> <20050630205629.GG1074@green.homeunix.org>

On 30. jun. 2005, at 22.56, Brian Fundakowski Feldman wrote:

> On Thu, Jun 30, 2005 at 03:53:56PM +0200, Eirik Øverby wrote:
>
>> On 29. jun. 2005, at 20.58, Brian Fundakowski Feldman wrote:
>>
>>> On Wed, Jun 29, 2005 at 03:28:09PM +0200, Eirik Øverby wrote:
>>>
>>>> On 28. jun. 2005, at 16.58, Brian Fundakowski Feldman wrote:
>>>>
>>>>> On Tue, Jun 28, 2005 at 10:37:29AM +0200, Eirik Øverby wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have, since upgrading to 5.x and updating my management tools,
>>>>>> seen a number of problems relating to stopping jails.
>>>>>>
>>>>>> I'm maintaining several hosts with a number of full-featured jails
>>>>>> (i.e. full virtual FreeBSD installations in each jail), and in
>>>>>> general this works fine. However, whenever I stop a jail using
>>>>>> 'jexec <id> kill -SIGNAL -1' or 'jexec <id> /bin/sh /etc/rc.shutdown'
>>>>>> (in various combinations), jails have a tendency to stick around for
>>>>>> minutes or hours - according to 'jls'. Often I see an entry in
>>>>>> 'netstat -a' indicating that there is one or more sockets in
>>>>>> FIN_WAIT state, preventing the jail from coming down. Taking the
>>>>>> virtual network interface (alias) down does not help. All I can do
>>>>>> at this point is wait.
>>>>>>
>>>>>> I normally use 'jls' to determine whether or not a jail can be
>>>>>> restarted (i.e. it's not running), but this is pretty useless in
>>>>>> such cases. And right now I have a case where 'netstat -a' shows me
>>>>>> nothing pertaining to the jail, though it has no processes running.
>>>>>> I have therefore force-started the jail again, which seems to work
>>>>>> nicely, but now 'jls' gives me two entries for this jail, with
>>>>>> different JIDs.
>>>>>>
>>>>>> What am I doing wrong here?
>>>>>
>>>>> You could just use ps to check for jailed processes and check their
>>>>> respective jails using the procfs status entry (at least according
>>>>> to the ps manpage...)
>>>>
>>>> My jailctl script can do both - list by jls and list by processes in
>>>> the jail. There are NO processes running in the jail.
>>>
>>> So it's obviously not running, and you can mark its state as such.
>>
>> ...which is what I do on FreeBSD 4.x, but on 5.x the 'jls' command
>> still claims the jail is running. I think this is unbelievably
>> dirty. Also, using /proc to determine if a jail is still running is a
>> bad idea, as mounting /proc is deprecated.
>
> The deprecation is due to security concerns, not bit-rot. You can
> just mount it with root-readable-only permissions. The jls for
> current isn't incorrect, you're just expecting a different criterion to
> mean "alive" than it is using. It would take increased kernel
> complexity to do what you want if you're not going to do it in
> userland.

I am aware of that. However, I have seen instabilities with /proc as
well, but that's another story.

> Anyway, why aren't you just using a /var/run file in the "real" system
> to tell whether the jail is running or not? It's the corollary to
> pid files versus doing "killall"... Just seems like something really
> trivial to implement as you like it in the userland.

Sure, this is what I fall back on when running my jailctl script
(/usr/ports/sysutils/jailctl) on 4.x. However, I NEED 'jls' to be
correct, because I use it to inject other processes (like executing
shutdown scripts inside the jails when taking them down, etc.). I
suppose I could sort the output of jls on jail id and always use
whichever instance of a jail has the highest ID, but I don't know how
these IDs work - if they are recycled, if they "wrap around" at some
point, etc.

In any case it would be nice to know exactly which criteria jls uses
- and perhaps a way to remove whichever criterion keeps it
thinking the jail is still running.

Thing is - sometimes jails stop just fine. Other times they don't. It
all depends. Perhaps I should get lsof or something, see if there are
any open files (though I think I tried once without finding any)...

/Eirik

> --
> Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
>   <> green@FreeBSD.org                               \  The Power to Serve! \
>       Opinions expressed are my own.                   \,,,,,,,,,,,,,,,,,,,,,,\
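
The workaround floated above, sorting the jls output by jail ID and using the highest ID for a given jail, could be sketched roughly as follows. This is only an illustration, not part of jailctl: the function name `newest_jid`, the sample hostnames and the sample paths are made up, and it assumes the jls column layout is JID, IP address, hostname, path with no whitespace inside any field. It also assumes, without confirming, that a newer instance of a jail always gets a higher JID, which is exactly the open question about recycling and wrap-around.

```python
def newest_jid(jls_output, hostname):
    """Return the highest JID listed for hostname, or None if it is absent.

    Assumes each data row is: JID, IP address, hostname, path.
    """
    jids = []
    for line in jls_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row (its first field is not numeric).
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        jid, host = int(fields[0]), fields[2]
        if host == hostname:
            jids.append(jid)
    # Assumption: the most recently started instance has the largest JID.
    return max(jids) if jids else None


# Hypothetical jls output with a stale and a fresh entry for the same jail.
sample = """\
   JID  IP Address      Hostname                      Path
     3  10.0.0.5        www.example.org               /jails/www
     7  10.0.0.5        www.example.org               /jails/www
     4  10.0.0.6        mail.example.org              /jails/mail
"""

print(newest_jid(sample, "www.example.org"))
```

In real use the text would come from running jls itself, and the result would be passed to jexec; if JIDs turn out to be reused, the "highest wins" rule no longer holds and a /var/run state file would be the safer source.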