Date: Thu, 05 Apr 2007 03:44:27 -0700
From: "Chris H." <chris#@1command.com>
To: freebsd-stable@freebsd.org
Subject: Re: NFS == lock && reboot
Message-ID: <20070405034427.7apn8a1lc8s4wkok@webmail.1command.com>
In-Reply-To: <200704050800.l35805AQ086224@lurza.secnetix.de>
References: <200704050800.l35805AQ086224@lurza.secnetix.de>
Quoting Oliver Fromme <olli@lurza.secnetix.de>:

> Chris H. <chris#@1command.com> wrote:
> > Oliver Fromme wrote:
> > > [...]
> > > However, I don't think that your actual problem (lock-up
> > > and panics) is related to rpc.lockd or rpc.statd. It
> > > rather sounds like something else is wrong with your
> > > machine. NFS works perfectly fine for me, including
> > > copying huge files.
> > >
> > > You wrote that you had a lot of crashes that accumulated
> > > many files in lost+found. Well, maybe your filesystem
> > > was somehow damaged in the process. It is possible to
> > > damage file systems in a way that can lead to panics, and
> > > it's not necessarily detected and repaired by fsck.
> >
> > Indeed. I /too/ considered this. However, I largely dismissed this
> > as a possibility as most all of them are 0 length in size. The others
> > are fragments of logs. I'm not /completely/ ruling this out though.
>
> The files in lost+found aren't the problem. The problem
> is the things that you cannot see, and fsck won't move
> those to lost+found.
>
> In particular, if you use softupdates on drives that have
> write-caching enabled, or on drives that illegally cache
> data even if it's disabled (be it intentionally or because
> of bugs in the firmware), it's almost guaranteed that the
> FS will take damage beyond repair on a crash, and even more
> so after several crashes.
>
> Another potential cause of problems is the background fsck
> feature in FreeBSD 6. I'm not sure if it has been fixed
> in 6-stable, maybe it has. I don't want to spread FUD.
> But in the past, if a machine crashed and rebooted during
> a background fsck, that was almost a guarantee for damage
> beyond repair, too. That's why I always disable background
> fsck on my machines. (Let me repeat: It _might_ be fixed
> in 6-stable, I don't know. I haven't seen a definitive
> confirmation of it being fixed on the mailing lists so
> far. If somebody knows otherwise, please correct me.)

Greetings, and thank you for your thoughtful reply.

Understood on all points. As mentioned, I wasn't /completely/ ruling
that out. I have always refused to permit background fsck. /Not/
because of any lack of faith in FBSD. Frankly, I have nothing /but/
faith - perhaps more than I ought to. Rather, it's because I insist on
keeping tabs on what's going on /at all times/. So, should the system
crash, shut down, or halt for any reason, the BIOS will keep it in a
"shutdown" state should it gain control. In the case of a kernel
reboot/crash, the loader simply sits and awaits my confirmation before
starting the system. That way I am always guaranteed the opportunity
to start in single user mode and answer any anomalies that the system
reports with an affirmative/negative.

So, in summary, I am /not/ completely ruling out your suggestion that
irreparable damage has been done as a result of the multitude of
crashes imposed upon it. I am also grateful for your taking the time
to share your experiences and insight with me. I simply haven't found
anything /definitive/ yet. Kris might argue here that NFS seems to be
working fine for everyone else, which would also add credence to your
theory. Both of you may indeed be correct. :)

I just think it'd be worth the time to follow through, make a dump
device, and crash it to find the /definitive/ reason for this. It may
in fact turn out to be some obscure, near-impossible anomaly in the
NFS code that /I/ was just (un)lucky enough to stub my toe on. :)

At any rate, as this is a production server - and a /real/ busy one at
that - I want to get a (confirmed) good backup off of it before
willingly bashing it any further. It currently serves the largest
Netscape browser client archive on the net. They are all the 0.x - 4.x
series browser clients. You'd be amazed how popular they are, and how
many people still use them.

So, as backing it up onto the NFS-mounted backup server is currently
out of the question, and there's more than a terabyte of browser
clients alone, it's going to take me a little longer to follow through
with the dump device -> crash -> dump -> backtrace than it would
otherwise - but it will be done. :)

Thank you again for taking the time to share your thoughts,
suggestions and experiences. I really appreciate it.

--Chris

>
> Best regards
> Oliver
>
> --
> Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
> Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung:
> secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht
> München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
>
> FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd
>
> "Python is an experiment in how much freedom programmers need.
> Too much freedom and nobody can read another's code; too little
> and expressiveness is endangered."
> -- Guido van Rossum
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>

--
panic: kernel trap (ignored)
-----------------------------------------------------------------
FreeBSD 5.4-RELEASE-p12 (SMP - 900x2) Tue Mar 7 19:37:23 PST 2006
/////////////////////////////////////////////////////////////////