Date: Tue, 17 Nov 2015 18:57:32 +0100
From: Julien Cigar <jcigar@ulb.ac.be>
To: Gerhard Schmidt <schmidt@ze.tum.de>
Cc: Adam Vande More <amvandemore@gmail.com>, FreeBSD Questions <freebsd-questions@freebsd.org>
Subject: Re: Random Lockup with FreeBSD 10.2 on SuperMicro Boards
Message-ID: <20151117175731.GY2604@mordor.lan>
In-Reply-To: <564B5D83.5000909@ze.tum.de>
References: <56498205.3060806@ze.tum.de> <20151116094334.GS2604@mordor.lan> <5649A761.7040303@ze.tum.de> <20151116111609.a9757a4a.freebsd@edvax.de> <5649AEC3.5090104@ze.tum.de> <20151116164507.GA87691@neutralgood.org> <CA+tpaK3065Tw_NC=VXa0Pq3ZD_mXUcHhvoVrSOw8feHr7i5gaw@mail.gmail.com> <564B5D83.5000909@ze.tum.de>
On Tue, Nov 17, 2015 at 06:01:55PM +0100, Gerhard Schmidt wrote:
> On 17.11.2015 17:23, Adam Vande More wrote:
> > On Mon, Nov 16, 2015 at 10:45 AM, <kpneal@pobox.com> wrote:
> >
> > When in doubt use 'fsck -f' to force a check despite the filesystem
> > being marked clean.
> >
> > Yes, but a full fsck should be run on a regular basis regardless of
> > suspicion.
> >
> > Personally, I got bit by SU (plain) a long time ago and I've never
> > really trusted it since. I strongly advise you to 'fsck -f' on your
> > /var just to rule out _any_ corruption there.
> >
> > A lower level fs error isn't going to be detected by a background
> > fsck (only does preening) or SUJ fsck (trusts the journal). Such errors
> > can occur on *any* journaled fs. Periodically doing a full fsck on fs's
> > is actually something Linux does better.
> >
> > https://lists.freebsd.org/pipermail/freebsd-current/2013-July/042951.html
> >
> > Many think SU or SUJ obviate the need for a periodic full fsck. They do
> > not. SU and SUJ devs have repeated this since their respective
> > inception. [1] Hardware still lies, bitrot still occurs, do a full
> > fsck. Vague reports of "I don't trust this" aren't helpful. If you
> > know of a bug, please report it so it can be addressed.
> >
> > [1]
> > https://lists.freebsd.org/pipermail/freebsd-arch/2010-January/009872.html --
> > Well, initially it's claimed to "eliminate fsck after an unclean shutdown",
> > but it details later that fsck using the journal isn't a full fsck.
>
> Let's get back to the topic. There is no corruption. And even if there
> were, that would be a software bug that has to be fixed. This is not
> biology, where something happens spontaneously. This is computer science.
> If there is something wrong, there are only three explanations.
> The user did something wrong: not likely here. There is a hardware
> error: on three different servers after roughly the same amount of time,
> not very likely either. So it's cause number three: a bug in the software.
>
> As I said, I have 76 servers running FreeBSD (various versions from 8.4
> to 10.2). Only 3 of them are 10.2 (5 since yesterday), and of the three
> that have run 10.2 for longer than a month, 100% have had this problem at
> least once. Of the 73 other servers, 0% have had this problem; 45 of them
> are the exact same hardware, and all of them have been running
> considerably longer than one month.
>
> And as for the fscks: the last time I had to do a fsck on any partition,
> apart from hardware failures, was about two and a half years ago, when our
> UPS died and killed the power. And apart from some logfiles, even then
> there was no corruption. I have filesystems on production servers that
> have gone 8 years without a fsck. I have never had problems with UFS SU
> and UFS SU-J.
>
> Sorry guys, there is no problem with UFS on FreeBSD.

Couldn't you disable SU+J on just one of them? It would be worth trying at
least. I never had any problem with SU, but I'm sorry to say that SU+J
almost never worked for me (see PR 203588 for the latest problem that I
had). I'll repeat myself, but I had random lockups on some HP ProLiant
servers here too (without any corruption) with SU+J. Since I disabled
journaling, the lockups "automagically" disappeared.

>
> I agree that if there is an unclean shutdown you might want to do a
> complete fsck. But in the case discussed here the unclean shutdown was a
> result of the lockup, not the other way round.
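As a rough sketch of the test suggested above (turn off SU+J on one machine,
then force a full fsck), the steps might look like the following. The device
name /dev/da0p2 and the function name suj_disable_plan are placeholders and
not taken from this thread. The function only prints the commands so they can
be reviewed first. It assumes the filesystem is unmounted or mounted
read-only (for example in single-user mode), since tunefs(8) cannot change a
filesystem that is mounted read-write.

```shell
#!/bin/sh
# Print (do not run) the commands for switching one UFS filesystem from
# SU+J back to plain soft updates and then forcing a full check.
# The device argument is a placeholder chosen by the caller.
suj_disable_plan() {
    dev="$1"
    echo "tunefs -p $dev"          # show the current flags, including journaling
    echo "tunefs -j disable $dev"  # turn off soft updates journaling only
    echo "fsck -f $dev"            # full check even if marked clean
    echo "tunefs -p $dev"          # confirm the journaling flag is now off
}

# Hypothetical device name, for illustration only.
suj_disable_plan /dev/da0p2
```

Plain soft updates stay enabled after this; only the journal is turned off.
Once the filesystem is mounted again, the leftover .sujournal file in its
root can usually be removed to reclaim the space.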
>
> Regards
>    Estartu
>
> --
> -------------------------------------------------
> Gerhard Schmidt       | E-Mail and JabberID:
> TU-München            | schmidt@ze.tum.de
> WWW & Online Services | PGP-Publickey on Request

--
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.