From: Dave Tweten
To: Kris Kennaway
Cc: freebsd-stable@freebsd.org
Subject: Re: 4.3-BETA makeworld of current STABLE Fails
Date: Thu, 22 Mar 2001 11:10:03 -0800
Message-Id: <200103221910.f2MJA4673370@gilmore.nas.nasa.gov>
In-Reply-To: Message from Kris Kennaway of "Wed, 21 Mar 2001 03:50:24 PST." <20010321035024.A1159@xor.obsecurity.org>

kris@obsecurity.org said:
> Sounds like your source code got corrupted somehow.

Indeed.  That's the angle I had been working for a couple of weeks after
my regular weekly automated cvsup-buildworld-buildkernel stopped working.
And I now think it's true -- but with a twist: files were corrupted in the
kernel buffers and not on disk.

The background:
---------------

The kernel I was running under the build was

    FreeBSD 4.3-BETA (FLOATER) #2: Tue Mar 6 22:17:30 PST 2001

and the machine (NEC Versa 6050MX with 40 MB of memory) gets used for lots
of things, so I'm frequently swapping disks and rebooting.

The symptoms:
-------------

1. Repeated attempts to cvsup-buildworld died in stage 4, library
   creation, with errors indicating file corruption, though I never found
   the actual corruption.

2. In desperation, I wiped all of /usr/src and repeated.  This time I
   found the corruption.
   It started at file offset 1024 in an assembler source file.  Worried
   that I may have just cvsupped an entire source tree laden with
   corrupted files, I tried to run a find -exec grep of all of /usr/src
   to detect the corrupted files.  It didn't work right, but when I went
   back to look at the corrupted assembler source file, the file was okay!

3. When I dropped back to an old kernel (from February 16), buildworld
   worked okay up to the point where it needed floater.mc, my custom
   sendmail config file, which I blew away when I wiped /usr/src.

Diagnosis:
----------

The FreeBSD 4.3-BETA kernel of around March 6 had a bug that resulted in
very occasional buffer corruption of files that were read.  It seems not
to have corrupted buffers for files that were written, since my cvsup of
everything escaped unscathed.

I'm currently repeating the buildworld, after having replaced floater.mc,
and will report whether the kernel problem seems to have gone away in
sources cvsupped last night.

--
M/S 258-5                    | 1024-bit PGP fingerprint: | tweten@nas.nasa.gov
NASA Ames Research Center    | 41 B0 89 0A 8F 94 6C 59   | (650) 604-4416
Moffett Field, CA 94035-1000 | 7C 80 10 20 25 C7 2F E6   | FAX: (650) 604-4377

We each earn what freedom of speech we defend for those who most offend us.
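
The diagnosis above rests on one observation: the same file read back as corrupt at one moment and clean at another, with nothing writing to it in between. A minimal sketch of how that could be checked across a whole tree follows. It is not the procedure used in the message (that was a find -exec grep); the function names hash_tree and changed_between_passes are made up for illustration. It also assumes the tree is not modified between passes and that enough other I/O, or a reboot, happens in between for the buffers to be refilled from disk. Otherwise both passes may simply see the same cached data.

```python
import hashlib
import os


def hash_tree(root):
    """Return {relative_path: sha256 hex digest} for each regular file under root."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only plain file contents matter here.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests


def changed_between_passes(first, second):
    """Return sorted paths present in both passes whose digests differ."""
    return sorted(p for p in first.keys() & second.keys() if first[p] != second[p])
```

Calling hash_tree on /usr/src twice and passing both results to changed_between_passes lists the files that read differently. On an idle tree, any entry in that list points at the read path rather than at the data on disk.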