From owner-freebsd-alpha@FreeBSD.ORG Tue Nov 4 14:25:23 2003
Date: Tue, 4 Nov 2003 14:25:20 -0800
From: Kris Kennaway
To: ticso@cicely.de
cc: alpha@freebsd.org
cc: Kris Kennaway
Subject: Re: New alpha 5.x bug
Message-ID: <20031104222520.GA72254@rot13.obsecurity.org>
References: <20031101103955.GA42891@rot13.obsecurity.org> <20031104031740.GA67484@rot13.obsecurity.org> <20031104124826.GH42463@cicely12.cicely.de> <20031104175552.GA70699@rot13.obsecurity.org> <20031104221251.GJ42463@cicely12.cicely.de>
In-Reply-To: <20031104221251.GJ42463@cicely12.cicely.de>
User-Agent: Mutt/1.4.1i

On Tue, Nov 04, 2003 at 11:12:51PM +0100, Bernd Walter wrote:
> On Tue, Nov 04, 2003 at 09:55:52AM -0800, Kris Kennaway wrote:
> > On Tue, Nov 04, 2003 at 01:48:27PM +0100, Bernd Walter wrote:
> > > I can't speak for this problem yet, because my test systems are a bit
> > > older, but speaking for the pipe corruption:
> > > I did lots of bzip2, tar, scp, nfs(client) without noticing any
> > > sign of problem.
> > > What is so special with the port cluster?
> > > I have no clue about its design.
> >
> > It does lots of parallel package builds (untar, pkg_add, compile, tar)
> > and NFS copying.
>
> Any special NFS options?
> tcp, udp, v2, v3, IPv4, IPv6?

Here is a typical mount -v:

axp7# mount -v
216.136.204.23:/a/nfs/alpha/5.dir1 on / (nfs, read-only, fsid 00ff000404000000)
devfs on /dev (devfs, local, fsid 01ff000303000000)
/dev/md0c on /etc (ufs, local, writes: sync 732 async 64400, reads: sync 55109 async 8784, fsid 13d19a3f414b62cc)
/dev/md1c on /var (ufs, local, writes: sync 338 async 35923, reads: sync 22620 async 0, fsid 19d19a3f8ce9f280)
/dev/md2c on /tmp (ufs, local, writes: sync 12 async 20, reads: sync 13 async 0, fsid 1bd19a3fa96729ea)
/dev/da0e on /a (ufs, local, soft-updates, writes: sync 137415 async 13950800, reads: sync 12817244 async 903873, fsid f1d29a3f5b723dc6)
bento:/var/portbuild on /var/portbuild (nfs, fsid 02ff000404000000)
bento:/var/portbuild/alpha/5/ports on /a/tmp/5/chroot/24703/a/ports (nfs, read-only, fsid ebff120404000000)
bento:/var/portbuild/alpha/5/src on /a/tmp/5/chroot/24703/usr/src (nfs, read-only, fsid ecff120404000000)
bento:/var/portbuild/alpha/5/doc on /a/tmp/5/chroot/24703/usr/opt/doc (nfs, read-only, fsid edff120404000000)
devfs on /a/tmp/5/chroot/24703/dev (devfs, local, fsid eeff120303000000)
bento:/var/portbuild/alpha/5/ports on /a/tmp/5/chroot/25765/a/ports (nfs, read-only, fsid efff120404000000)
bento:/var/portbuild/alpha/5/src on /a/tmp/5/chroot/25765/usr/src (nfs, read-only, fsid f0ff120404000000)
bento:/var/portbuild/alpha/5/doc on /a/tmp/5/chroot/25765/usr/opt/doc (nfs, read-only, fsid f1ff120404000000)
devfs on /a/tmp/5/chroot/25765/dev (devfs, local, fsid f2ff120303000000)

The NFS mounts are nfsv3,intr,ro.

> Just to get the picture complete.
> The build is local and the package is then copied to a NFS server on
> which it has a corrupted CRC?

From my memory of tests I ran a few months ago, the bzip2 CRC is
corrupted when the package is created locally. The package is copied
to the server via scp.

> Is the bzip2 CRC wrong, or the tar CRC (does tar have a CRC?), or both?

Again from memory, the file is truncated, and there might be some
garbage (e.g. zeros) at the end.

> Can you say how likely such a corruption is?

On the last build 42 packages were corrupted out of about 7500.

> Are other packages compiled during copying a package file to the server?

Yes. Typically there are 5 builds running at a time on the client
machines.

> Are the building machines memory stressed while creating the bz file or
> while copying it?

The machines are definitely busy (building other packages) while the
package is created and copied, although the machines should not be
paging.

> Really - it's hard to believe that pipe itself is the problem.
> I do lots of buildworlds with CFLAGS=-pipe and a corruption would
> very likely stop building.

I know that the problem began between 5.1-R and August 6, but I have
not been able to track it down beyond this. There was work on both
pipes and VM in that time period, which is why I am suspicious of
both.

Kris
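
One way to find packages like the 42 bad ones mentioned above is to run
every package file through a bzip2 decoder and see whether the stream
ends cleanly. The following is a minimal sketch in Python, not the
script used on the cluster: the function names check_bz2 and scan, the
*.tbz file pattern, and the four result labels are all assumptions made
for illustration. It treats each file as a single bzip2 stream and
reports whether it is intact, ends early, fails the decoder's checks,
or has extra bytes after the end of the stream.

```python
import bz2
from pathlib import Path


def check_bz2(path, chunk_size=1 << 20):
    """Classify a single-stream bzip2 file.

    Returns one of the (hypothetical) labels:
      "ok"            - stream decodes fully and the file ends with it
      "truncated"     - file ends before the end-of-stream marker
      "corrupt"       - decoder rejected the data (bad header, bad CRC, ...)
      "trailing-data" - extra bytes follow a complete stream
    """
    decomp = bz2.BZ2Decompressor()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            if decomp.eof:
                # The stream already ended in an earlier chunk, yet the
                # file still has more bytes.
                return "trailing-data"
            try:
                # Output is discarded; only the decoder's verdict matters.
                decomp.decompress(chunk)
            except OSError:
                return "corrupt"
    if not decomp.eof:
        return "truncated"
    if decomp.unused_data:
        # Stream ended inside the last chunk with bytes left over.
        return "trailing-data"
    return "ok"


def scan(directory, pattern="*.tbz"):
    """Return {filename: label} for every file in `directory` that is not "ok".

    The "*.tbz" default is an assumption about how package files are named.
    """
    bad = {}
    for path in sorted(Path(directory).glob(pattern)):
        label = check_bz2(path)
        if label != "ok":
            bad[path.name] = label
    return bad
```

Calling scan() on a directory of freshly built packages on the build
machine, before the scp step, and again on the server afterwards would
help confirm whether the damage is already present when the package is
created locally, as recalled above. A file that was cut short and then
padded with zeros may be reported as either "corrupt" or
"trailing-data" depending on where the cut falls, so the labels are
only a rough guide.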