From: Pyun YongHyeon
To: Kris Kennaway
Cc: Hiroki Sato, sparc64@freebsd.org
Date: Sat, 11 Jun 2005 16:54:51 +0900
Subject: Re: E4500 with 24GB RAM
Message-ID: <20050611075451.GC19976@rndsoft.co.kr>
In-Reply-To: <20050611073640.GA34243@xor.obsecurity.org>
Reply-To: yongari@rndsoft.co.kr
List-Id: Porting FreeBSD to the Sparc

On Sat, Jun 11, 2005 at 03:36:41AM -0400, Kris Kennaway wrote:
> On Sat, Jun 11, 2005 at 04:26:32PM +0900, Pyun YongHyeon wrote:
> > On Sat, Jun 11, 2005 at 03:40:28PM +0900, Hiroki Sato wrote:
> > > Kris Kennaway wrote
> > > in <20050610211239.GA59402@xor.obsecurity.org>:
> > >
> > > kr> I wonder if it's disk related. I tried to check out a ports tree on
> > > kr> this machine and it hung in a few seconds (although this was also
> > > kr> checking out using the network via nfs).
> > >
> > > I do not know why but the freeze occurs only when displaying "invalid
> > > packet size xxx; dropping". When I tried "vmstat 1" on the serial
> > > console and fetching a large file via ftp at the same time
> > > with 12GB RAM configuration, the freeze did not occur.
> > > Once "hme0: too may errors; not reporting any more" is displayed,
> > > the box seems to work fine and I can check out the ports tree via NFS
> > > without problems.
> > >
> >
> > Normally the "invalid packet size" message comes from link mismatch.
> > If your HME's PHY is DP83840 there are known issues on link negotiation.
> > AFAIK the issue has nothing to do with panic as I always see that on my
> > Ultra2 which has DP83840 PHY too.
>
> This is on e450 and e4500 machines. I don't think there's a link
> mismatch.
>
> > I wonder how you can use NFS reliably on sparc64. Due to failure of
> > alignment(both server and client) it's really easy to get panic on sparc64.
>
> AFAICR I've never seen a problem with this (except with an i386 4.x
> server, which can be panicked by a sparc64 client)..I don't rely on
> NFS heavily in most cases, but I do use it on a number of machines
> (including two package build machines that netboot and access their
> ports trees over NFS, and have been in continuous operation with an
> uptime of 110 days).
>

Do you use NFS over UDP? NFS over TCP has a much greater chance of
triggering a panic. If you copy a large file (> 100MB) from an NFS
exported directory to one of its sub-directories you will probably hit
a panic. If my memory serves me right there have been several NFS panic
reports on the current/sparc64 MLs. And I don't think it has been fixed,
since the root cause of the panic is in nfsm_disct() and nfs_realign()
(that code has not been touched for a long time).

> > > I tried to comment out the HME_WHINE line of hme_read() in if_hme.c,
> > > and it seems to make the box work fine so far.
> >
> > HME_WHINE() just prints a message. I can't think removing the function
> > can cure your problem.
>
> It does seem to have helped though..before it would reliably lock up
> in seconds, and DDB break was non-responsive.
>

Hmm... Then it would be great to get a voluntary core dump when hme(4)
hits the condition (e.g. invoke panic(9) before processing HME_WHINE).

> Kris

-- 
Regards,
Pyun YongHyeon
http://www.kr.freebsd.org/~yongari | yongari@freebsd.org
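
For readers unfamiliar with the alignment issue mentioned above, here is a
minimal userland sketch of the general idea. It is not the nfsm_disct() or
nfs_realign() code; the helper names fetch32(), realign_copy() and
is_aligned32() are hypothetical and exist only for illustration. The
assumption is that a 32-bit field can start at any byte offset in a packet
buffer, and that a direct cast-and-dereference at such an offset would trap
on a strict-alignment CPU such as sparc64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero if p is suitably aligned for a 32-bit load. */
static int
is_aligned32(const void *p)
{
	return (((uintptr_t)p & (sizeof(uint32_t) - 1)) == 0);
}

/*
 * Hypothetical helper: fetch a 32-bit word (host byte order) from an
 * arbitrary byte offset in a buffer.  memcpy() avoids the misaligned
 * load that *(const uint32_t *)(buf + off) could issue.
 */
static uint32_t
fetch32(const unsigned char *buf, size_t off)
{
	uint32_t v;

	assert(buf != NULL);
	memcpy(&v, buf + off, sizeof(v));
	return (v);
}

/*
 * Hypothetical helper: copy len bytes from a possibly misaligned
 * source into aligned storage, so later code may use word accesses
 * on dst safely.  This is only the spirit of a "realign" step.
 */
static void
realign_copy(uint32_t *dst, const unsigned char *src, size_t len)
{
	assert(is_aligned32(dst));
	memcpy(dst, src, len);
}
```

A caller would use fetch32(pkt, 1) to read a word stored at an odd offset,
or realign_copy() once up front and then plain word accesses on the copy.
Copying costs some cycles, but it works regardless of where the data lands
in the buffer.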