Date: Sat, 11 Jun 2005 16:54:51 +0900
From: Pyun YongHyeon <yongari@rndsoft.co.kr>
To: Kris Kennaway <kris@obsecurity.org>
Cc: Hiroki Sato <hrs@freebsd.org>, sparc64@freebsd.org
Subject: Re: E4500 with 24GB RAM
Message-ID: <20050611075451.GC19976@rndsoft.co.kr>
In-Reply-To: <20050611073640.GA34243@xor.obsecurity.org>
References: <20050606132756.X16994@carver.gumbysoft.com> <20050611.004435.59726356.hrs@allbsd.org> <20050610211239.GA59402@xor.obsecurity.org> <20050611.154028.102195481.hrs@allbsd.org> <20050611072632.GB19976@rndsoft.co.kr> <20050611073640.GA34243@xor.obsecurity.org>

On Sat, Jun 11, 2005 at 03:36:41AM -0400, Kris Kennaway wrote:
> On Sat, Jun 11, 2005 at 04:26:32PM +0900, Pyun YongHyeon wrote:
> > On Sat, Jun 11, 2005 at 03:40:28PM +0900, Hiroki Sato wrote:
> > > Kris Kennaway <kris@obsecurity.org> wrote
> > > in <20050610211239.GA59402@xor.obsecurity.org>:
> > >
> > > kr> I wonder if it's disk related. I tried to check out a ports tree on
> > > kr> this machine and it hung in a few seconds (although this was also
> > > kr> checking out using the network via nfs).
> > >
> > > I do not know why but the freeze occurs only when displaying "invalid
> > > packet size xxx; dropping". When I tried "vmstat 1" on the serial
> > > console and fetching a large file via ftp at the same time
> > > with 12GB RAM configuration, the freeze did not occur.
> > > Once "hme0: too may errors; not reporting any more" is displayed,
> > > the box seems to work fine and I can check out the ports tree via NFS
> > > without problems.
> > >
> >
> > Normally the "invalid packet size" message comes from link mismatch.
> > If your HME's PHY is DP83840 there are known issues on link negotiation.
> > AFAIK the issue has nothing to do with panic as I always see that on my
> > Ultra2 which has DP83840 PHY too.
>
> This is on e450 and e4500 machines. I don't think there's a link
> mismatch.
>
> > I wonder how you can use NFS reliably on sparc64. Due to failure of
> > alignment(both server and client) it's really easy to get panic on sparc64.
>
> AFAICR I've never seen a problem with this (except with an i386 4.x
> server, which can be panicked by a sparc64 client)..I don't rely on
> NFS heavily in most cases, but I do use it on a number of machines
> (including two package build machines that netboot and access their
> ports trees over NFS, and have been in continuous operation with an
> uptime of 110 days).
>

Do you use NFS over UDP? NFS over TCP has a much better chance of
triggering a panic. If you copy a large file (> 100MB) from an
NFS-exported directory to one of its subdirectories, you will probably
hit a panic. If my memory serves me right, there have been several NFS
panic reports on the current/sparc64 mailing lists. And I don't think
it has been fixed, since the root cause of the panic is in nfsm_disct()
and nfs_realign() (which have not been touched for a long time).

> > > I tried to comment out the HME_WHINE line of hme_read() in if_hme.c,
> > > and it seems to make the box work fine so far.
> >
> > HME_WHINE() just prints a message. I can't think removing the function
> > can cure your problem.
>
> It does seem to have helped though..before it would reliably lock up
> in seconds, and DDB break was non-responsive.
>

Hmm... Then it would be great to get a voluntary core dump when hme(4)
hits the condition (e.g. invoke panic(9) before processing HME_WHINE).

> Kris

-- 
Regards,
Pyun YongHyeon
http://www.kr.freebsd.org/~yongari | yongari@freebsd.org
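
To make the core dump suggestion above a bit more concrete, here is a rough
userspace sketch of the idea: check the packet length, and if a debugging
knob is set, stop hard before the rate-limited message is printed. This is
not the if_hme.c code. Every name here (sketch_read, sketch_whine,
sketch_panic, sketch_panic_on_bad_len) and every limit is a hypothetical
stand-in, and abort() is only playing the role of panic(9).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* All names and limits below are hypothetical stand-ins, not if_hme.c code. */
#define SKETCH_MAXERR     5     /* assumed cap on reported errors */
#define SKETCH_MIN_FRAME  60    /* assumed smallest acceptable length */
#define SKETCH_MAX_FRAME  1518  /* assumed largest acceptable length */

static int sketch_nerr;             /* bad lengths seen so far */
static int sketch_panic_on_bad_len; /* debugging knob; 0 = just complain */

/* Userspace stand-in for panic(9): print the reason, then abort for a core. */
static void
sketch_panic(const char *why)
{
	fprintf(stderr, "panic: %s\n", why);
	abort();
}

/* Rate-limited complaint, loosely modelled on a "whine" style macro. */
static void
sketch_whine(const char *ifname, int len)
{
	assert(ifname != NULL);
	if (sketch_nerr < SKETCH_MAXERR)
		fprintf(stderr, "%s: invalid packet size %d; dropping\n",
		    ifname, len);
	sketch_nerr++;
	if (sketch_nerr == SKETCH_MAXERR)
		fprintf(stderr, "%s: too many errors; not reporting any more\n",
		    ifname);
}

/* Returns 1 if the length is accepted, 0 if the packet is dropped. */
static int
sketch_read(const char *ifname, int len)
{
	if (len < SKETCH_MIN_FRAME || len > SKETCH_MAX_FRAME) {
		if (sketch_panic_on_bad_len)
			sketch_panic("bad packet length"); /* stop before whining */
		sketch_whine(ifname, len);
		return (0);
	}
	return (1);
}
```

With sketch_panic_on_bad_len left at 0 the sketch only complains and drops;
setting it to 1 makes the first bad length abort the process, which is the
userspace analogue of getting a dump at exactly the point where the box
used to freeze.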