Date:      Sat, 11 Jun 2005 16:54:51 +0900
From:      Pyun YongHyeon <yongari@rndsoft.co.kr>
To:        Kris Kennaway <kris@obsecurity.org>
Cc:        Hiroki Sato <hrs@freebsd.org>, sparc64@freebsd.org
Subject:   Re: E4500 with 24GB RAM
Message-ID:  <20050611075451.GC19976@rndsoft.co.kr>
In-Reply-To: <20050611073640.GA34243@xor.obsecurity.org>
References:  <20050606132756.X16994@carver.gumbysoft.com> <20050611.004435.59726356.hrs@allbsd.org> <20050610211239.GA59402@xor.obsecurity.org> <20050611.154028.102195481.hrs@allbsd.org> <20050611072632.GB19976@rndsoft.co.kr> <20050611073640.GA34243@xor.obsecurity.org>

On Sat, Jun 11, 2005 at 03:36:41AM -0400, Kris Kennaway wrote:
 > On Sat, Jun 11, 2005 at 04:26:32PM +0900, Pyun YongHyeon wrote:
 > > On Sat, Jun 11, 2005 at 03:40:28PM +0900, Hiroki Sato wrote:
 > >  > Kris Kennaway <kris@obsecurity.org> wrote
 > >  >   in <20050610211239.GA59402@xor.obsecurity.org>:
 > >  > 
 > >  > kr> I wonder if it's disk related.  I tried to check out a ports tree on
 > >  > kr> this machine and it hung in a few seconds (although this was also
 > >  > kr> checking out using the network via nfs).
 > >  > 
 > >  >  I do not know why but the freeze occurs only when displaying "invalid
 > >  >  packet size xxx; dropping".  When I tried "vmstat 1" on the serial
 > >  >  console and fetching a large file via ftp at the same time
 > >  >  with 12GB RAM configuration, the freeze did not occur.
 > >  >  Once "hme0: too many errors; not reporting any more" is displayed,
 > >  >  the box seems to work fine and I can check out the ports tree via NFS
 > >  >  without problems.
 > >  > 
 > > 
 > > Normally the "invalid packet size" message comes from link mismatch.
 > > If your HME's PHY is DP83840 there are known issues on link negotiation.
 > > AFAIK the issue has nothing to do with panic as I always see that on my
 > > Ultra2 which has DP83840 PHY too.
 > 
 > This is on e450 and e4500 machines.  I don't think there's a link
 > mismatch.
 > 
 > > I wonder how you can use NFS reliably on sparc64. Due to failure of
 > > alignment(both server and client) it's really easy to get panic on sparc64.
 > 
 > AFAICR I've never seen a problem with this (except with an i386 4.x
 > server, which can be panicked by a sparc64 client)..I don't rely on
 > NFS heavily in most cases, but I do use it on a number of machines
 > (including two package build machines that netboot and access their
 > ports trees over NFS, and have been in continuous operation with an
 > uptime of 110 days).
 > 
Do you use NFS over UDP? NFS over TCP has a much better chance of
triggering a panic.
If you copy a large file (> 100MB) from an NFS-exported directory to
one of its subdirectories, you will probably hit a panic. If my memory
serves me right, there have been several NFS panic reports on the
current/sparc64 ML.
I don't think it has been fixed, since the root cause of the panic is in
nfsm_disct() and nfs_realign() (neither has been touched for a long time).
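
To illustrate why alignment matters here: sparc64 traps on a 32-bit load
from an address that is not 4-byte aligned, while i386 tolerates it. The
following is only a minimal userspace sketch of that idea, not the kernel
code. The names is_xdr_aligned(), xdr_load32() and realign_copy() are
hypothetical and do not exist in the NFS sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero when p sits on a 4-byte boundary (the XDR unit size). */
static int
is_xdr_aligned(const void *p)
{
	return (((uintptr_t)p & 3u) == 0);
}

/*
 * Read a big-endian 32-bit word from any address. Assembling it byte by
 * byte never issues a 32-bit load, so it cannot raise an alignment trap.
 */
static uint32_t
xdr_load32(const void *p)
{
	const unsigned char *b = p;

	return (((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	    ((uint32_t)b[2] << 8) | (uint32_t)b[3]);
}

/*
 * Hypothetical stand-in for a realign step: if src is misaligned, copy
 * len bytes into the aligned buffer dst (dstlen bytes) and return dst.
 * Returns src unchanged when it is already aligned, NULL if dst is too
 * small.
 */
static const void *
realign_copy(const void *src, size_t len, uint32_t *dst, size_t dstlen)
{
	assert(dst != NULL);
	if (is_xdr_aligned(src))
		return (src);
	if (len > dstlen)
		return (NULL);
	memcpy(dst, src, len);
	return (dst);
}
```

A caller would run realign_copy() on incoming data before casting it to
a uint32_t pointer, or use xdr_load32() and avoid the cast altogether.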

 > >  >  I tried to comment out the HME_WHINE line of hme_read() in if_hme.c,
 > >  >  and it seems to make the box work fine so far.
 > > 
 > > HME_WHINE() just prints a message. I can't think removing the function
 > > can cure your problem.
 > 
 > It does seem to have helped though..before it would reliably lock up
 > in seconds, and DDB break was non-responsive.
 > 

Hmm... Then it would be great to get a voluntary core dump when hme(4)
hits the condition (e.g. invoke panic(9) just before processing HME_WHINE).

 > Kris



-- 
Regards,
Pyun YongHyeon
http://www.kr.freebsd.org/~yongari	|	yongari@freebsd.org


