Date:      Thu, 16 Apr 2015 15:20:56 +0900
From:      Yonghyeon PYUN <pyunyh@gmail.com>
To:        Gareth Wyn Roberts <g.w.roberts@glyndwr.ac.uk>
Cc:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   Re: msk msk0 watchdog timeout freeze hang lock stop problem
Message-ID:  <20150416062056.GA970@michelle.fasterthan.com>
In-Reply-To: <A861E9C3B0586445B36C4BB29ABF2DB46B2ECE3A@XCH7.wrexham.local>
References:  <20150413081348.GA965@michelle.fasterthan.com> <A861E9C3B0586445B36C4BB29ABF2DB46B2ECE3A@XCH7.wrexham.local>


--liOOAslEiF7prFVr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Apr 15, 2015 at 09:52:09PM +0000, Gareth Wyn Roberts wrote:
> I've inserted code to print some values which show the differences between specifying 4096 or 8192 for MSK_STAT_ALIGN.  In both cases the status buffer has length 0x4000 (8x2048=16K) but the alignments are different as expected, respectively start addresses 0x5c3b000 or 0xbdc2c000.
> 
> The following values were output from functions msk_status_dma_alloc(), msk_dmamap_cb() and msk_handle_events().
> The "Break #n" labels refer to breaks in msk_handle_events(): "#1" occurs if ((control & HW_OWNER) == 0), "#5" is OP_RXSTAT and "#6" is OP_TXINDEXLE.
> 
> The first output is for MSK_STAT_ALIGN=8192.  It continues normally.  Although not shown here, it reaches cons=2047 then cons=0 as expected.
> 
> The second output is for MSK_STAT_ALIGN=4096.  Although there can be isolated occurrences of "Break #1" (e.g. cons=196) (are these to be expected?), it continues normally until cons=512. At this point it continually invokes the "#1" block because the msk_control from msk_stat_ring[512] is always zero, and the network hangs immediately. This suggests the Yukon Ultra 2 88E8057 can't access the next 4096-byte memory block, but why not?
> 

Yes, it seems the status LE block is not updated at all for
MSK_STAT_ALIGN == 4096, and some elements of the status block look
suspicious (the put index increases but the value at that location
is 0).  My rough guess is that this indicates DMA alignment and/or
DMA boundary issues.
The maximum number of elements in the status block is 4096, so the
maximum size of the status block is 32KB.  For i386, msk(4) uses an
8KB status block (1024 elements).  For 64-bit architectures, the
block size is increased to 16KB (2048 elements).
The safe alignment value for the status block is probably 32KB.
This looks excessive to me, but it should avoid having to guess at
the exact DMA boundary constraint.
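
To make the size arithmetic above concrete, here is a small userspace
sketch (not driver code).  It assumes 8 bytes per status list element,
which follows from 16KB / 2048 elements; the helper names
stat_block_bytes() and stat_le_addr() are made up for illustration.

```c
#include <stdint.h>

/* Assumption: one status list element is 8 bytes (16KB / 2048 elements). */
#define STAT_LE_SIZE 8u

/* Size in bytes of a status block holding `count` elements. */
static uint32_t
stat_block_bytes(uint32_t count)
{
	return (count * STAT_LE_SIZE);
}

/* Bus address of element `idx` in a status block starting at `base`. */
static uint64_t
stat_le_addr(uint64_t base, uint32_t idx)
{
	return (base + (uint64_t)idx * STAT_LE_SIZE);
}
```

stat_block_bytes() gives 8192, 16384 and 32768 bytes for 1024, 2048
and 4096 elements.  With the start address reported for
MSK_STAT_ALIGN=4096, stat_le_addr(0x5c3b000, 512) is 0x5c3c000, i.e.
element 512 is the first one past the initial 4096 bytes of the
block.  That matches where the hang was observed, but by itself it
does not prove what the hardware constraint is.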

> Please let me know if any further information would be helpful.
> 

Thanks a lot. I've attached a diff which sets the alignment of the
TX/RX rings and the status block to 32KB.  I'm not sure whether this
also addresses other msk(4)-related watchdog timeouts.
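
For anyone who wants to check the reported addresses against a given
alignment, a minimal userspace sketch follows.  It is not part of the
diff and the helper names is_aligned() and crosses_boundary() are
hypothetical; both assume the alignment/boundary is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `addr` is a multiple of `align` (power of two assumed). */
static bool
is_aligned(uint64_t addr, uint64_t align)
{
	return ((addr & (align - 1)) == 0);
}

/*
 * True if the range [start, start + size) spans more than one
 * `boundary`-sized window (power of two assumed, size > 0).
 */
static bool
crosses_boundary(uint64_t start, uint64_t size, uint64_t boundary)
{
	uint64_t last = start + size - 1;

	return ((start & ~(boundary - 1)) != (last & ~(boundary - 1)));
}
```

Applied to the values Gareth reported for the 16KB (0x4000) status
block: 0x5c3b000 is 4096-aligned but crosses a 16KB boundary, while
0xbdc2c000 happens to be 16KB-aligned and does not.  A block that is
at most 32KB and aligned to 32768 can never cross a 32KB boundary,
which is the reason for picking that value without knowing the exact
hardware constraint.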

--liOOAslEiF7prFVr
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="msk.align.diff"

Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h	(revision 281587)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x)	((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x)	((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN	4096
-#define	MSK_STAT_ALIGN	4096
+#define	MSK_RING_ALIGN	32768
+#define	MSK_STAT_ALIGN	32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {

--liOOAslEiF7prFVr--


