From: Yonghyeon PYUN
Date: Thu, 16 Apr 2015 15:20:56 +0900
To: Gareth Wyn Roberts
Cc: "freebsd-stable@freebsd.org"
Subject: Re: msk msk0 watchdog timeout freeze hang lock stop problem
Message-ID: <20150416062056.GA970@michelle.fasterthan.com>
Reply-To: pyunyh@gmail.com
References: <20150413081348.GA965@michelle.fasterthan.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="liOOAslEiF7prFVr"
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
List-Id: Production branch of FreeBSD source code

--liOOAslEiF7prFVr
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Apr 15, 2015 at 09:52:09PM +0000, Gareth Wyn Roberts wrote:
> I've inserted code to print some values which show the differences between specifying 4096 or 8192 for MSK_STAT_ALIGN. In both cases the status buffer has length 0x4000 (8x2048=16K) but the alignments are different as expected, respectively start addresses 0x5c3b000 or 0xbdc2c000.
>
> The following values were output from functions msk_status_dma_alloc(), msk_dmamap_cb() and msk_handle_events().
> The "Break #n" labels refer to breaks in msk_handle_events(). "#1" occurs if ((control & HW_OWNER) == 0), "#5" is OP_RXSTAT and "#6" is OP_TXINDEXLE.
>
> The first output is for MSK_STAT_ALIGN=8192. It continues normally. Although not shown here, it reaches cons=2047 then cons=0 as expected.
>
> The second output is for MSK_STAT_ALIGN=4096. Although there can be isolated occurrences of "Break #1" (e.g. cons=196) (are these to be expected?), it continues normally until cons=512. At this point it continually invokes the "#1" block because the msk_control from msk_stat_ring[512] is always zero and the network hangs immediately. This suggests the Yukon Ultra 2 88E8057 can't access the next 4096 memory block, but why not?
>

Yes, it seems the status LE block is not updated at all for
MSK_STAT_ALIGN == 4096, and some elements of the status block look
suspicious (the put index increases, but the value at that location
is 0). My vague guess is that this indicates DMA alignment and/or DMA
boundary issues. The maximum number of elements in the status block
is 4096, so the maximum size of the status block is 32KB. For i386,
msk(4) uses an 8KB status block (1024 elements). For 64-bit
architectures, the block size is increased to 16KB (2048 elements).
The safe alignment value for the status block would probably be 32KB.
This looks excessive to me, but it should avoid having to guess at
the DMA boundary issue.

> Please let me know if any further information would be helpful.
>

Thanks a lot.
I've attached a diff which sets the alignment of the TX/RX rings and
the status block to 32KB. I'm not sure whether this also addresses
other msk(4)-related watchdog timeouts.

--liOOAslEiF7prFVr
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="msk.align.diff"

Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h (revision 281587)
+++ sys/dev/msk/if_mskreg.h (working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x) ((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x) ((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN 4096
-#define MSK_STAT_ALIGN 4096
+#define MSK_RING_ALIGN 32768
+#define MSK_STAT_ALIGN 32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {

--liOOAslEiF7prFVr--
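
For anyone who wants to redo the address arithmetic from this thread, here is a minimal sketch in C. It is not driver code: the helper names is_aligned() and first_index_past_boundary() are made up for illustration, and the 8-byte element size is taken from the figures above (4096 elements = 32KB). It only checks how the two reported start addresses sit relative to a power-of-two boundary and which ring index is the first one past that boundary. It says nothing about what the hardware actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes per status list element, from the thread (4096 elements = 32KB). */
#define STAT_LE_SIZE 8u

/* Hypothetical helper: true if addr is a multiple of align (a power of two). */
bool
is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr & (align - 1)) == 0);
}

/*
 * Hypothetical helper: index of the first ring element located at or
 * beyond the next `boundary`-byte boundary above `base`. Returns `count`
 * if the whole ring ends before (or exactly at) that boundary.
 */
uint32_t
first_index_past_boundary(uint64_t base, uint32_t count, uint64_t boundary)
{
	uint64_t next, idx;

	assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
	/* Next boundary strictly above base. */
	next = (base & ~(boundary - 1)) + boundary;
	idx = (next - base) / STAT_LE_SIZE;
	return (idx < count ? (uint32_t)idx : count);
}
```

With the addresses reported above, first_index_past_boundary(0x5c3b000, 2048, 16384) gives 512, which matches the index where the ring stalled, while the 16KB ring at 0xbdc2c000 is itself 16KB-aligned and so ends exactly at the next 16KB boundary. That fits a boundary problem, but it is only consistent with the numbers, not proof. Neither address is 32KB-aligned.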