From owner-freebsd-stable@FreeBSD.ORG Tue Jan 7 08:49:52 2014
From: Yonghyeon PYUN
Date: Tue, 7 Jan 2014 17:49:38 +0900
To: Curtis Villamizar
Subject: Re: regression: msk0 watchdog timeout and interrupt storm
Message-ID: <20140107084938.GA1361@michelle.cdnetworks.com>
References: <20140106050400.GA1372@michelle.cdnetworks.com>
 <201401061520.s06FKeVG009399@maildrop2.v6ds.occnc.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="ikeVEW9yuYc//A+q"
Content-Disposition: inline
In-Reply-To: <201401061520.s06FKeVG009399@maildrop2.v6ds.occnc.com>
User-Agent: Mutt/1.4.2.3i
Cc: freebsd-stable@freebsd.org
Reply-To: pyunyh@gmail.com

--ikeVEW9yuYc//A+q
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Mon, Jan 06, 2014 at 10:20:40AM -0500, Curtis Villamizar wrote:
> [...]
> Here are some relevant parts of dmesg.  Is there anything else you want?
>
> real memory = 2147483648 (2048 MB)
> avail memory = 2061438976 (1965 MB)
> Event timer "LAPIC" quality 400
> ACPI APIC Table:
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> FreeBSD/SMP: 1 package(s) x 2 core(s)
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
>
> pcib2: irq 19 at device 7.0 on pci0
> pci2: on pcib2
> on pci1
> pcib2: irq 19 at device 7.0 on pci0
> pci2: on pcib2
> mskc0: port 0xe800-0xe8ff mem
> 0xfebfc000-0xfebfffff irq 19 at device 0.0 on pci2
> msk0:
> on mskc0
> msk0: Ethernet address: c8:9c:dc:56:38:ef
> miibus0: on msk0
> e1000phy0: PHY 0 on miibus0
> e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX,
> 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master,
> auto, auto-flow
>

Thank you for the info.

> The computer is a Lenovo ThinkCenter (small tower) and not an uncommon
> machine so others are likely to run into this.
>
> > > Please let me know what I could do to help debug this.
> > >
> >
> > If you have more than 4GB memory, try reducing the amount of
> > memory(e.g. 3G) in /boot/loader.conf and let me know whether that
> > makes any difference for you.
> > Note, in order to test this you have to back out your local
> > changes.
>
> Only have 2 GB memory.
>

Ok, that means my wild guess was not right. :-(

[...]

> > I'm under the impression that the controller may have additional
> > DMA addressing limitation where TX/RX and status LEs should have
> > the same high DMA address.  Due to the lack of documentation I'm
> > not sure about that.  If the issue does not happen with 3GB memory,
> > we have to use 32bit DMA addressing.
>
> We have 2 GB memory so the problem with the original code does happen
> with less than 4 GB memory.  Everything has the same high address of
> zero.
>

Right.

> Is there anything else you want me to try?

msk(4) uses 4KB alignment for status/TX/RX rings.  Your local change
will reduce the number of status LEs to be 1024.
Stock msk(4) will use 2048 entries for status LEs and you said the cons
variable is stuck with 1024 in this case.  I have no idea how this can
happen at this moment.
Did msk(4) ever work on your box?  If the answer is yes, would you
back out both r258780 and your local change?
I have a small local diff which was made after seeing r258780.  But
I'm not sure whether it makes any difference.

>
> Curtis
>
> btw - I added someone from Marvell on the Bcc in case he wants to join
> in on the conversation or give us a hint in private email.

--ikeVEW9yuYc//A+q
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="msk.type.diff"

Index: sys/dev/msk/if_msk.c
===================================================================
--- sys/dev/msk/if_msk.c	(revision 260362)
+++ sys/dev/msk/if_msk.c	(working copy)
@@ -3600,7 +3600,8 @@ msk_handle_events(struct msk_softc *sc)
 	int rxput[2];
 	struct msk_stat_desc *sd;
 	uint32_t control, status;
-	int cons, len, port, rxprog;
+	int len, port, rxprog;
+	uint16_t cons;
 
 	if (sc->msk_stat_cons == CSR_READ_2(sc, STAT_PUT_IDX))
 		return (0);
Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h	(revision 260362)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -2539,8 +2539,8 @@ struct msk_softc {
 	bus_addr_t		msk_stat_ring_paddr;
 	int			msk_int_holdoff;
 	int			msk_process_limit;
-	int			msk_stat_cons;
-	int			msk_stat_count;
+	uint16_t		msk_stat_cons;
+	uint16_t		msk_stat_count;
 	struct mtx		msk_mtx;
 };
 

--ikeVEW9yuYc//A+q--
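
For illustration only, here is a minimal sketch of how a 16-bit
consumer index can follow a 16-bit put index around a status ring,
which is the comparison the diff above keeps in one type.  This is
not msk(4) code: ring_inc(), ring_drain() and STAT_RING_CNT are
hypothetical names, the ring size of 2048 is taken from the
discussion above, and the sketch assumes the put index always stays
below the ring size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring size; the thread mentions 2048 status LEs. */
#define STAT_RING_CNT	2048u

/* Advance a ring index by one entry, wrapping at the ring size. */
static uint16_t
ring_inc(uint16_t idx, uint16_t cnt)
{
	assert(cnt != 0 && idx < cnt);
	return ((uint16_t)((idx + 1u) % cnt));
}

/*
 * Consume entries until the consumer index catches up with the
 * producer ("put") index.  Returns the number of entries consumed
 * and stores the new consumer index through *cons.
 */
static unsigned int
ring_drain(uint16_t *cons, uint16_t put, uint16_t cnt)
{
	unsigned int n = 0;

	/* Assumption: both indices are valid positions in the ring. */
	assert(put < cnt && *cons < cnt);
	while (*cons != put) {
		/* A real driver would process the entry at *cons here. */
		*cons = ring_inc(*cons, cnt);
		n++;
	}
	return (n);
}
```

With a consumer index of 1024 and a put index of 10, ring_drain()
walks the upper half of the ring, wraps at 2048 and stops at 10.  If
the consumer index never advanced, the loop would see the same entry
on every pass, which is one way to picture a "stuck" index; whether
that matches the actual failure here is not established.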