Date: Tue, 22 Apr 2003 13:12:05 -0700 (PDT)
Message-Id: <200304222012.h3MKC5OZ009213@strings.polstra.com>
To: net@freebsd.org
From: John Polstra
Organization: Polstra & Co., Seattle, WA
Subject: Re: em net (optical GigE) driver hangs?

In article , Dave Dolson wrote:
>
> Has anyone experienced em interface hangs after approx several days of heavy
> operation?
>
> We are using a system which is mostly RELENG_4_7, using multiple optical em
> GigE devices.
>
> The symptom is that the interface stops transmitting or receiving, reporting
> drops on output (no tx descriptors) and input errors (MPC stat-->no receive
> descriptors).
>
> It turns out that all but 64 transmit descriptors are in use. The driver is
> waiting for the "done" flag to be set so it can clean the descriptors.
> The device is also in the OACTIVE state at this time.
>
> After the interface is brought down (or unplugged), the em watchdog timer
> goes off 5s later.
>
> We are trying to figure out two things:
> 1. why did the driver lock up?
> 2. why didn't the watchdog timer go off earlier?
>
> I think we would be happy to solve #2 given the rarity of the event.
> Is the RELENG_4 version likely to fix the problem?

I think the RELENG_4 version is likely to eliminate the problem. See
the comment near the define of EM_RDTR in if_em.h (in the RELENG_4
version of that file, of course).

John
--
  John Polstra
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa