Date: Mon, 14 Jul 2003 19:20:19 -0500 (CDT)
From: Mike Silbersack
To: Julian Elischer
cc: arch@freebsd.org
Subject: Re: 4.x mbuf binary compatibility; can it be broken?
Message-ID: <20030714191735.Y8225@odysseus.silby.com>
List-Id: Discussion related to FreeBSD architecture

On Mon, 14 Jul 2003, Julian Elischer wrote:

> On Mon, 14 Jul 2003, Mike Silbersack wrote:
>
> >
> > In the process of hunting down reported panics in xl_newbuf, I've come to
> > the conclusion that the panics are a result of mbuf cluster refcounts
> > overflowing. This is not too surprising, as we use an array of chars to
> > store the refcounts. (-current uses ints, and doesn't have this problem.)
> >
> > It's easy enough to switch from a char to an int array in 4.x to fix the
> > problem there, but there is a problem: Our friendly mbuf macros (MCLALLOC
> > and MCLFREE) manipulate the refcount. This means that 3rd party modules
> > which use the macros will no longer work properly.
> >
>
> As the user of a 3rd party driver (binary only)
> PLEASE don't do this..
>
> How does it get 255+ references?

I don't know exactly at this point. I can reproduce the situation at will
with (in kernel) test code, but I don't know what exactly causes it in the
wild. Given that increasing the ref count limit is so easy, I was hoping
to avoid spending time tracking down one degenerate case. :)

Mike "Silby" Silbersack