From: YongHyeon PYUN
Date: Tue, 27 Jun 2017 00:40:10 +0900
To: Julien Cigar
Cc: "Andrey V. Elsukov", FreeBSD Net, Ryan Stone, Ben RUBSON, "freebsd-scsi@freebsd.org"
Subject: Re: mbuf_jumbo_9k & iSCSI failing
Message-ID: <20170626154010.GA2488@michelle.fasterthan.co.kr>
Reply-To: pyunyh@gmail.com
In-Reply-To: <20170626134458.GT43966@mordor.lan>

On Mon, Jun 26, 2017 at 03:44:58PM +0200, Julien Cigar wrote:
> On Mon, Jun 26, 2017 at 04:13:33PM +0300, Andrey V. Elsukov wrote:
> > On 25.06.2017 18:32, Ryan Stone wrote:
> > > Having looked at the original email more closely, I see that you showed an
> > > mlxen interface with a 9020 MTU.
> > > Seeing allocation failures of 9k mbuf clusters increase while you
> > > are far below the zone's limit means that you're definitely running
> > > into the bug I'm describing, and this bug could plausibly cause the
> > > iSCSI errors that you describe.
> > >
> > > The issue is that the newer version of the driver tries to allocate a
> > > single buffer to accommodate an MTU-sized packet. Over time, however,
> > > memory will become fragmented and eventually it can become impossible to
> > > allocate a 9k physically contiguous buffer. When this happens the driver
> > > is unable to allocate buffers to receive packets and is forced to drop
> > > them. Presumably, if iSCSI suffers too many packet drops it will terminate
> > > the connection. The older version of the driver limited itself to
> > > page-sized buffers, so it was immune to issues with memory fragmentation.
> >
> > I think it is not an mlxen-specific problem; we have the same symptoms
> > with the ixgbe(4) driver too. To avoid the problem we have patches that
> > disable the use of 9k mbufs and use only 4k mbufs instead.
>
> I had the same issue on a lightly loaded HP DL20 machine (BCM5720
> chipsets), 8GB of RAM, running 10.3. Problem usually happens
> within 30 days with 9k jumbo clusters allocation failure.
>

This looks strange to me. If I recall correctly, bge(4) does not
request physically contiguous 9k jumbo buffers for BCM5720, so it
wouldn't suffer from memory fragmentation. (It uses m_cljget() and
takes advantage of extended RX BDs to handle up to 4 DMA segments.)
If your controller is either BCM5714/BCM5715 or BCM5780, though, it
requires physically contiguous 9k jumbo buffers to handle jumbo
frames.
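
For what it's worth, the arithmetic behind the fragmentation argument can be
sketched in a few lines of C. This is only an illustration, not driver code:
the helper names (frame_bytes, pages_aligned, pages_worst_case), the
4096-byte page size, and the 14 + 4 + 4 bytes of Ethernet header, VLAN tag
and CRC overhead are all assumptions made for the example.

```c
#include <stddef.h>

/* Assumed per-frame overhead on top of the MTU (illustrative values). */
#define HDR_BYTES   14  /* Ethernet header */
#define VLAN_BYTES   4  /* optional 802.1Q tag */
#define CRC_BYTES    4  /* frame check sequence */

/* Hypothetical helper: on-wire frame size for a given MTU. */
static size_t
frame_bytes(size_t mtu)
{
	return (mtu + HDR_BYTES + VLAN_BYTES + CRC_BYTES);
}

/* Pages needed when the buffer starts on a page boundary. */
static size_t
pages_aligned(size_t len, size_t page)
{
	return ((len + page - 1) / page);
}

/*
 * Most pages a buffer of 'len' bytes can touch when it may start at
 * any byte offset inside a page (worst case: the last byte of a page).
 */
static size_t
pages_worst_case(size_t len, size_t page)
{
	if (len == 0)
		return (0);
	return ((len + page - 2) / page + 1);
}
```

With a 9020-byte MTU, frame_bytes(9020) is 9042, which fits in three
4096-byte pages, and a 9216-byte buffer that is not page-aligned can touch
four pages. A driver that can scatter one frame over several page-sized
segments never has to find three physically contiguous pages, which is the
allocation that starts to fail once memory is fragmented.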