From: Anthony Cornehl
To: Julian Stecklina
Cc: freebsd-infiniband@freebsd.org
Date: Mon, 10 Jun 2013 16:03:18 +0000
Subject: Re: ib1: timing out; N sends not completed
In-Reply-To: <51B5D798.5090008@os.inf.tu-dresden.de>
References: <51B5D798.5090008@os.inf.tu-dresden.de>
List-Id: Infiniband on FreeBSD

On Jun 10, 2013 6:41 AM, "Julian Stecklina" wrote:
>
> Hello,
>
> I have two machines connected back-to-back via Infiniband with Mellanox
> Infinihost III adapters. One machine runs Linux (Fedora 19) and the
> other 9-STABLE.
>
> I sometimes get:
>
> ib1: timing out; 47 sends not completed
> ib1: timing out; 1 sends not completed
> ib1: timing out; 56 sends not completed
>
> or similar and TCP connections will be stuck after each timeout for a
> while. It is relatively easy to reproduce this behavior with NetPIPE.
>
> Any advice?
>
> Julian
>

Hey Julian,

Just some questions to try and clarify the issue...

- which machine is the OpenSM master running on?
- what does your qkey violation count look like when you run a portinfo on
  the ports?
- does the issue persist when a switch is added between the hosts?

Cheers!