From: list-news
Subject: Re: Avago LSI SAS 3008 & Intel SSD Timeouts
To: Borja Marcos
Cc: freebsd-scsi@freebsd.org
Date: Tue, 7 Jun 2016 12:09:08 -0500

The system is a Twin. I mentioned this in the first post, but I probably wasn't clear. The twin unit is this one:

https://www.supermicro.com/products/system/2u/2028/sys-2028tp-decr.cfm

I've used all components from twin nodes A and B (CPU / memory / mainboard / controller). I still get the errors. The backplane was the original suspect, and it has been RMA'd and replaced; the errors continue. I've even swapped out the power supplies with those from another identical unit I have here.

In every case the errors continue until I do this:

# camcontrol tags daX -N 1
(for each drive in the zpool)

Then the errors stop.

The system errors every few minutes while my application is running. Set tags to -N 1, and everything goes quiet. With 16 cores at 100% CPU and the drives 80% busy at ~15k IOPS for about 5 hours solid before a batch finishes, no errors are reported with -N set to 1.

If I set tags with -N 255 for each device, errors start again within 5 minutes and continue every 2-5 minutes until the batch is finished.

-Kyle

> I would try, if possible, to swap the controller.
>
> Borja.
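
For anyone wanting to repeat the per-drive step described above, here is a minimal sketch of it as a shell function. The function name `set_queue_depth`, the `CAMCONTROL` override variable, and the device names `da0`/`da1` are hypothetical and not from the original message; the only real command used is `camcontrol tags <device> -N <count>`. Substitute the actual members of your pool.

```shell
#!/bin/sh
# Sketch only: apply one tagged-queue depth to several CAM devices.
# set_queue_depth and CAMCONTROL are illustrative names, not part of FreeBSD.
# Usage (as root):  set_queue_depth 1 da0 da1
set_queue_depth() {
    depth="$1"
    shift
    rc=0
    for dev in "$@"; do
        # Equivalent to typing: camcontrol tags daX -N <depth>
        # CAMCONTROL may be set to another command (e.g. echo) for a dry run.
        ${CAMCONTROL:-camcontrol} tags "$dev" -N "$depth" || rc=1
    done
    return "$rc"
}
```

Setting `CAMCONTROL=echo` before calling the function prints the arguments that would be passed instead of changing anything. The setting does not persist across reboots, so it would need to be reapplied at boot if it turns out to be a usable workaround.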