From: Steven Hartland <killing@multiplay.co.uk>
To: Yamagi Burmeister
Cc: freebsd-scsi@freebsd.org
Subject: Re: Device timeouts(?) with LSI SAS3008 on mpr(4)
Date: Tue, 28 Jul 2015 09:45:38 +0100
Message-ID: <55B74132.6030909@multiplay.co.uk>
In-Reply-To: <20150727112504.9b2c4bef27953f1e3dd52123@yamagi.org>
List-Id: SCSI subsystem (freebsd-scsi@freebsd.org)

65536 is really small and will likely cause a bottleneck when TRIMing
large areas. I'd suggest a larger value, 2 - 4 MB, but at least matching
kern.geom.dev.delete_max_sectors (262144).

Regards
Steve

On 27/07/2015 10:25, Yamagi Burmeister wrote:
> Hello,
> let me apologise for my late answer. My colleagues were on vacation
> and I had no time to pursue this problem. But now another round:
>
> - da0 and da1 are 80G Intel DC S3500 SSDs
> - All other devices are 800G Intel DC S3700 SSDs
>
> kern.cam.da.X.delete_max seems very high for all devices. Forcing the
> tunable to a very conservative value of 65536 helps: no timeout in 72
> hours and no measurable performance impact. So my guess is:
>
> - ZFS tries to TRIM too many blocks in one operation
> - The SSD blocks for some time while processing the TRIM command
> - The controller thinks that the SSD crashed and sends a reset
>
> I did some tests with the attached tool. I'm able to reproduce the
> timeouts when "enough" data was written to the device.
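[Editorial aside, not part of the original post: the two tunables use
different units, which makes the values above easy to misread.
kern.cam.da.X.delete_max is in bytes, while
kern.geom.dev.delete_max_sectors counts sectors, assumed here to be the
usual 512-byte LBAs. A quick sketch of the conversion:]

```shell
# Sketch: convert kern.geom.dev.delete_max_sectors (counted in
# 512-byte sectors) into the byte units kern.cam.da.X.delete_max uses.
sectors=262144
bytes=$((sectors * 512))
echo "$bytes bytes = $((bytes / 1048576)) MiB"
```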
> The question is what's "enough" data. Sometimes 50G are enough and
> sometimes 75G can be trimmed without any problem.
>
> Nevertheless, a lower kern.cam.da.X.delete_max value helps to work
> around the problem, and everything else is just speculation. So:
> problem solved. Thank you for your help and input.
>
> Regards,
> Yamagi
>
> On Mon, 13 Jul 2015 10:54:41 +0100
> Steven Hartland wrote:
>
>> I assume da0 and da1 are a different disk then?
>>
>> With regard to your disk setup: are all of your disks SSDs? If so, why
>> do you have separate log and cache devices?
>>
>> One thing you could try is to limit the delete size.
>>
>> kern.geom.dev.delete_max_sectors limits the single request size allowed
>> by geom, but individual requests can then be built back up in cam, so I
>> don't think this will help you too much.
>>
>> Instead I would try limiting the individual device delete_max, so add
>> one line per disk into /boot/loader.conf of the form:
>> kern.cam.da.X.delete_max=1073741824
>>
>> You can actually change these on the fly using sysctl, but in order to
>> catch any cleanup done on boot, loader.conf is the best place to tune
>> them permanently.
>>
>> I've attached a little C util which you can use to do direct disk
>> deletes if you have a spare disk you can play with.
>>
>> Be aware that most controllers optimise deletes out if they know the
>> cells are empty, hence you do need to have written data to the sectors
>> each time you test a delete.
>>
>> As the requests go through geom, anything over
>> kern.geom.dev.delete_max_sectors will be split, but may well be
>> recombined in CAM.
>>
>> Another relevant setting is vfs.zfs.vdev.trim_max_active, which can be
>> used to limit the number of outstanding geom delete requests to each
>> device.
>>
>> Oh, one other thing: it would be interesting to see the output from
>> camcontrol identify, e.g.
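[Editorial aside, not part of the original post: the loader.conf advice
above can be sketched as the following fragment. The device numbers and
the 1 GiB value (1073741824 bytes) are illustrative, taken from the
suggestion in the quoted text; add one line per da(4) disk on your
system.]

```shell
# Hypothetical /boot/loader.conf fragment: cap the per-device
# delete size (in bytes) so a single TRIM cannot grow unbounded.
kern.cam.da.0.delete_max="1073741824"
kern.cam.da.1.delete_max="1073741824"

# The same cap can be applied on the fly (not boot-persistent):
#   sysctl kern.cam.da.0.delete_max=1073741824
```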
>> camcontrol identify da8
>> camcontrol identify da0
>>
>> Regards
>> Steve
>>
>> On 13/07/2015 10:25, Yamagi Burmeister wrote:
>>> On Mon, 13 Jul 2015 10:13:32 +0100
>>> Steven Hartland wrote:
>>>
>>>> What do you see from:
>>>> sysctl -a | grep -E '(delete|trim)'
>>> % sysctl -a | grep -E '(delete|trim)'
>>> kern.geom.dev.delete_max_sectors: 262144
>>> kern.cam.da.1.delete_max: 8589803520
>>> kern.cam.da.1.delete_method: ATA_TRIM
>>> kern.cam.da.8.delete_max: 12884705280
>>> kern.cam.da.8.delete_method: ATA_TRIM
>>> kern.cam.da.9.delete_max: 12884705280
>>> kern.cam.da.9.delete_method: ATA_TRIM
>>> kern.cam.da.3.delete_max: 12884705280
>>> kern.cam.da.3.delete_method: ATA_TRIM
>>> kern.cam.da.12.delete_max: 12884705280
>>> kern.cam.da.12.delete_method: ATA_TRIM
>>> kern.cam.da.7.delete_max: 12884705280
>>> kern.cam.da.7.delete_method: ATA_TRIM
>>> kern.cam.da.2.delete_max: 12884705280
>>> kern.cam.da.2.delete_method: ATA_TRIM
>>> kern.cam.da.11.delete_max: 12884705280
>>> kern.cam.da.11.delete_method: ATA_TRIM
>>> kern.cam.da.6.delete_max: 12884705280
>>> kern.cam.da.6.delete_method: ATA_TRIM
>>> kern.cam.da.10.delete_max: 12884705280
>>> kern.cam.da.10.delete_method: ATA_TRIM
>>> kern.cam.da.5.delete_max: 12884705280
>>> kern.cam.da.5.delete_method: ATA_TRIM
>>> kern.cam.da.0.delete_max: 8589803520
>>> kern.cam.da.0.delete_method: ATA_TRIM
>>> kern.cam.da.4.delete_max: 12884705280
>>> kern.cam.da.4.delete_method: ATA_TRIM
>>> vfs.zfs.trim.max_interval: 1
>>> vfs.zfs.trim.timeout: 30
>>> vfs.zfs.trim.txg_delay: 32
>>> vfs.zfs.trim.enabled: 1
>>> vfs.zfs.vdev.trim_max_pending: 10000
>>> vfs.zfs.vdev.bio_delete_disable: 0
>>> vfs.zfs.vdev.trim_max_active: 64
>>> vfs.zfs.vdev.trim_min_active: 1
>>> vfs.zfs.vdev.trim_on_init: 1
>>> kstat.zfs.misc.arcstats.deleted: 289783817
>>> kstat.zfs.misc.zio_trim.failed: 431
>>> kstat.zfs.misc.zio_trim.unsupported: 0
>>> kstat.zfs.misc.zio_trim.success: 6457142235
>>> kstat.zfs.misc.zio_trim.bytes: 88207753330688
>>>
>>>> Also, while you're seeing time-outs, what does the output from
>>>> gstat -d -p look like?
>>> I'll try to get that data but it may take a while.
>>>
>>> Thank you,
>>> Yamagi
>>>