From: Steven Hartland <killing@multiplay.co.uk>
To: Yamagi Burmeister
Cc: freebsd-scsi@freebsd.org
Subject: Re: Device timeouts(?) with LSI SAS3008 on mpr(4)
Date: Tue, 28 Jul 2015 09:45:38 +0100
Message-ID: <55B74132.6030909@multiplay.co.uk>
In-Reply-To: <20150727112504.9b2c4bef27953f1e3dd52123@yamagi.org>
List-Id: SCSI subsystem (freebsd-scsi@freebsd.org)

65536 is really small and will likely cause a bottleneck when TRIMing
large areas. I'd suggest a larger value, 2 - 4 MB, but at least matching
kern.geom.dev.delete_max_sectors (262144).

Regards
Steve

On 27/07/2015 10:25, Yamagi Burmeister wrote:
> Hello,
> let me apologise for my late answer. My colleagues were on vacation
> and I had no time to pursue this problem. But now another round:
>
> - da0 and da1 are 80G Intel DC S3500 SSDs
> - All other devices are 800G Intel DC S3700 SSDs
>
> kern.cam.da.X.delete_max seems very high for all devices. Forcing the
> tunable to a very conservative value of 65536 helps: no timeout in 72
> hours and no measurable performance impact. So my guess is:
>
> - ZFS tries to TRIM too many blocks in one operation
> - The SSD blocks for some time while processing the TRIM command
> - The controller thinks that the SSD crashed and sends a reset
>
> I did some tests with the attached tool. I'm able to reproduce the
> timeouts when "enough" data was written to the device.
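[Editorial aside, not part of the original post: the two tunables use
different units, which makes the values above easy to misread.
kern.cam.da.X.delete_max is in bytes, while
kern.geom.dev.delete_max_sectors counts sectors, assumed here to be the
usual 512-byte LBAs. A quick sketch of the conversion:]

```shell
# Sketch: convert kern.geom.dev.delete_max_sectors (counted in
# 512-byte sectors) into the byte units kern.cam.da.X.delete_max uses.
sectors=262144
bytes=$((sectors * 512))
echo "$bytes bytes = $((bytes / 1048576)) MiB"
```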
> The question is what's "enough" data. Sometimes 50G are enough and
> sometimes 75G can be trimmed without any problem.
>
> Nevertheless, a lower kern.cam.da.X.delete_max value helps to work
> around the problem, and everything else is just speculation. So:
> problem solved. Thank you for your help and input.
>
> Regards,
> Yamagi
>
> On Mon, 13 Jul 2015 10:54:41 +0100
> Steven Hartland wrote:
>
>> I assume da0 and da1 are a different disk then?
>>
>> With regard to your disk setup: are all of your disks SSDs? If so, why
>> do you have separate log and cache devices?
>>
>> One thing you could try is to limit the delete size.
>>
>> kern.geom.dev.delete_max_sectors limits the single request size allowed
>> by geom, but individual requests can then be built back up in cam, so I
>> don't think this will help you too much.
>>
>> Instead I would try limiting the individual device delete_max, so add
>> one line per disk into /boot/loader.conf of the form:
>> kern.cam.da.X.delete_max=1073741824
>>
>> You can actually change these on the fly using sysctl, but in order to
>> catch any cleanup done on boot, loader.conf is the best place to tune
>> them permanently.
>>
>> I've attached a little C util which you can use to do direct disk
>> deletes if you have a spare disk you can play with.
>>
>> Be aware that most controllers optimise deletes out if they know the
>> cells are empty, hence you do need to have written data to the sectors
>> each time you test a delete.
>>
>> As the requests go through geom, anything over
>> kern.geom.dev.delete_max_sectors will be split, but may well be
>> recombined in CAM.
>>
>> Another relevant setting is vfs.zfs.vdev.trim_max_active, which can be
>> used to limit the number of outstanding geom delete requests to each
>> device.
>>
>> Oh, one other thing: it would be interesting to see the output from
>> camcontrol identify, e.g.
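[Editorial aside, not part of the original post: the loader.conf advice
above can be sketched as the following fragment. The device numbers and
the 1 GiB value (1073741824 bytes) are illustrative, taken from the
suggestion in the quoted text; add one line per da(4) disk on your
system.]

```shell
# Hypothetical /boot/loader.conf fragment: cap the per-device
# delete size (in bytes) so a single TRIM cannot grow unbounded.
kern.cam.da.0.delete_max="1073741824"
kern.cam.da.1.delete_max="1073741824"

# The same cap can be applied on the fly (not boot-persistent):
#   sysctl kern.cam.da.0.delete_max=1073741824
```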
>> camcontrol identify da8
>> camcontrol identify da0
>>
>> Regards
>> Steve
>>
>> On 13/07/2015 10:25, Yamagi Burmeister wrote:
>>> On Mon, 13 Jul 2015 10:13:32 +0100
>>> Steven Hartland wrote:
>>>
>>>> What do you see from:
>>>> sysctl -a | grep -E '(delete|trim)'
>>> % sysctl -a | grep -E '(delete|trim)'
>>> kern.geom.dev.delete_max_sectors: 262144
>>> kern.cam.da.1.delete_max: 8589803520
>>> kern.cam.da.1.delete_method: ATA_TRIM
>>> kern.cam.da.8.delete_max: 12884705280
>>> kern.cam.da.8.delete_method: ATA_TRIM
>>> kern.cam.da.9.delete_max: 12884705280
>>> kern.cam.da.9.delete_method: ATA_TRIM
>>> kern.cam.da.3.delete_max: 12884705280
>>> kern.cam.da.3.delete_method: ATA_TRIM
>>> kern.cam.da.12.delete_max: 12884705280
>>> kern.cam.da.12.delete_method: ATA_TRIM
>>> kern.cam.da.7.delete_max: 12884705280
>>> kern.cam.da.7.delete_method: ATA_TRIM
>>> kern.cam.da.2.delete_max: 12884705280
>>> kern.cam.da.2.delete_method: ATA_TRIM
>>> kern.cam.da.11.delete_max: 12884705280
>>> kern.cam.da.11.delete_method: ATA_TRIM
>>> kern.cam.da.6.delete_max: 12884705280
>>> kern.cam.da.6.delete_method: ATA_TRIM
>>> kern.cam.da.10.delete_max: 12884705280
>>> kern.cam.da.10.delete_method: ATA_TRIM
>>> kern.cam.da.5.delete_max: 12884705280
>>> kern.cam.da.5.delete_method: ATA_TRIM
>>> kern.cam.da.0.delete_max: 8589803520
>>> kern.cam.da.0.delete_method: ATA_TRIM
>>> kern.cam.da.4.delete_max: 12884705280
>>> kern.cam.da.4.delete_method: ATA_TRIM
>>> vfs.zfs.trim.max_interval: 1
>>> vfs.zfs.trim.timeout: 30
>>> vfs.zfs.trim.txg_delay: 32
>>> vfs.zfs.trim.enabled: 1
>>> vfs.zfs.vdev.trim_max_pending: 10000
>>> vfs.zfs.vdev.bio_delete_disable: 0
>>> vfs.zfs.vdev.trim_max_active: 64
>>> vfs.zfs.vdev.trim_min_active: 1
>>> vfs.zfs.vdev.trim_on_init: 1
>>> kstat.zfs.misc.arcstats.deleted: 289783817
>>> kstat.zfs.misc.zio_trim.failed: 431
>>> kstat.zfs.misc.zio_trim.unsupported: 0
>>> kstat.zfs.misc.zio_trim.success: 6457142235
>>> kstat.zfs.misc.zio_trim.bytes: 88207753330688
>>>
>>>> Also, while you're seeing time-outs, what does the output from
>>>> gstat -d -p look like?
>>> I'll try to get that data but it may take a while.
>>>
>>> Thank you,
>>> Yamagi
>>>