From owner-freebsd-scsi@freebsd.org Mon Jul 13 09:02:05 2015
Return-Path:
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id B988D37DD
 for ; Mon, 13 Jul 2015 09:02:05 +0000 (UTC)
 (envelope-from lists@yamagi.org)
Received: from mail1.yamagi.org (yugo.yamagi.org [212.48.122.103])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 7DF571D83
 for ; Mon, 13 Jul 2015 09:02:04 +0000 (UTC)
 (envelope-from lists@yamagi.org)
Received: from [192.168.100.101] (helo=aka)
 by mail1.yamagi.org with esmtpsa (TLSv1:DHE-RSA-AES256-SHA:256)
 (Exim 4.85 (FreeBSD)) (envelope-from ) id 1ZEZcY-000GsV-4I
 for freebsd-scsi@freebsd.org; Mon, 13 Jul 2015 11:01:55 +0200
Date: Mon, 13 Jul 2015 11:01:48 +0200
From: Yamagi Burmeister
To: freebsd-scsi@freebsd.org
Subject: Re: Device timeouts(?) with LSI SAS3008 on mpr(4)
Message-Id: <20150713110148.1a27b973881b64ce2f9e98e0@yamagi.org>
In-Reply-To: <20150707132416.71b44c90f7f4cd6014a304b2@yamagi.org>
References: <20150707132416.71b44c90f7f4cd6014a304b2@yamagi.org>
X-Mailer: Sylpheed 3.4.2 (GTK+ 2.24.27; amd64-portbld-freebsd10.0)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 13 Jul 2015 09:02:05 -0000

Hello,
after some fiddling and testing I managed to track this down. TRIM is
the culprit:

- With vfs.zfs.trim.enabled set to 1, timeouts occur regardless of
  cabling, with a backplane or with a direct connection. It doesn't
  matter whether Intel DC S3500 or S3700 SSDs are connected, but on the
  other hand both share the same controller.
  I don't have enough onboard S-ATA ports to test the whole setup
  without the 9300-8i HBA, but a short (maybe too short and without
  enough load) test with 6 SSDs didn't show any timeouts.

- With vfs.zfs.trim.enabled set to 0 I haven't seen a single timeout
  for ~56 hours.

Regards,
Yamagi

On Tue, 7 Jul 2015 13:24:16 +0200
Yamagi Burmeister wrote:

> Hello,
> I've got 3 new Supermicro servers based upon the X10DRi-LN4+ platform.
> Each server is equipped with 2 LSI SAS9300-8i-SQL SAS adapters. Each
> adapter serves 8 Intel DC S3700 SSDs. Operating system is 10.1-STABLE
> as of r283938 on 2 servers and r285196 on the last one.
>
> The controllers identify themselves as:
>
> ----
>
> mpr0: port 0x6000-0x60ff mem
> 0xc7240000-0xc724ffff,0xc7200000-0xc723ffff irq 32 at device 0.0 on
> pci2 mpr0: IOCFacts : MsgVersion: 0x205
> HeaderVersion: 0x2300
> IOCNumber: 0
> IOCExceptions: 0x0
> MaxChainDepth: 128
> NumberOfPorts: 1
> RequestCredit: 10240
> ProductID: 0x2221
> IOCRequestFrameSize: 32
> MaxInitiators: 32
> MaxTargets: 1024
> MaxSasExpanders: 42
> MaxEnclosures: 43
> HighPriorityCredit: 128
> MaxReplyDescriptorPostQueueDepth: 65504
> ReplyFrameSize: 32
> MaxVolumes: 0
> MaxDevHandle: 1106
> MaxPersistentEntries: 128
> mpr0: Firmware: 08.00.00.00, Driver: 09.255.01.00-fbsd
> mpr0: IOCCapabilities:
> 7a85c
>
> ----
>
> 08.00.00.00 is the latest available firmware.
>
>
> Since day one 'dmesg' has been cluttered with CAM errors:
>
> ----
>
> mpr1: Sending reset from mprsas_send_abort for target ID 5
> (da11:mpr1:0:5:0): WRITE(10). CDB: 2a 00 4c 15 1f 88 00 00 08
> 00 length 4096 SMID 554 terminated ioc 804b scsi 0 state c xfer 0
> (da11:mpr1:0:5:0): ATA COMMAND PASS THROUGH(16). CDB: 85 0d 06 00 01 00
> 01 00 00 00 00 00 00 40 06 00 length 512 SMID 506 ter(da11:mpr1:0:5:0):
> READ(10). CDB: 28 00 4c 2b 95 c0 00 00 10 00 minated ioc 804b scsi 0
> state c xfer 0 (da11:mpr1:0:5:0): CAM status: Command timeout mpr1:
> (da11:Unfreezing devq for target ID 5 mpr1:0:5:0): Retrying command
> (da11:mpr1:0:5:0): READ(10). CDB: 28 00 4c 2b 95 c0 00 00 10 00
> (da11:mpr1:0:5:0): CAM status: SCSI Status Error (da11:mpr1:0:5:0):
> SCSI status: Check Condition (da11:mpr1:0:5:0): SCSI sense: UNIT
> ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
> (da11:mpr1:0:5:0): Retrying command (per sense data) (da11:mpr1:0:5:0):
> READ(10). CDB: 28 00 4c 22 b5 b8 00 00 18 00 (da11:mpr1:0:5:0): CAM
> status: SCSI Status Error (da11:mpr1:0:5:0): SCSI status: Check
> Condition (da11:mpr1:0:5:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power
> on, reset, or bus device reset occurred) (da11:mpr1:0:5:0): Retrying
> command (per sense data) (noperiph:mpr1:0:4294967295:0): SMID 2
> Aborting command 0xfffffe0001601a30
>
> mpr1: Sending reset from mprsas_send_abort for target ID 2
> (da8:mpr1:0:2:0): WRITE(10). CDB: 2a 00 59 81 ae 18 00 00 30 00
> length 24576 SMID 898 terminated ioc 804b scsi 0 state c xfer 0
> (da8:mpr1:0:2:0): READ(10). CDB: 28 00 59 77 cc e0 00 00 18 00 length
> 12288 SMID 604 terminated ioc 804b scsi 0 state c xfer 0 mpr1:
> Unfreezing devq for target ID 2 (da8:mpr1:0:2:0): ATA COMMAND PASS
> THROUGH(16). CDB: 85 0d 06 00 01 00 01 00 00 00 00 00 00 40 06 00
> (da8:mpr1:0:2:0): CAM status: Command timeout (da8:mpr1:0:2:0):
> Retrying command (da8:mpr1:0:2:0): WRITE(10). CDB: 2a 00 59 81 ae 18 00
> 00 30 00 (da8:mpr1:0:2:0): CAM status: SCSI Status Error
> (da8:mpr1:0:2:0): SCSI status: Check Condition (da8:mpr1:0:2:0): SCSI
> sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset
> occurred) (da8:mpr1:0:2:0): Retrying command (per sense data)
> (da8:mpr1:0:2:0): READ(10). CDB: 28 00 59 41 3d 08 00 00 10 00
> (da8:mpr1:0:2:0): CAM status: SCSI Status Error (da8:mpr1:0:2:0): SCSI
> status: Check Condition (da8:mpr1:0:2:0): SCSI sense: UNIT ATTENTION
> asc:29,0 (Power on, reset, or bus device reset occurred)
> (da8:mpr1:0:2:0): Retrying command (per sense data)
> (noperiph:mpr1:0:4294967295:0): SMID 3 Aborting command
> 0xfffffe000160b660
>
> ----
>
> ZFS doesn't like this and sees read errors or even write errors. In
> extreme cases the device is marked as FAULTED:
>
> ----
>
> pool: examplepool
> state: DEGRADED
> status: One or more devices are faulted in response to persistent
> errors. Sufficient replicas exist for the pool to continue functioning
> in a degraded state.
> action: Replace the faulted device, or use 'zpool clear' to mark the
> device repaired.
> scan: none requested
> config:
>
> NAME           STATE     READ WRITE CKSUM
> examplepool    DEGRADED     0     0     0
>   raidz1-0     ONLINE       0     0     0
>     da3p1      ONLINE       0     0     0
>     da4p1      ONLINE       0     0     0
>     da5p1      ONLINE       0     0     0
> logs
>   da1p1        FAULTED      3     0     0  too many errors
> cache
>   da1p2        FAULTED      3     0     0  too many errors
> spares
>   da2p1        AVAIL
>
> errors: No known data errors
>
> ----
>
> The problems arise on all 3 machines and all SSDs nearly daily, so I
> highly suspect a software issue. Does anyone have an idea what's going
> on and what I can do to solve these problems? More information can be
> provided if necessary.
> > Regards, > Yamagi > > -- > Homepage: www.yamagi.org > XMPP: yamagi@yamagi.org > GnuPG/GPG: 0xEFBCCBCB > _______________________________________________ > freebsd-scsi@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-scsi > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org" -- Homepage: www.yamagi.org XMPP: yamagi@yamagi.org GnuPG/GPG: 0xEFBCCBCB From owner-freebsd-scsi@freebsd.org Mon Jul 13 09:13:44 2015 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 295863B61 for ; Mon, 13 Jul 2015 09:13:44 +0000 (UTC) (envelope-from killing@multiplay.co.uk) Received: from mail-wi0-f172.google.com (mail-wi0-f172.google.com [209.85.212.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B6A041AC2 for ; Mon, 13 Jul 2015 09:13:43 +0000 (UTC) (envelope-from killing@multiplay.co.uk) Received: by widjy10 with SMTP id jy10so63227769wid.1 for ; Mon, 13 Jul 2015 02:13:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-type :content-transfer-encoding; bh=zkq16H8nA/qpUnQGNwAkakfjlS/OY6v1xBWP7jP3yKM=; b=Z8VBn4PJ3clz2ODhJ1SnaO9W9I+auM9Ui4ueWvB48BvirFR3GDuPAoc9IW6j2gfy9n kyq3KNPOrbyynRrrrvMzQgTebVbIOJmGV1ZOpE1gBbbYUGvs/JhUrow1ND+X7ttunMhp +ztvY44qb58ud6ydFauXKOQ3tX/5QegXscdDQ/GF9F7fnyhZWXewOc+nG7wMPM9bBIdB x4H8Aqw5FnQfLs/Dd+QguO9eh9OOYDxh2UGgeMl6gUGFxl9oKHsvLZZWVgt/3Ijlizun ND57byZNoMoW8bmWoiy6s86MUq29jDJBxNnygm3c0PjHNysnqFgh1TC08NGm17RFuiYb U/1Q== X-Gm-Message-State: ALoCoQk/kf/Y+V1RMwVqS0L7ooE4f/xZyIne5LGUNwNg4mTeBjugSLc3Tt7NjFw7fmzSPLfjl1DN X-Received: by 10.180.106.137 with SMTP id 
 gu9mr21729908wib.54.1436778816206; Mon, 13 Jul 2015 02:13:36 -0700 (PDT)
Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk. [82.69.141.170])
 by smtp.gmail.com with ESMTPSA id k5sm13490519wij.1.2015.07.13.02.13.35
 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 13 Jul 2015 02:13:35 -0700 (PDT)
Subject: Re: Device timeouts(?) with LSI SAS3008 on mpr(4)
To: freebsd-scsi@freebsd.org
References: <20150707132416.71b44c90f7f4cd6014a304b2@yamagi.org>
 <20150713110148.1a27b973881b64ce2f9e98e0@yamagi.org>
From: Steven Hartland
Message-ID: <55A3813C.7010002@multiplay.co.uk>
Date: Mon, 13 Jul 2015 10:13:32 +0100
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:38.0) Gecko/20100101 Thunderbird/38.0.1
MIME-Version: 1.0
In-Reply-To: <20150713110148.1a27b973881b64ce2f9e98e0@yamagi.org>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 13 Jul 2015 09:13:44 -0000

That would indicate that TRIM on your disks is causing a problem,
possibly a firmware bug causing TRIM requests to take an excessively
long time to complete.

What do you see from:
sysctl -a | grep -E '(delete|trim)'

Also, while you're seeing time-outs, what does the output from
gstat -d -p
look like?

Regards
Steve

On 13/07/2015 10:01, Yamagi Burmeister wrote:
> Hello,
> after some fiddling and testing I managed to track this down. TRIM is
> the culprit:
>
> - With vfs.zfs.trim.enabled set to 1, timeouts occur regardless of
> cabling, with a backplane or with a direct connection. It doesn't
> matter whether Intel DC S3500 or S3700 SSDs are connected, but on the
> other hand both share the same controller.
I don't have enough onboard S-ATA > ports to test the whole setup without the 9300-8i HBA, but a short > (maybe too short and without enough load) test with 6 SSDs didn't show > any timeouts. > > - With vfs.zfs.trim.enabled set to 0 I havn't seen a single timeout > for ~56 hours. > > Regards, > Yamagi > > On Tue, 7 Jul 2015 13:24:16 +0200 > Yamagi Burmeister wrote: > >> Hello, >> I've got 3 new Supermicro servers based upon the X10DRi-LN4+ platform. >> Each server is equiped with 2 LSI SAS9300-8i-SQL SAS adapters. Each >> adapter serves 8 Intel DC S3700 SSDs. Operating system is 10.1-STABLE >> as of r283938 on 2 servers and r285196 on the last one. >> >> The controller identify themself as: >> >> ---- >> >> mpr0: port 0x6000-0x60ff mem >> 0xc7240000-0xc724ffff,0xc7200000-0xc723ffff irq 32 at device 0.0 on >> pci2 mpr0: IOCFacts : MsgVersion: 0x205 >> HeaderVersion: 0x2300 >> IOCNumber: 0 >> IOCExceptions: 0x0 >> MaxChainDepth: 128 >> NumberOfPorts: 1 >> RequestCredit: 10240 >> ProductID: 0x2221 >> IOCRequestFrameSize: 32 >> MaxInitiators: 32 >> MaxTargets: 1024 >> MaxSasExpanders: 42 >> MaxEnclosures: 43 >> HighPriorityCredit: 128 >> MaxReplyDescriptorPostQueueDepth: 65504 >> ReplyFrameSize: 32 >> MaxVolumes: 0 >> MaxDevHandle: 1106 >> MaxPersistentEntries: 128 >> mpr0: Firmware: 08.00.00.00, Driver: 09.255.01.00-fbsd >> mpr0: IOCCapabilities: >> 7a85c >> >> ---- >> >> 08.00.00.00 is the last available firmware. >> >> >> Since day one 'dmesg' is cluttered with CAM errors: >> >> ---- >> >> mpr1: Sending reset from mprsas_send_abort for target ID 5 >> (da11:mpr1:0:5:0): WRITE(10). CDB: 2a 00 4c 15 1f 88 00 00 08 >> 00 length 4096 SMID 554 terminated ioc 804b scsi 0 state c xfer 0 >> (da11:mpr1:0:5:0): ATA COMMAND PASS THROUGH(16). CDB: 85 0d 06 00 01 00 >> 01 00 00 00 00 00 00 40 06 00 length 512 SMID 506 ter(da11:mpr1:0:5:0): >> READ(10). 
CDB: 28 00 4c 2b 95 c0 00 00 10 00 minated ioc 804b scsi 0 >> state c xfer 0 (da11:mpr1:0:5:0): CAM status: Command timeout mpr1: >> (da11:Unfreezing devq for target ID 5 mpr1:0:5:0): Retrying command >> (da11:mpr1:0:5:0): READ(10). CDB: 28 00 4c 2b 95 c0 00 00 10 00 >> (da11:mpr1:0:5:0): CAM status: SCSI Status Error (da11:mpr1:0:5:0): >> SCSI status: Check Condition (da11:mpr1:0:5:0): SCSI sense: UNIT >> ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred) >> (da11:mpr1:0:5:0): Retrying command (per sense data) (da11:mpr1:0:5:0): >> READ(10). CDB: 28 00 4c 22 b5 b8 00 00 18 00 (da11:mpr1:0:5:0): CAM >> status: SCSI Status Error (da11:mpr1:0:5:0): SCSI status: Check >> Condition (da11:mpr1:0:5:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power >> on, reset, or bus device reset occurred) (da11:mpr1:0:5:0): Retrying >> command (per sense data) (noperiph:mpr1:0:4294967295:0): SMID 2 >> Aborting command 0xfffffe0001601a30 >> >> mpr1: Sending reset from mprsas_send_abort for target ID 2 >> (da8:mpr1:0:2:0): WRITE(10). CDB: 2a 00 59 81 ae 18 00 00 30 00 >> length 24576 SMID 898 terminated ioc 804b scsi 0 state c xfer 0 >> (da8:mpr1:0:2:0): READ(10). CDB: 28 00 59 77 cc e0 00 00 18 00 length >> 12288 SMID 604 terminated ioc 804b scsi 0 state c xfer 0 mpr1: >> Unfreezing devq for target ID 2 (da8:mpr1:0:2:0): ATA COMMAND PASS >> THROUGH(16). CDB: 85 0d 06 00 01 00 01 00 00 00 00 00 00 40 06 00 >> (da8:mpr1:0:2:0): CAM status: Command timeout (da8:mpr1:0:2:0): >> Retrying command (da8:mpr1:0:2:0): WRITE(10). CDB: 2a 00 59 81 ae 18 00 >> 00 30 00 (da8:mpr1:0:2:0): CAM status: SCSI Status Error >> (da8:mpr1:0:2:0): SCSI status: Check Condition (da8:mpr1:0:2:0): SCSI >> sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset >> occurred) (da8:mpr1:0:2:0): Retrying command (per sense data) >> (da8:mpr1:0:2:0): READ(10). 
CDB: 28 00 59 41 3d 08 00 00 10 00 >> (da8:mpr1:0:2:0): CAM status: SCSI Status Error (da8:mpr1:0:2:0): SCSI >> status: Check Condition (da8:mpr1:0:2:0): SCSI sense: UNIT ATTENTION >> asc:29,0 (Power on, reset, or bus device reset occurred) >> (da8:mpr1:0:2:0): Retrying command (per sense data) >> (noperiph:mpr1:0:4294967295:0): SMID 3 Aborting command >> 0xfffffe000160b660 >> >> ---- >> >> ZFS doesn't like this and sees read errors or even write errors. In >> extreme cases the device is marked as FAULTED: >> >> ---- >> >> pool: examplepool >> state: DEGRADED >> status: One or more devices are faulted in response to persistent >> errors. Sufficient replicas exist for the pool to continue functioning >> in a degraded state. >> action: Replace the faulted device, or use 'zpool clear' to mark the >> device repaired. >> scan: none requested >> config: >> >> NAME STATE READ WRITE CKSUM >> examplepool DEGRADED 0 0 0 >> raidz1-0 ONLINE 0 0 0 >> da3p1 ONLINE 0 0 0 >> da4p1 ONLINE 0 0 0 >> da5p1 ONLINE 0 0 0 >> logs >> da1p1 FAULTED 3 0 0 too many errors >> cache >> da1p2 FAULTED 3 0 0 too many errors >> spares >> da2p1 AVAIL >> >> errors: No known data errors >> >> ---- >> >> The problems arise on all 3 machines all all SSDs nearly daily. So I >> highly suspect a software issue. Has anyone an idea what's going on and >> what I can do to solve this problems? More information can be provided >> if necessary. 
>> >> Regards, >> Yamagi >> >> -- >> Homepage: www.yamagi.org >> XMPP: yamagi@yamagi.org >> GnuPG/GPG: 0xEFBCCBCB >> _______________________________________________ >> freebsd-scsi@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi >> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org" > From owner-freebsd-scsi@freebsd.org Mon Jul 13 09:26:00 2015 Return-Path: Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 06CC23F21 for ; Mon, 13 Jul 2015 09:26:00 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail1.yamagi.org (yugo.yamagi.org [212.48.122.103]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id C03EF1F98 for ; Mon, 13 Jul 2015 09:25:58 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from [192.168.100.101] (helo=aka) by mail1.yamagi.org with esmtpsa (TLSv1:DHE-RSA-AES256-SHA:256) (Exim 4.85 (FreeBSD)) (envelope-from ) id 1ZEZzl-000H9a-3F; Mon, 13 Jul 2015 11:25:54 +0200 Date: Mon, 13 Jul 2015 11:25:47 +0200 From: Yamagi Burmeister To: killing@multiplay.co.uk Cc: freebsd-scsi@freebsd.org Subject: Re: Device timeouts(?) 
 with LSI SAS3008 on mpr(4)
Message-Id: <20150713112547.8f044beabe26672fd13fc528@yamagi.org>
In-Reply-To: <55A3813C.7010002@multiplay.co.uk>
References: <20150707132416.71b44c90f7f4cd6014a304b2@yamagi.org>
 <20150713110148.1a27b973881b64ce2f9e98e0@yamagi.org>
 <55A3813C.7010002@multiplay.co.uk>
X-Mailer: Sylpheed 3.4.2 (GTK+ 2.24.27; amd64-portbld-freebsd10.0)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 13 Jul 2015 09:26:00 -0000

On Mon, 13 Jul 2015 10:13:32 +0100
Steven Hartland wrote:

> What do you see from:
> sysctl -a | grep -E '(delete|trim)'

% sysctl -a | grep -E '(delete|trim)'
kern.geom.dev.delete_max_sectors: 262144
kern.cam.da.1.delete_max: 8589803520
kern.cam.da.1.delete_method: ATA_TRIM
kern.cam.da.8.delete_max: 12884705280
kern.cam.da.8.delete_method: ATA_TRIM
kern.cam.da.9.delete_max: 12884705280
kern.cam.da.9.delete_method: ATA_TRIM
kern.cam.da.3.delete_max: 12884705280
kern.cam.da.3.delete_method: ATA_TRIM
kern.cam.da.12.delete_max: 12884705280
kern.cam.da.12.delete_method: ATA_TRIM
kern.cam.da.7.delete_max: 12884705280
kern.cam.da.7.delete_method: ATA_TRIM
kern.cam.da.2.delete_max: 12884705280
kern.cam.da.2.delete_method: ATA_TRIM
kern.cam.da.11.delete_max: 12884705280
kern.cam.da.11.delete_method: ATA_TRIM
kern.cam.da.6.delete_max: 12884705280
kern.cam.da.6.delete_method: ATA_TRIM
kern.cam.da.10.delete_max: 12884705280
kern.cam.da.10.delete_method: ATA_TRIM
kern.cam.da.5.delete_max: 12884705280
kern.cam.da.5.delete_method: ATA_TRIM
kern.cam.da.0.delete_max: 8589803520
kern.cam.da.0.delete_method: ATA_TRIM
kern.cam.da.4.delete_max: 12884705280
kern.cam.da.4.delete_method: ATA_TRIM
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vdev.trim_max_pending: 10000
vfs.zfs.vdev.bio_delete_disable: 0
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.trim_on_init: 1
kstat.zfs.misc.arcstats.deleted: 289783817
kstat.zfs.misc.zio_trim.failed: 431
kstat.zfs.misc.zio_trim.unsupported: 0
kstat.zfs.misc.zio_trim.success: 6457142235
kstat.zfs.misc.zio_trim.bytes: 88207753330688

> Also, while you're seeing time-outs, what does the output from
> gstat -d -p look like?

I'll try to get that data but it may take a while.

Thank you,
Yamagi

--
Homepage: www.yamagi.org
XMPP: yamagi@yamagi.org
GnuPG/GPG: 0xEFBCCBCB

From owner-freebsd-scsi@freebsd.org Mon Jul 13 09:54:53 2015
Return-Path:
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 327A7998073
 for ; Mon, 13 Jul 2015 09:54:53 +0000 (UTC)
 (envelope-from killing@multiplay.co.uk)
Received: from mail-wg0-f53.google.com (mail-wg0-f53.google.com [74.125.82.53])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id C301D1DCD
 for ; Mon, 13 Jul 2015 09:54:52 +0000 (UTC)
 (envelope-from killing@multiplay.co.uk)
Received: by wgxm20 with SMTP id m20so108506804wgx.3
 for ; Mon, 13 Jul 2015 02:54:45 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-type;
 bh=nTBbimBKYqTiEFIo/TCSrruHKZhhLCZWcgQTaNBQ9FE=;
 b=P1UoBIYuPNk9giVt2i1iokb0uTTEXPJUefDfYRSSjQd0q+pefPFmyi7KPI2tX5z0Vo
 xE18Twz0H1QlZIDt52yIv6CTwRRrYIwpwT1adjmDoR4HIZff8BWtkchrYzS/keBJULnQ
 bjQ+FrhRAvr2FMpQVkzz4qfr5EUQ/R+/K7qFggnMGaVEgVMp2Sj8G0hkwsMzcCAiAiFA
 ETu8Xsf2YGthb4bfwigM8siBu8bZ/W9G2gtCWHu0eOaMgbpm2P6V2I98W9VKSAaVxt1B
 7mVStTGTB6/oNF279W6dfPvA6P5Fl4PiyDIDyeBAjHGxw0eAnw5f0LzUZrGwK6KWdCWq
 ddHw==
X-Gm-Message-State: ALoCoQktfuq0SIngVo8eqg97Z0CildyArpOrPc0PUCSpXPD5mvtnC9ixu6pXErmPOh2uuxCv04P3
X-Received: by 10.194.178.99 with SMTP id cx3mr64113623wjc.33.1436781285426;
 Mon, 13 Jul 2015 02:54:45 -0700 (PDT)
Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk. [82.69.141.170])
 by smtp.gmail.com with ESMTPSA id q3sm28192965wjr.38.2015.07.13.02.54.43
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Mon, 13 Jul 2015 02:54:43 -0700 (PDT)
Subject: Re: Device timeouts(?) with LSI SAS3008 on mpr(4)
To: Yamagi Burmeister
References: <20150707132416.71b44c90f7f4cd6014a304b2@yamagi.org>
 <20150713110148.1a27b973881b64ce2f9e98e0@yamagi.org>
 <55A3813C.7010002@multiplay.co.uk>
 <20150713112547.8f044beabe26672fd13fc528@yamagi.org>
Cc: freebsd-scsi@freebsd.org
From: Steven Hartland
Message-ID: <55A38AE1.5010204@multiplay.co.uk>
Date: Mon, 13 Jul 2015 10:54:41 +0100
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:38.0) Gecko/20100101 Thunderbird/38.0.1
MIME-Version: 1.0
In-Reply-To: <20150713112547.8f044beabe26672fd13fc528@yamagi.org>
Content-Type: multipart/mixed; boundary="------------020206030603020309090600"
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 13 Jul 2015 09:54:53 -0000

This is a multi-part message in MIME format.
--------------020206030603020309090600
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit

I assume da0 and da1 are different disks then?

With regard to your disk setup: are all of your disks SSDs? If so, why
do you have separate log and cache devices?

One thing you could try is to limit the delete size.
kern.geom.dev.delete_max_sectors limits the size of a single request
allowed by geom, but individual requests can then be built back up in
CAM, so I don't think this will help you much.

Instead I would try limiting the individual device delete_max, so add
one line per disk to /boot/loader.conf of the form:
kern.cam.da.X.delete_max=1073741824

You can actually change these on the fly using sysctl, but in order to
catch any cleanup done on boot, loader.conf is the best place to tune
them permanently.

I've attached a little C util which you can use to do direct disk
deletes if you have a spare disk you can play with. Be aware that most
controllers optimise deletes out if they know the cells are empty, hence
you need to have written data to the sectors each time you test a
delete.

As the requests go through geom, anything over
kern.geom.dev.delete_max_sectors will be split, but may well be
recombined in CAM.

Another relevant setting is vfs.zfs.vdev.trim_max_active, which can be
used to limit the number of outstanding geom delete requests to each
device.

Oh, one other thing: it would be interesting to see the output from
camcontrol identify, e.g.
camcontrol identify da8
camcontrol identify da0

Regards
Steve

On 13/07/2015 10:25, Yamagi Burmeister wrote:
> On Mon, 13 Jul 2015 10:13:32 +0100
> Steven Hartland wrote:
>
>> What do you see from:
>> sysctl -a | grep -E '(delete|trim)'
> % sysctl -a | grep -E '(delete|trim)'
> kern.geom.dev.delete_max_sectors: 262144
> kern.cam.da.1.delete_max: 8589803520
> kern.cam.da.1.delete_method: ATA_TRIM
> kern.cam.da.8.delete_max: 12884705280
> kern.cam.da.8.delete_method: ATA_TRIM
> kern.cam.da.9.delete_max: 12884705280
> kern.cam.da.9.delete_method: ATA_TRIM
> kern.cam.da.3.delete_max: 12884705280
> kern.cam.da.3.delete_method: ATA_TRIM
> kern.cam.da.12.delete_max: 12884705280
> kern.cam.da.12.delete_method: ATA_TRIM
> kern.cam.da.7.delete_max: 12884705280
> kern.cam.da.7.delete_method: ATA_TRIM
> kern.cam.da.2.delete_max: 12884705280
> kern.cam.da.2.delete_method: ATA_TRIM
> kern.cam.da.11.delete_max: 12884705280
> kern.cam.da.11.delete_method: ATA_TRIM
> kern.cam.da.6.delete_max: 12884705280
> kern.cam.da.6.delete_method: ATA_TRIM
> kern.cam.da.10.delete_max: 12884705280
> kern.cam.da.10.delete_method: ATA_TRIM
> kern.cam.da.5.delete_max: 12884705280
> kern.cam.da.5.delete_method: ATA_TRIM
> kern.cam.da.0.delete_max: 8589803520
> kern.cam.da.0.delete_method: ATA_TRIM
> kern.cam.da.4.delete_max: 12884705280
> kern.cam.da.4.delete_method: ATA_TRIM
> vfs.zfs.trim.max_interval: 1
> vfs.zfs.trim.timeout: 30
> vfs.zfs.trim.txg_delay: 32
> vfs.zfs.trim.enabled: 1
> vfs.zfs.vdev.trim_max_pending: 10000
> vfs.zfs.vdev.bio_delete_disable: 0
> vfs.zfs.vdev.trim_max_active: 64
> vfs.zfs.vdev.trim_min_active: 1
> vfs.zfs.vdev.trim_on_init: 1
> kstat.zfs.misc.arcstats.deleted: 289783817
> kstat.zfs.misc.zio_trim.failed: 431
> kstat.zfs.misc.zio_trim.unsupported: 0
> kstat.zfs.misc.zio_trim.success: 6457142235
> kstat.zfs.misc.zio_trim.bytes: 88207753330688
>
>
>> Also, while you're seeing time-outs, what does the output from
>> gstat -d -p look like?
> I'll try to get that data but it may take a while.
>
> Thank you,
> Yamagi
>

--------------020206030603020309090600
Content-Type: text/plain; charset=UTF-8; name="ioctl-delete.c"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="ioctl-delete.c"

/*
 * ioctl-delete: issue one delete (TRIM) request to a disk and time it.
 * offset and size are given in sectors.
 * Build on FreeBSD with: cc -o ioctl-delete ioctl-delete.c -lutil
 */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/disk.h>
#include <sys/time.h>
#include <err.h>
#include <fcntl.h>
#include <libutil.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void
syntax(void)
{
	fprintf(stderr, "ioctl-delete <device> <offset> <size>\n");
	exit(1);
}

/* Return t2 - t1 in seconds. */
static double
timediff(const struct timeval *t1, const struct timeval *t2)
{
	double ret;

	ret = t2->tv_sec - t1->tv_sec;
	ret += (t2->tv_usec - t1->tv_usec) * 0.000001;

	return ret;
}

int
main(int argc, char **argv)
{
	off_t ioarg[2];
	int fd;
	off_t offset, size;
	char *device;
	struct timeval start, end;
	double tdiff;
	char buf[8];
	uint64_t bsec;
	unsigned int sector_size = 512;

	if (4 != argc)
		syntax();

	device = argv[1];
	offset = strtoul(argv[2], NULL, 10);
	size = strtoul(argv[3], NULL, 10);
	fprintf(stderr, "deleting: %jd, %jd\n", (intmax_t)offset,
	    (intmax_t)size);

	if ((fd = open(device, O_RDWR)) < 0)
		err(1, "failed to open device '%s'", device);

	/* Query the sector size so sector counts can be converted to bytes. */
	if (ioctl(fd, DIOCGSECTORSIZE, &sector_size) < 0)
		err(1, "failed to get sector size");

	/* DIOCGDELETE takes a byte offset and a byte length. */
	ioarg[0] = offset * sector_size;
	ioarg[1] = size * sector_size;

	gettimeofday(&start, NULL);
	if (ioctl(fd, DIOCGDELETE, ioarg) < 0)
		err(1, "delete failed");
	gettimeofday(&end, NULL);

	tdiff = timediff(&start, &end);
	bsec = (uint64_t)((long double)ioarg[1] / tdiff);
	humanize_number(buf, sizeof(buf), bsec, "/s", HN_AUTOSCALE,
	    HN_B | HN_NOSPACE | HN_DECIMAL);
	printf("deleted %jd bytes in %f seconds, %ju bytes per second (%s)\n",
	    (intmax_t)ioarg[1], tdiff, (uintmax_t)bsec, buf);

	close(fd);
	return (0);
}

--------------020206030603020309090600--

From owner-freebsd-scsi@freebsd.org Tue Jul 14 15:24:53 2015
Return-Path:
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A09189A1CDF
 for ; Tue, 14 Jul 2015 15:24:53 +0000 (UTC)
 (envelope-from etnapierala@gmail.com)
Received: from
mail-lb0-x232.google.com (mail-lb0-x232.google.com [IPv6:2a00:1450:4010:c04::232]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0FFC2942 for ; Tue, 14 Jul 2015 15:24:53 +0000 (UTC) (envelope-from etnapierala@gmail.com) Received: by lbbyj8 with SMTP id yj8so8335527lbb.0 for ; Tue, 14 Jul 2015 08:24:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:mail-followup-to :references:mime-version:content-type:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=wLKpmhbCNirTyjpwteqzu3xyC6KSWOc8d8/5ZQCQFXM=; b=f0xQ2ISolGpmbgx06grLRnyVMxggZevK96Hsklb4eK8iYahGDhMUw80miS6fuK1tBm 7xUCBLs3PGt2obsXLm0/xOTathN3QjHIGJyPe8Qx3U+lsl+8GYupBGEfdmpUDgAh3F5k qhxtxLCY3Vw9Z9KpVDDexg0lSqJRz86VLqqjY/DnCssl+ABhd0jN76CQyMovJ7gvL6DT Sd36ZgO0gFbC3o/XsQTiLv7YRd5juFWwvouBwbyBYjvRB6NC+vm8BPlklUcBZhJSIsVL b7fVgdtKz1NSvX2pFs96t/oZM1q5q8tznIKIfftjasvtMdVHSLEPmqOvbnOuLNOyuFft Gvtg== X-Received: by 10.112.221.9 with SMTP id qa9mr38316621lbc.23.1436887490383; Tue, 14 Jul 2015 08:24:50 -0700 (PDT) Received: from brick ([46.229.149.194]) by smtp.gmail.com with ESMTPSA id r2sm360958laj.6.2015.07.14.08.24.48 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 14 Jul 2015 08:24:48 -0700 (PDT) Sender: =?UTF-8?Q?Edward_Tomasz_Napiera=C5=82a?= Date: Tue, 14 Jul 2015 17:24:46 +0200 From: Edward Tomasz =?utf-8?Q?Napiera=C5=82a?= To: Andrea Brancatelli Cc: freebsd-scsi@freebsd.org Subject: Re: New iSCSI initiator with Dell TL2000 Message-ID: <20150714152446.GB3390@brick> Mail-Followup-To: Andrea Brancatelli , freebsd-scsi@freebsd.org References: <54F86F91.80603@schema31.it> <20150305170842.GA12103@brick.home> <555DA040.2020901@schema31.it> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline 
Content-Transfer-Encoding: 8bit In-Reply-To: <555DA040.2020901@schema31.it> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Jul 2015 15:24:53 -0000 Sorry for delay. The debug log was quite useful - it shows that the target drops the connection after the first PDU. Unfortunately, I didn't have a slightest idea what could actually be wrong there. The patch below is a shoot in the dark. Could you patch the source tree, rebuild and reinstall world, run iscsid with debug, just like before, and see if it makes any difference? (You actually only need a new iscsid binary; if you don't want to rebuild everything just tell me, I'll send you the executable.) If it doesn't - could you make a packet dump of a working session establishment (ie with iscontrol), using tcpdump like below (substitute em0 with your network interface) and send it to me: tcpdump -s0 port 3260 -i em0 -w meh.pcap Index: usr.sbin/iscsid/discovery.c =================================================================== --- usr.sbin/iscsid/discovery.c (revision 285383) +++ usr.sbin/iscsid/discovery.c (working copy) @@ -89,7 +89,7 @@ text_new_request(struct connection *conn) bhstr->bhstr_target_transfer_tag = 0xffffffff; bhstr->bhstr_initiator_task_tag = 0; /* XXX */ - bhstr->bhstr_cmdsn = 0; /* XXX */ + bhstr->bhstr_cmdsn = htonl(1); /* XXX */ bhstr->bhstr_expstatsn = htonl(conn->conn_statsn + 1); return (request); @@ -133,7 +133,7 @@ logout_new_request(struct connection *conn) bhslr->bhslr_reason = BHSLR_REASON_CLOSE_SESSION; bhslr->bhslr_reason |= 0x80; bhslr->bhslr_initiator_task_tag = 0; /* XXX */ - bhslr->bhslr_cmdsn = 0; /* XXX */ + bhslr->bhslr_cmdsn = htonl(1); /* XXX */ bhslr->bhslr_expstatsn = htonl(conn->conn_statsn + 1); return (request); Index: usr.sbin/iscsid/login.c 
===================================================================
--- usr.sbin/iscsid/login.c (revision 285383)
+++ usr.sbin/iscsid/login.c (working copy)
@@ -298,7 +298,7 @@ login_new_request(struct connection *conn, int csg
 	memcpy(bhslr->bhslr_isid, &conn->conn_isid, sizeof(bhslr->bhslr_isid));
 	bhslr->bhslr_tsih = htons(conn->conn_tsih);
 	bhslr->bhslr_initiator_task_tag = 0;
-	bhslr->bhslr_cmdsn = 0;
+	bhslr->bhslr_cmdsn = htonl(1);
 	bhslr->bhslr_expstatsn = htonl(conn->conn_statsn + 1);

 	return (request);

On 0521T1107, Andrea Brancatelli wrote:
> Hello,
>
> sorry for the delay, but the machine is in production and we don't have
> many opportunities for experimentation.
>
> This is the machine:
>
> [root@arsenico ~]# freebsd-version
> 10.1-RELEASE-p10
> [root@arsenico ~]#
>
> So, we started iscsid -d and this is the output we got:
>
> [root@arsenico ~]# while :; do iscsid -d; done
> iscsid: waiting for request from the kernel
> iscsid: not forking due to -d flag; will exit after servicing a single request
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0): connecting to 10.40.2.100
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0): setting session timeout to 60 seconds
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0): Capsicum capability mode enabled
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0): beginning Login phase; sending Login PDU
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0): key to send: "AuthMethod=None"
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0): key to send: "InitiatorName=iqn.1994-09.org.freebsd:arsenico.roma.schema31.it"
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0): key to send: "SessionType=Normal"
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0): key to send: "TargetName=iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0"
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0): exiting due to timeout
> iscsid: waiting for request from the kernel
> iscsid: not forking due to -d flag; will exit after servicing a single request
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1): connecting to 10.40.2.100
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1): setting session timeout to 60 seconds
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1): Capsicum capability mode enabled
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1): beginning Login phase; sending Login PDU
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1): key to send: "AuthMethod=None"
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1): key to send: "InitiatorName=iqn.1994-09.org.freebsd:arsenico.roma.schema31.it"
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1): key to send: "SessionType=Normal"
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1): key to send: "TargetName=iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1"
> iscsid: 10.40.2.100 (iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.1): exiting due to timeout
>
>
> If you need anything else, feel free to ask; we'll try to give you more
> information.
>
>
> On 05/03/15 18:08, Edward Tomasz Napierała wrote:
> > On 0305T1600, Andrea Brancatelli wrote:
> >> Hello everybody.
> >>
> >> We have a marvelous Dell Powervault TL2000 here, with the iSCSI bridge.
> >>
> >> Our "backup server", with Bacula running on it, used to be a 9.1 machine
> >> and used to work ok :-)
> >>
> >> Today we upgraded to FreeBSD 10 and tried to switch to the new iscsid, but
> > FreeBSD 10.what exactly? 10.1?
> >
> >> we weren't able to connect to the iSCSI device.
> >>
> >> This is what we found in the TL2000 log file:
> >>
> >>
> >> Mar 5 14:51:15 (UTC) bwcore[216]: iSCSI: Connection accepted from 10.40.3.1
> >> Mar 5 14:51:15 (UTC) bwcore[238]: iSCSI: New session from iqn.1994-09.org.freebsd:arsenico.roma.schema31.it to iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0
> >> Mar 5 14:51:15 (UTC) bwcore[238]: iSCSI: request to login to target iqn.1988-11.com.dell.20278e:eui.5000e111456c2002.0
> >> Mar 5 14:51:15 (UTC) bwcore[238]: iSCSI: Login request with illegal stage
> >> Mar 5 14:51:15 (UTC) bwcore[238]: iSCSI: Local socket closure
> >> Mar 5 14:52:45 (UTC) bwcore[216]: iSCSI: Connection accepted from 10.40.3.1
> > Could you do this: kill iscsid (pkill iscsid), then start iscsid like this:
> >
> > while :; do iscsid -d; done
> >
> > ... then, with this running in a terminal, try to add the session,
> > then copy/paste the output and mail me? Thanks!
> >
>
> --
> *Andrea Brancatelli
>
> Schema31 S.p.A.
> IT Manager*
>
> BO - FI - ROMA - PA
> ITALY
> Tel: +39.06.98358472
> Cell: +39.331.2488468
> Fax: +39.055.71880466
> A company of the SC31 ITALIA group
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"

From owner-freebsd-scsi@freebsd.org Tue Jul 14 13:00:34 2015
Return-Path:
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3F117999AA2 for ; Tue, 14 Jul 2015 13:00:34 +0000 (UTC) (envelope-from sagig@dev.mellanox.co.il)
Received: from mail-wi0-f172.google.com (mail-wi0-f172.google.com [209.85.212.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D0A00681
for ; Tue, 14 Jul 2015 13:00:33 +0000 (UTC) (envelope-from sagig@dev.mellanox.co.il)
Received: by widjy10 with SMTP id jy10so98711763wid.1 for ; Tue, 14 Jul 2015 06:00:26 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:to:cc:from:subject:message-id:date:user-agent :mime-version:content-type:content-transfer-encoding; bh=J4+11uM/fQZH9ZM0wf/7GRLETv0qDtlW6EG49aRHZSI=; b=k3ckuitxq4mpFLYAj/jrcF8bLwDzhyEN5w7EljLDtOyclLLZEi3VqL/9esi+LAbqBz tpT63fy5nRdgANUdNPvjobN74DPmtv4ZdqUKf7MPeeV1aeq9hPdejziCtsG3kO/Hc3NV iniXPv4olboZulgL612/wjBaPv25TWWsIpcgSJX5aKhS1Tzupw5kiEMwXTfwe/zWezPK mTIjCh6Wg/Wm7O3y4BKPhTwB67ychk1MCBhvsB6EiiJMx0Htl46JsK9dkRsiQp19iUjd uft5DRJPxMPwOHQkrmyNPrYsN0PvFRqbCDdR4vsDd9SziSH9VvcCgYybFIXJHYOccXP8 8rWA==
X-Gm-Message-State: ALoCoQkQfyf2f2Ee/tRU+bwMTQO9xBDOW7iKPYS9fLWebY3tlqWovMxJ3Jx3R8psq2K+5/SsgtjB
X-Received: by 10.194.235.169 with SMTP id un9mr18326703wjc.136.1436878826536; Tue, 14 Jul 2015 06:00:26 -0700 (PDT)
Received: from [10.223.0.123] ([193.47.165.251]) by smtp.googlemail.com with ESMTPSA id fm8sm3119702wib.9.2015.07.14.06.00.24 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 14 Jul 2015 06:00:25 -0700 (PDT)
To: freebsd-scsi@freebsd.org, freebsd-infiniband@freebsd.org, =?UTF-8?Q?Edward_Tomasz_Napiera=c5=82a?=
Cc: Hans Petter Selasky , Oded Shanoon , Max Gurtovoy , Oren Duer , Yaron Gepstein , "John F.
Kim" , Rich Cotton , Branko Cenanovic , Arik Kol , Boris Shpolyansky , Or Gerlitz
From: Sagi Grimberg
Subject: iSER initiator FreeBSD support [RFC]
Message-ID: <55A507E6.8060507@dev.mellanox.co.il>
Date: Tue, 14 Jul 2015 16:00:22 +0300
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Mailman-Approved-At: Tue, 14 Jul 2015 16:41:56 +0000
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 14 Jul 2015 13:00:34 -0000

Hi All,

We have updated our iser github repository [1] with the latest iser
driver (beta level), with additional support and fixes:
- Full Rebase on top of 11-current (r284921)
- Discovery (over iser) support
- Added spec compliance for iser assisted Login message
- HA support (automatic reconnects, traffic failover)
- Split iser from iscsi module (build as a separate module with
  dependency on iscsi)
- Several stability fixes

The current version of the iser driver is constantly tested in our
nightly regression systems and seems to pass several tests:
- Traffic tests
- Login/logout: basic functionality, non-existent target error flows, stress
- Compliance/FS: diskinfo, newfs, iozone
- HA: Session recovery, IO with port toggling
- Module load/unload

We would like to learn about other open-source test suites that we can
run in our labs.

Edward,
We feel that the code is ready for inclusion. Would you mind reviewing
our code and providing feedback?

[1] https://github.com/sagigrimberg/iser-freebsd
Branch: iser-rebase-11-current-r284921

Thanks,
Max Gurtovoy
Sagi Grimberg
Mellanox Technologies

From owner-freebsd-scsi@freebsd.org Wed Jul 15 04:32:08 2015
Return-Path:
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B7B0A99BFCF for ; Wed, 15 Jul 2015 04:32:08 +0000 (UTC) (envelope-from alex.burlyga.ietf@gmail.com)
Received: from mail-yk0-x22a.google.com (mail-yk0-x22a.google.com [IPv6:2607:f8b0:4002:c07::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 76FCD14CC for ; Wed, 15 Jul 2015 04:32:08 +0000 (UTC) (envelope-from alex.burlyga.ietf@gmail.com)
Received: by ykeo3 with SMTP id o3so26543270yke.0 for ; Tue, 14 Jul 2015 21:32:07 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=e+oJjuGYrPZyX93Y4mCniABxl3w9VEv+56Rtave9GGk=; b=NtAWNN1ugUJDP1hrIAJctfh1by0e83xE2KBV7gWpvlznlLy4HRjiQ0X0lb8r5U/oxx CJReflo9WUJYwCzaEXi79EOdvm2vcHb7j4weo731KLZgcP2ZuZb8ddHj+PGHhcSbJo+U 4W2GBd3clwgYpa39nIsBAbFlEhN6UVqCpRDP13p0jUHV/pNisI2X57zU3lOrIqrDzZi3 oxDwao3oWHpxYzd8EABsgxTQXdGqfz6HUWhygKEyzTgoqlH4vx6YKV0893lyR4hZVQ+T dhaZAQbk6v+AAAKhwztiWVEzFqRgBlvUux8iRi9116rjhJmmdDSX+n7Hwn9w1gMhs95s Xo5w==
MIME-Version: 1.0
X-Received: by 10.13.249.3 with SMTP id j3mr2166896ywf.170.1436934727635; Tue, 14 Jul 2015 21:32:07 -0700 (PDT)
Received: by 10.13.244.65 with HTTP; Tue, 14 Jul 2015 21:32:07 -0700 (PDT)
In-Reply-To: <55A507E6.8060507@dev.mellanox.co.il>
References: <55A507E6.8060507@dev.mellanox.co.il>
Date: Tue, 14 Jul 2015 21:32:07 -0700
Message-ID:
Subject: Re: iSER initiator FreeBSD support [RFC]
From: "alex.burlyga.ietf
 alex.burlyga.ietf"
To: Sagi Grimberg
Cc: freebsd-scsi@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 15 Jul 2015 04:32:08 -0000

This is pretty cool. One thing missing from the list is libiscsi, which
has a pretty comprehensive compliance test suite. You can find the source
here: https://github.com/sahlberg/libiscsi, including the iscsi-test-cu
tool, which generates XML output for integration with Jenkins.

fio is good too for performance or verification, and has a somewhat
better output format than iozone.

Alex

On Tue, Jul 14, 2015 at 6:00 AM, Sagi Grimberg wrote:
> Hi All,
>
> We have updated our iser github repository [1] with the latest iser
> driver (beta level), with additional support and fixes:
> - Full Rebase on top of 11-current (r284921)
> - Discovery (over iser) support
> - Added spec compliance for iser assisted Login message
> - HA support (automatic reconnects, traffic failover)
> - Split iser from iscsi module (build as a separate module with
> dependency on iscsi)
> - Several stability fixes
>
> The current version of the iser driver is constantly tested in our
> nightly regression systems and seems to pass several tests:
> - Traffic tests
> - Login/logout: basic functionality, non-existent target error flows, stress
> - Compliance/FS: diskinfo, newfs, iozone
> - HA: Session recovery, IO with port toggling
> - Module load/unload
>
> We would like to learn about other open-source test suites that we can
> run in our labs.
>
> Edward,
> We feel that the code is ready for inclusion. Would you mind reviewing
> our code and providing feedback?
>
> [1] https://github.com/sagigrimberg/iser-freebsd
> Branch: iser-rebase-11-current-r284921
>
> Thanks,
> Max Gurtovoy
> Sagi Grimberg
> Mellanox Technologies
> _______________________________________________
> freebsd-scsi@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"