From: Nikolay Denev
Date: Thu, 4 Oct 2012 18:17:54 +0300
To: Chuck Tuffli
Cc: "freebsd-stable@freebsd.org"
Subject: Re: CAM Target Layer and Linux (continued)

On Oct 4, 2012, at 2:52 AM, Chuck Tuffli wrote:

> On Tue, Oct 2, 2012 at 3:03 AM, Nikolay Denev wrote:
>>
>> On Sep 27, 2012, at 6:33 PM, Nikolay Denev wrote:
>>
>>> Hi All,
>>>
>>> With the help of Chuck Tuffli, I'm now able to use CTL to export a zvol over FC to a Linux host:
>>>
>>> LUN Backend       Size (Blocks)   BS Serial Number    Device ID
>>>   0 block            4185915392  512 FBSDZFS001       ORA_ASM_01
>>>       lun_type=0
>>>       num_threads=14
>>>       file=/dev/zvol/tank/oracle_asm_01
>>>   1 block            4185915392  512 FBSDZFS002       ORA_ASM_02
>>>       lun_type=0
>>>       num_threads=14
>>>       file=/dev/zvol/tank/oracle_asm_02
>>>   2 block            4185915392  512 FBSDZFS003       ORA_ASM_03
>>>       lun_type=0
>>>       num_threads=14
>>>       file=/dev/zvol/tank/oracle_asm_03
>>>   3 block            4185915392  512 FBSDZFS004       ORA_ASM_04
>>>       lun_type=0
>>>       num_threads=14
>>>       file=/dev/zvol/tank/oracle_asm_04
>>>
>>> Then we ran some tests using Oracle's ORION benchmark tool from the Linux host.
>>> We ran one test which passed successfully,
>>> then I've just disabled zfs prefetch -> "vfs.zfs.prefetch_disable=1"
>>> and rerun the test, which failed due to this error.
>>>
>>> On the FreeBSD side:
>>>
>>> (0:3:0:1): READ(10). CDB: 28 0 84 f9 58 0 0 4 0 0
>>> (0:3:0:1): Tag: 0x116220, Type: 1
>>> (0:3:0:1): CTL Status: SCSI Error
>>> (0:3:0:1): SCSI Status: Check Condition
>>> (0:3:0:1): SCSI sense: NOT READY asc:4b,0 (Data phase error)
> ...
>> After a whole day of orion tests without problems, we started an Oracle ASM instance from the Linux host and
>> again got an error, this time it was WRITE error :
>>
>> (0:3:0:3): WRITE(10). CDB: 2a 0 1 5b 10 0 0 4 0 0
>> (0:3:0:3): Tag: 0x110940, Type: 1
>> (0:3:0:3): CTL Status: SCSI Error
>> (0:3:0:3): SCSI Status: Check Condition
>> (0:3:0:3): SCSI sense: NOT READY asc:4b,0 (Data phase error)
>>
>> I've tried to track down this "Data phase error" in the CTL code and it looks like it is something related to the isp(4) driver:
>
> This would have been my first guess if there had been something in the
> logs from isp, but since there wasn't, it's hard to tell. I been
> running orion for ~3hrs now with a different FC driver + an analyzer
> but haven't seen this problem.
>
> Would it be possible to stick some prints in default clause of the
> ctlfedone() to confirm if this is front or back end problem?
> Especially interesting would be the value of done_ccb->ccb_h.status.
>
> ---chuck

I have added the printfs like this:

--- sys/cam/ctl/scsi_ctl.c.orig	2012-10-04 10:52:57.413144029 +0200
+++ sys/cam/ctl/scsi_ctl.c	2012-10-04 11:23:35.501143149 +0200
@@ -1415,6 +1415,7 @@
 				 */
 				io->io_hdr.port_status = 0xbad1;
 				ctl_set_data_phase_error(&io->scsiio);
+				printf("XXX: done_ccb->ccb_h.status = %lu\n", (long unsigned int)done_ccb->ccb_h.status);
 				/*
 				 * XXX KDM figure out residual.
 				 */

However, I've postponed the tests because the pool got nearly full.
The zvols had probably become badly fragmented: they were extremely
slow to access and generated SCSI timeout and abort command errors on
the Linux host. Even deleting them took around 40 minutes.

There was also a bad interaction between accessing the zvols over CAM
and using an NFS share from the same host at the same time, which
brought all disk I/O on the pool almost to a stop.

I will create a new zvol tomorrow and retest with the printf enabled
while the machine is idle (no NFS activity).
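
As an aside, the two failing CDBs quoted above can be decoded by hand
to see which blocks were involved. The following is only a minimal
sketch, assuming the standard 10-byte READ(10)/WRITE(10) layout
(opcode in byte 0, big-endian LBA in bytes 2-5, big-endian transfer
length in bytes 7-8). The struct and function names are hypothetical
and for illustration only; they are not taken from the CTL or CAM
sources.

```c
#include <stdint.h>

/* Hypothetical container for the fields of a READ(10)/WRITE(10) CDB. */
struct rw10_fields {
	uint8_t  opcode;	/* 0x28 = READ(10), 0x2a = WRITE(10) */
	uint32_t lba;		/* starting logical block address */
	uint16_t nblocks;	/* transfer length in blocks */
};

/*
 * Decode a 10-byte CDB, assuming the standard layout:
 * byte 0 = opcode, bytes 2..5 = LBA (big-endian),
 * bytes 7..8 = transfer length (big-endian).
 */
static struct rw10_fields
decode_rw10(const uint8_t cdb[10])
{
	struct rw10_fields f;

	f.opcode = cdb[0];
	f.lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
	    ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
	f.nblocks = (uint16_t)((cdb[7] << 8) | cdb[8]);
	return (f);
}
```

Fed the bytes from the logs, "28 0 84 f9 58 0 0 4 0 0" decodes to a
READ(10) of 1024 blocks starting at LBA 2230933504, and
"2a 0 1 5b 10 0 0 4 0 0" to a WRITE(10) of 1024 blocks starting at
LBA 22745088. With the 512-byte block size shown in the LUN listing,
each is a 512 KiB transfer.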