From owner-freebsd-stable@FreeBSD.ORG Sat Sep 20 07:15:52 2014 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BFD8CB41; Sat, 20 Sep 2014 07:15:52 +0000 (UTC) Received: from aslan.scsiguy.com (www.scsiguy.com [70.89.174.89]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 95A6184D; Sat, 20 Sep 2014 07:15:52 +0000 (UTC) Received: from [192.168.0.61] (jt-mbp.home.scsiguy.org [192.168.0.61]) (authenticated bits=0) by aslan.scsiguy.com (8.14.9/8.14.9) with ESMTP id s8K7FkbQ062982 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Sat, 20 Sep 2014 01:15:47 -0600 (MDT) (envelope-from gibbs@scsiguy.com) Content-Type: text/plain; charset=windows-1252 Mime-Version: 1.0 (Mac OS X Mail 7.3 \(1878.6\)) Subject: Re: getting to 4K disk blocks in ZFS From: "Justin T. Gibbs" In-Reply-To: <9F3CB84B-E0F1-4343-8E37-4E4A9F252C52@scsiguy.com> Date: Sat, 20 Sep 2014 01:15:46 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: <05BBE286-A6B6-4F82-A06E-57002EFCA4DE@scsiguy.com> References: <540FF3C4.6010305@ish.com.au> <54114029.3060507@FreeBSD.org> <2128347.Ah5i0RTCvp@overcee.wemm.org> <541230F1.3060402@digiware.nl> <7D0869A9-C114-4C4F-877A-3FB26AD7737D@scsiguy.com> <9F3CB84B-E0F1-4343-8E37-4E4A9F252C52@scsiguy.com> To: Willem Jan Withagen X-Mailer: Apple Mail (2.1878.6) Cc: freebsd-stable@freebsd.org, Steven Hartland , Peter Wemm , Andriy Gapon , Aristedes Maniatis X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 20 Sep 2014 07:15:52 -0000 On Sep 20, 2014, at 12:21 AM, Justin T. Gibbs wrote: > On Sep 19, 2014, at 5:07 PM, Justin T. Gibbs = wrote: >=20 >> On Sep 11, 2014, at 5:32 PM, Willem Jan Withagen = wrote: >>=20 >>> On 11-9-2014 19:49, Peter Wemm wrote: >>>>> Another downside is 1/4th of uberblocks, 32 vs 128. >>>>> Also, automatic sector size detection works great for me and I've = never had >>>>> a need to manually tweak ashift. >>>>=20 >>>> Unfortunately, I have. Same drive connected two different ways: >>>>=20 >>>> da12 at mps1 bus 0 scbus1 target 11 lun 0 >>>> da12: Fixed Direct Access SCSI-6 device=20= >>>> da12: 600.000MB/s transfers >>>> da12: Command Queueing enabled >>>> da12: 3815447MB (7814037168 512 byte sectors: 255H 63S/T 486401C) >>>>=20 >>>> ada1 at ahcich1 bus 0 scbus3 target 0 lun 0 >>>> ada1: ATA-8 SATA 3.x device >>>> ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes) >>>> ada1: Command Queueing enabled >>>> ada1: 3815447MB (7814037168 512 byte sectors: 16H 63S/T 16383C) >>>> ada1: quirks=3D0x1<4K> >>>>=20 >>>> The 4k flag is missing when it's on the sas controller. The Ident = strings are=20 >>>> changed. >>>>=20 >>>> This came up elsewhere recently. >>>=20 >>> I reported the same fact for the new set of WD REDs I installed. >>> Seems that ada and da have different quirks tables... >>> So disks on SATA connectors on the motherboard are diagnosed as = being 4Kb. >>> The disks on my twa don't get the quirk and are considered 512b >>>=20 >>> =97WjW >>=20 >> I=92m surprised that we have to constantly add quirks. Are these = drives really failing to report their ata params correctly? Is there a = reason we don=92t currently utilize the ata params data (which is = already fetched for trim/unmap detection) to also set lbppbe (logical = block per physical block exponent) and lalba (lowest aligned lba)? We = may find that many of the existing quirks are unnecessary if we fix the = probe code. >>=20 >> =97 >> Justin >=20 >=20 > Here=92s a start at using the ata_params sector size data. I think it = needs to go a bit further and detect situations where the SCSI = controller=92s emulation gets the logical sector size wrong and fail to = attach - but I=92m out of steam for tonight. >=20 > Note that this patch is against Spectra Logic=92s = FreeBSD/stable/10=92ish tree, so may not apply cleanly for you as is. I = can rebase it against head tomorrow if there is interest and someone = else doesn=92t beat me to it. Hmm. Attachments aren=92t allowed? Here=92s the inlined diff. = Hopefully Apple Mail won=92t mangle it. =97 Justin --- //SpectraBSD/stable/sys/cam/ata/ata_all.c 2014-08-01 = 21:08:39.000000000 -0600 +++ /usr/home/justing/perforce/SpectraBSD/sys/cam/ata/ata_all.c = 2014-08-01 21:08:39.000000000 -0600 @@ -335,7 +335,25 @@ printf("<%s %s %s %s>", vendor, product, revision, fw); } =20 +/* (L)ogical (B)locks (P)er (P)hysical (B)lock (E)xponent. */ +uint8_t +ata_lbppbe(struct ata_params *ident_data) +{ + if ((ident_data->pss & ATA_PSS_VALID_MASK) !=3D = ATA_PSS_VALID_VALUE) + return (0); + return (ident_data->pss & ATA_PSS_LSPPS); +} + +/* (L)owest (A)ligned (L)ogical (B)lock (A)ddress. */ uint32_t +ata_lalba(struct ata_params *ident_data) +{ + if ((ident_data->lsalign & 0xc000) !=3D 0x4000) + return (0); + return (ident_data->lsalign & 0x3fff); +} + +uint32_t ata_logical_sector_size(struct ata_params *ident_data) { if ((ident_data->pss & ATA_PSS_VALID_MASK) =3D=3D = ATA_PSS_VALID_VALUE && @@ -346,28 +364,17 @@ return (512); } =20 -uint64_t +uint32_t ata_physical_sector_size(struct ata_params *ident_data) { - if ((ident_data->pss & ATA_PSS_VALID_MASK) =3D=3D = ATA_PSS_VALID_VALUE) { - if (ident_data->pss & ATA_PSS_MULTLS) { - return = ((uint64_t)ata_logical_sector_size(ident_data) * - (1 << (ident_data->pss & ATA_PSS_LSPPS))); - } else { - return = (uint64_t)ata_logical_sector_size(ident_data); - } - } - return (512); + return (ata_logical_sector_size(ident_data) << = ata_lbppbe(ident_data)); } =20 uint64_t ata_logical_sector_offset(struct ata_params *ident_data) { - if ((ident_data->lsalign & 0xc000) =3D=3D 0x4000) { - return ((uint64_t)ata_logical_sector_size(ident_data) * - (ident_data->lsalign & 0x3fff)); - } - return (0); + return ((uint64_t)ata_logical_sector_size(ident_data) * + ata_lalba(ident_data)); } =20 void --- //SpectraBSD/stable/sys/cam/ata/ata_all.h 2014-08-01 = 21:08:39.000000000 -0600 +++ /usr/home/justing/perforce/SpectraBSD/sys/cam/ata/ata_all.h = 2014-08-01 21:08:39.000000000 -0600 @@ -111,8 +111,10 @@ void ata_print_ident(struct ata_params *ident_data); void ata_print_ident_short(struct ata_params *ident_data); =20 +uint8_t ata_lbppbe(struct ata_params *ident_data); +uint32_t ata_lalba(struct ata_params *ident_data); uint32_t ata_logical_sector_size(struct ata_params *ident_data); -uint64_t ata_physical_sector_size(struct ata_params *ident_data); +uint32_t ata_physical_sector_size(struct ata_params *ident_data); uint64_t ata_logical_sector_offset(struct ata_params = *ident_data); =20 void ata_28bit_cmd(struct ccb_ataio *ataio, uint8_t cmd, uint8_t = features, --- //SpectraBSD/stable/sys/cam/scsi/scsi_da.c 2014-09-19 = 23:21:40.000000000 -0600 +++ /usr/home/justing/perforce/SpectraBSD/sys/cam/scsi/scsi_da.c = 2014-09-19 23:21:40.000000000 -0600 @@ -2039,6 +2041,24 @@ daschedule(periph); wakeup(&softc->disk->d_mediasize); if ((softc->flags & DA_FLAG_ANNOUNCED) =3D=3D 0) { + + /* + * Create our sysctl variables, now that we know + * we have successfully attached. + */ + /* increase the refcount */ + if (cam_periph_acquire(periph) =3D=3D CAM_REQ_CMP) { + taskqueue_enqueue(taskqueue_thread, + &softc->sysctl_task); + xpt_announce_periph(periph, + softc->announce_buf); + xpt_announce_quirks(periph, softc->quirks.flags, + DA_Q_BIT_STRING); + } else { + xpt_print(periph->path, "fatal error, " + "could not acquire reference count\n"); + } + softc->flags |=3D DA_FLAG_ANNOUNCED; cam_periph_unhold(periph); } else @@ -3360,26 +3381,6 @@ } } free(csio->data_ptr, M_SCSIDA); - if (softc->announce_buf[0] !=3D '\0' && - ((softc->flags & DA_FLAG_ANNOUNCED) =3D=3D 0)) { - /* - * Create our sysctl variables, now that we know - * we have successfully attached. - */ - /* increase the refcount */ - if (cam_periph_acquire(periph) =3D=3D = CAM_REQ_CMP) { - taskqueue_enqueue(taskqueue_thread, - &softc->sysctl_task); - xpt_announce_periph(periph, - = softc->announce_buf); - xpt_announce_quirks(periph, = softc->quirks.flags, - DA_Q_BIT_STRING); - } else { - xpt_print(periph->path, "fatal error, " - "could not acquire reference = count\n"); - } - } - /* We already probed the device. */ if (softc->flags & DA_FLAG_PROBED) { daprobedone(periph, done_ccb); @@ -3633,10 +3578,13 @@ } case DA_CCB_PROBE_ATA: { - int i; + struct scsi_read_capacity_data_long rcaplong; struct ata_params *ata_params; + struct disk_params *dp; int16_t *ptr; + int i; =20 + dp =3D &softc->params; ata_params =3D (struct ata_params *)csio->data_ptr; ptr =3D (uint16_t *)ata_params; =20 @@ -3658,6 +3606,27 @@ */ if (ata_params->media_rotation_rate =3D=3D 1) softc->sort_io_queue =3D 0; + + /* + * Perform our own emulation of read capacity = data + * rather than rely on the SCSI controller to = get + * it right. + */ + memset(&rcaplong, 0, sizeof(rcaplong)); + rcaplong.prot_lbppbe =3D ata_lbppbe(ata_params); + scsi_ulto2b(ata_lalba(ata_params), + rcaplong.lalba_lbp); + dasetgeom(periph, = ata_logical_sector_size(ata_params), + dp->sectors - 1, &rcaplong, = sizeof(rcaplong)); + snprintf(softc->announce_buf, + sizeof(softc->announce_buf), + "%juMB (%ju %u byte sectors: %dH %dS/T " + "%dC)", (uintmax_t) + (((uintmax_t)dp->secsize * + dp->sectors) / (1024*1024)), + (uintmax_t)dp->sectors, + dp->secsize, dp->heads, + dp->secs_per_track, dp->cylinders); } else { int error; error =3D daerror(done_ccb, CAM_RETRY_SELTO,