Date: Wed, 5 Aug 2015 13:46:15 +0000 (UTC)
From: Alexander Motin <mav@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r286320 - head/sys/cam/ctl
Message-ID: <201508051346.t75DkFrY042506@repo.freebsd.org>

Author: mav
Date: Wed Aug  5 13:46:15 2015
New Revision: 286320
URL: https://svnweb.freebsd.org/changeset/base/286320

Log:
  Issue all reads of single XCOPY segment simultaneously.

  During vMotion and Clone, VMware by default runs multiple sequential 4MB
  XCOPY requests at the same time.  If CTL issues reads sequentially in 1MB
  chunks for each XCOPY command, reads from different commands are not
  detected as sequential by the serseq option code and are allowed to
  execute simultaneously.  Such a read pattern confused the ZFS prefetcher,
  causing suboptimal disk access.  Issuing all reads at the same time makes
  the serseq code work properly, serializing reads both within each XCOPY
  command and between them.

  My tests with a ZFS pool of 14 disks in RAID10 show prefetcher efficiency
  improved from 37% to 99.7%, copying speed improved by 10-60%, and average
  read latency reduced by a factor of two on the HDD layer and by a factor
  of five on the zvol layer.

  MFC after:	2 weeks
  Sponsored by:	iXsystems, Inc.

Modified:
  head/sys/cam/ctl/ctl_tpc.c

Modified: head/sys/cam/ctl/ctl_tpc.c
==============================================================================
--- head/sys/cam/ctl/ctl_tpc.c	Wed Aug  5 13:10:13 2015	(r286319)
+++ head/sys/cam/ctl/ctl_tpc.c	Wed Aug  5 13:46:15 2015	(r286320)
@@ -817,7 +817,7 @@ tpc_process_b2b(struct tpc_list *list)
 	struct scsi_ec_segment_b2b *seg;
 	struct scsi_ec_cscd_dtsp *sdstp, *ddstp;
 	struct tpc_io *tior, *tiow;
-	struct runl run, *prun;
+	struct runl run;
 	uint64_t sl, dl;
 	off_t srclba, dstlba, numbytes, donebytes, roundbytes;
 	int numlba;
@@ -889,8 +889,7 @@ tpc_process_b2b(struct tpc_list *list)
 	list->segsectors = numbytes / dstblock;
 	donebytes = 0;
 	TAILQ_INIT(&run);
-	prun = &run;
-	list->tbdio = 1;
+	list->tbdio = 0;
 	while (donebytes < numbytes) {
 		roundbytes = numbytes - donebytes;
 		if (roundbytes > TPC_MAX_IO_SIZE) {
@@ -942,8 +941,8 @@ tpc_process_b2b(struct tpc_list *list)
 		tiow->io->io_hdr.ctl_private[CTL_PRIV_FRONTEND].ptr = tiow;
 
 		TAILQ_INSERT_TAIL(&tior->run, tiow, rlinks);
-		TAILQ_INSERT_TAIL(prun, tior, rlinks);
-		prun = &tior->run;
+		TAILQ_INSERT_TAIL(&run, tior, rlinks);
+		list->tbdio++;
 		donebytes += roundbytes;
 		srclba += roundbytes / srcblock;
 		dstlba += roundbytes / dstblock;
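
The effect of the run-list change can be illustrated outside the kernel. What follows is a minimal userland sketch, not the CTL code: struct op, struct segment, sketch_build() and sketch_initial_reads() are hypothetical names invented for this illustration, and SKETCH_MAX_IO is assumed to be 1MB, matching the "1MB chunks" mentioned in the log. With chained set, each read is queued on the previous read's run list, as the removed prun lines did, so only the first read sits on the top-level list. With chained clear, every read goes on the top-level list and tbdio counts them, as in the new code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define SKETCH_MAX_IO	(1024 * 1024)	/* assumed stand-in for TPC_MAX_IO_SIZE */
#define SKETCH_MAX_OPS	16		/* arbitrary cap for this sketch */

struct op;
TAILQ_HEAD(oplist, op);

/* Hypothetical, simplified stand-in for struct tpc_io. */
struct op {
	int		is_read;
	struct oplist	run;	/* ops to start when this one completes */
	TAILQ_ENTRY(op)	links;
};

/* Hypothetical, simplified stand-in for the per-segment list state. */
struct segment {
	struct op	reads[SKETCH_MAX_OPS];
	struct op	writes[SKETCH_MAX_OPS];
	struct oplist	run;	/* ops started immediately */
	int		tbdio;	/* I/Os counted as outstanding at start */
};

/*
 * Split numbytes into chunks and queue one read (followed by its write)
 * per chunk.  Returns the number of chunks queued.
 */
int
sketch_build(struct segment *seg, size_t numbytes, int chained)
{
	struct oplist *prun;
	struct op *r, *w;
	size_t done, round;
	int n;

	assert(seg != NULL);
	TAILQ_INIT(&seg->run);
	prun = &seg->run;
	seg->tbdio = chained ? 1 : 0;
	done = 0;
	n = 0;
	while (done < numbytes && n < SKETCH_MAX_OPS) {
		round = numbytes - done;
		if (round > SKETCH_MAX_IO)
			round = SKETCH_MAX_IO;
		r = &seg->reads[n];
		w = &seg->writes[n];
		r->is_read = 1;
		w->is_read = 0;
		TAILQ_INIT(&r->run);
		TAILQ_INIT(&w->run);
		/* The write always waits for its own read. */
		TAILQ_INSERT_TAIL(&r->run, w, links);
		if (chained) {
			/* Old layout: next read waits for the previous read. */
			TAILQ_INSERT_TAIL(prun, r, links);
			prun = &r->run;
		} else {
			/* New layout: every read is started up front. */
			TAILQ_INSERT_TAIL(&seg->run, r, links);
			seg->tbdio++;
		}
		done += round;
		n++;
	}
	return (n);
}

/* Count the reads that sit on the top-level (immediately started) list. */
int
sketch_initial_reads(struct segment *seg)
{
	struct op *o;
	int n = 0;

	TAILQ_FOREACH(o, &seg->run, links)
		if (o->is_read)
			n++;
	return (n);
}
```

For a 4MB segment both layouts produce four read/write pairs. Under the chained layout sketch_initial_reads() reports one read and tbdio starts at 1; under the new layout it reports four reads and tbdio is 4. This mirrors the log's description of all reads of a segment being issued together so that the serseq code sees them as one sequential stream.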