Date: Fri, 10 Dec 2010 18:26:21 +0300
From: Lev Serebryakov <lev@serebryakov.spb.ru>
To: Alexander Motin <mav@FreeBSD.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Where userland read/write requests, whcih is larger than MAXPHYS, are splitted?
Message-ID: <242059106.20101210182621@serebryakov.spb.ru>
In-Reply-To: <4D023D00.10301@FreeBSD.org>
References: <mailpost.1291988544.5326917.42118.mailing.freebsd.hackers@FreeBSD.cs.nctu.edu.tw> <4D023D00.10301@FreeBSD.org>

Hello, Alexander.

You wrote on 10 December 2010, 17:45:20:

>> I'm digging through the GEOM/IO code and cannot find the place where
>> requests from userland to read more than MAXPHYS bytes are split
>> into several "struct bio"s.

>> It seems that these child requests are issued one by one, not in
>> parallel, am I right? Why? It breaks parallelism when the
>> underlying GEOM can process several requests simultaneously.

> AFAIK requests from userland are first broken into MAXPHYS-sized pieces
> by physio() before entering GEOM. Requests are indeed serialized there, I
> suppose to limit the KVA that a thread can harvest, but IMHO it could be
> reconsidered.

It is a good idea; maybe have a GEOM flag for this? For example, any
stripe/geom3/geom5 code can process a series of reads issued together
much faster than the same reads issued sequentially, if userland wants
to read big blocks, bigger than the stripe size. And a small stripe size
is a bad idea due to the high fixed cost of a transaction.

Now, when an application reads files on RAID5 with big blocks (say,
read() is called with a 1 MB buffer), the RAID5 geom sees read requests
of 128 KB in size, one by one. And with a stripe size of 128 KB, it
performs like a single disk :(

I can add pre-read for full-sized reads, but that is not a generic
solution. Sending the BIOs of one (logical/userland) read/write request
without awaiting their completion is a generic solution.

> One more split happens (when needed) in the geom_disk module to honor the
> disk driver's maximal I/O size. There is no serialization. Most ATA/SATA
> drivers in 8-STABLE support I/O up to at least min(512K, MAXPHYS), which
> is 128K by default. Many SCSI drivers are still limited by DFLTPHYS,
> which is 64K.

Yep, that is what I have seen in my investigations.

-- 
// Black Lion AKA Lev Serebryakov <lev@serebryakov.spb.ru>
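
The effect described in this message can be sketched with a small model. This is only an illustration under stated assumptions, not FreeBSD code: the helper names (split_request, disks_touched, max_parallelism) are hypothetical, the layout is plain striping with parity ignored, and MAXPHYS is taken as the 128K default mentioned in the thread. It splits one userland request into MAXPHYS-sized pieces and counts how many disks can be busy at once, depending on how many pieces are outstanding.

```python
MAXPHYS = 128 * 1024  # assumed default, as quoted in the thread


def split_request(offset, length, max_size=MAXPHYS):
    """Split one userland request into pieces of at most max_size bytes."""
    chunks = []
    end = offset + length
    while offset < end:
        size = min(max_size, end - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


def disks_touched(offset, length, stripe_size, ndisks):
    """Disks covered by a byte range on a plain striped layout (parity ignored)."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return {stripe % ndisks for stripe in range(first, last + 1)}


def max_parallelism(chunks, stripe_size, ndisks, in_flight):
    """Largest number of distinct disks busy when in_flight chunks are outstanding."""
    best = 0
    for i in range(0, len(chunks), in_flight):
        busy = set()
        for off, size in chunks[i:i + in_flight]:
            busy |= disks_touched(off, size, stripe_size, ndisks)
        best = max(best, len(busy))
    return best
```

With a 1 MB read, a 128K stripe, and four disks in this model, issuing the pieces one at a time (in_flight=1) never keeps more than one disk busy, while issuing all eight pieces without waiting (in_flight=8) can keep all four busy.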