Date: Tue, 05 Mar 2002 15:26:35 -0800
From: Lars Eggert <larse@ISI.EDU>
To: Zhihui Zhang <zzhang@cs.binghamton.edu>
Cc: "Rogier R. Mulhuijzen" <drwilco@drwilco.net>, Julian Elischer <julian@elischer.org>, freebsd-hackers@FreeBSD.ORG
Subject: Re: A weird disk behaviour
Message-ID: <3C85542B.5060100@isi.edu>
References: <Pine.SOL.4.21.0203051818510.13181-100000@onyx>
Zhihui Zhang wrote:
> Several times slower! The point is that writing less data performs
> worse. So I call it weird.
Huh? You originally said:
> (1) Write each block fully and sequentially, ie. 8192 bytes.
>
> (2) I still write these blocks sequentially, but for each block I only
> write part of it.
...
> I found that the performance of (2) is several times better than the
> performance of (1). Can anyone explain to me why this is the case?
If (2) is better than (1), then writing *less* data is faster. Which is
it, now?
Lars
> -Zhihui
>
> On Tue, 5 Mar 2002, Lars Eggert wrote:
>
>
>>Zhihui Zhang wrote:
>>
>>>Well, the core of my program is as follows (RANDOM(x) returns a value
>>>between 0 and x):
>>>
>>>    blocksize = 8192;
>>>    write_size_low = 512;
>>>
>>>    time(&time1);
>>>    for (i = 0; i < write_count; i++) {
>>>        write_size = write_size_low +
>>>            RANDOM(write_size_high - write_size_low);
>>>        write_size = roundup(write_size, DEV_BSIZE);
>>>        if (testcase == 1)
>>>            write_size = blocksize;
>>>        write_block(rawfd, sectorno, buf, write_size);
>>>        sectorno += blocksize / DEV_BSIZE;
>>>    }
>>>    time(&time2);
>>>
>>>If testcase is one, then the time elapsed (time2 - time1) is much less.
>>>
>>How "much less" in milliseconds?
>>
>>Also, in your original mail, you said you had 15,000 of these 8K blocks,
>>which is only 120MB or so. Use 150,000 or 1,500,000 and check your
>>results then.
>>
>>Lars
>>
>>
>>
>>
>>>-Zhihui
>>>
>>>On Tue, 5 Mar 2002, Lars Eggert wrote:
>>>
>>>
>>>
>>>>I agree that it's probably caching at some level. You're only writing
>>>>about 120MB of data (and half that in your second case). Bump these to a
>>>>couple of GB and see what happens.
>>>>
>>>>Also, could you post your actual measurements?
>>>>
>>>>Lars
>>>>
>>>>
>>>>Zhihui Zhang wrote:
>>>>
>>>>
>>>>>The machine has 128M memory. I am doing physical I/O one block at a time,
>>>>>so there should be no memory copy.
>>>>>
>>>>>-Zhihui
>>>>>
>>>>>On Tue, 5 Mar 2002, Rogier R. Mulhuijzen wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>At 16:03 5-3-2002 -0500, Zhihui Zhang wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>On Tue, 5 Mar 2002, Julian Elischer wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>more writes fit in the disk's write cache?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>For (1), it writes 15000 * 8192 bytes in all. For (2), it writes 15000 *
>>>>>>>4096 bytes in all (assuming the random number distributes evenly between 0
>>>>>>>and 8192). So your suggestion does not make sense to me.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>How large is your buffercache? It might be that the 15000 * ~4096 roughly
>>>>>>matches with your cache, and 15000 * 8192 doesn't.
>>>>>>
>>>>>>Case (1) would require a lot more physical IO in that case than case (2)
>>>>>>would require.
>>>>>>
>>>>>> Doc
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>-Zhihui
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>On Tue, 5 Mar 2002, Zhihui Zhang wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>I am doing some raw I/O test on a seagate SCSI disk running FreeBSD 4.5.
>>>>>>>>>This situation is like this:
>>>>>>>>>
>>>>>>>>>+-----+----+----+----+----+----+----+----+----+----+---+------
>>>>>>>>>| | | | | | | | | | | | ....
>>>>>>>>>+-----+----+----+----+----+----+----+----+----+----+---+------
>>>>>>>>>
>>>>>>>>>Each block is of fixed size, say 8192 bytes. Now I have a user program
>>>>>>>>>writing each contiguously laid out block sequentially using /dev/daxxx
>>>>>>>>>interface. There are a lot of them, say 15000. I write the blocks in two
>>>>>>>>>ways (the data used in writing are garbage):
>>>>>>>>>
>>>>>>>>>(1) Write each block fully and sequentially, ie. 8192 bytes.
>>>>>>>>>
>>>>>>>>>(2) I still write these blocks sequentially, but for each block I only
>>>>>>>>>write part of it. Exactly how many bytes are written inside each block is
>>>>>>>>>determined by a random number between 512 and 8192 bytes (rounded up to a
>>>>>>>>>multiple of 512 bytes).
>>>>>>>>>
>>>>>>>>>I found that the performance of (2) is several times better than the
>>>>>>>>>performance of (1). Can anyone explain to me why this is the case?
>>>>>>>>>
>>>>>>>>>Thanks for any suggestions or hints.
>>>>>>>>>
>>>>>>>>>-Zhihui
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>To Unsubscribe: send mail to majordomo@FreeBSD.org
>>>>>>>>>with "unsubscribe freebsd-hackers" in the body of the message
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>
>>>>--
>>>>Lars Eggert <larse@isi.edu> Information Sciences Institute
>>>>http://www.isi.edu/larse/ University of Southern California
>>>>
>>>>
>>>>
>>
>>
>>--
>>Lars Eggert <larse@isi.edu> Information Sciences Institute
>>http://www.isi.edu/larse/ University of Southern California
>>
>>
--
Lars Eggert <larse@isi.edu> Information Sciences Institute
http://www.isi.edu/larse/ University of Southern California
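
The figures discussed in this thread can be checked with a short C sketch. It is not taken from the original test program: the helper names total_bytes, round_up, and elapsed_ms are hypothetical, and gettimeofday() is assumed here in place of the time() calls in the quoted loop, because time() only resolves whole seconds and cannot answer the "how much less in milliseconds" question.

```c
#include <stdint.h>
#include <sys/time.h>

/* Total bytes transferred when `count` writes of `bytes_per_write` are issued.
 * Hypothetical helper, used only to check the sizes quoted in the thread. */
static uint64_t total_bytes(uint64_t count, uint64_t bytes_per_write)
{
    return count * bytes_per_write;
}

/* Round x up to the next multiple of y, the same idea as the roundup()
 * call in the quoted loop (y would be DEV_BSIZE, i.e. 512, there). */
static uint64_t round_up(uint64_t x, uint64_t y)
{
    return ((x + y - 1) / y) * y;
}

/* Elapsed time in milliseconds between two gettimeofday() samples. */
static double elapsed_ms(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1000.0 +
           (end->tv_usec - start->tv_usec) / 1000.0;
}
```

With these helpers, total_bytes(15000, 8192) comes to 122,880,000 bytes (about 117 MiB) for case (1), which is less than the 128M of memory mentioned earlier in the thread. To time the loop, one would call gettimeofday() into a struct timeval before and after it and pass both samples to elapsed_ms().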
