Date: Wed, 17 Mar 2010 14:50:35 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Thiago Damas <tdamas@gmail.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: ATA 4K sector issues
Message-ID: <201003172150.o2HLoZxW070346@apollo.backplane.com>
References: <alpine.BSF.2.00.1003171114280.74067@mignon.ki.iif.hu>
 <86tysf58a2.fsf@ds4.des.no>
 <alpine.BSF.2.00.1003171652260.74067@mignon.ki.iif.hu>
 <f8e3d83f1003171034m5e75eae4r5e8b31d88d361d3b@mail.gmail.com>
 <367b2c981003171112n785ea9d4q21d00b533819ca67@mail.gmail.com>
 <f8e3d83f1003171117k20d553b7y7ce4c3c8ed2f5c96@mail.gmail.com>
 <201003172023.o2HKNNbj069321@apollo.backplane.com>
 <201003172111.o2HLBIgJ069873@apollo.backplane.com>
 <f8e3d83f1003171417s60196803ra4884dae487edb9a@mail.gmail.com>

: There is a sysctl, md_compress, that I turned out in my tests, but not
:working as expected.
: Why using gnop -S 4096 works well?
:
:Thiago

You are setting the sector size to 4K with gnop -S 4096, so presumably
ZFS will not do any fragmented writes smaller than that. I'm not sure
why that would matter except possibly for ZIL writes. In the case of
the ZIL, if ZFS is using sector-sized writes (I don't know what it
actually uses), then setting the sector size to 4K would be more
efficient, as the drive would not have to issue a read-before-write
when the disk cache is flushed after the ZIL write.

One important aspect of having the filesystem use a larger logical
block size, such as 4K or 16K or 32K, is that the filesystem itself
knows whether any trailing data is garbage or not and will avoid doing
a read-before-write when writing small amounts of data. Most of the
time, if the filesystem is allocating space from its blockmap, it knows
the trailing data in the block is garbage and will zero it instead of
performing a read-before-write.

Also, the buffer cache covers hundreds of megabytes versus the hard
drive cache, which is typically only 8-64MB (though the OCZ Colossus
has 128MB). This means the kernel will do a much better job of
write-combining than the drive. The drive has no knowledge of what is
garbage and what is not, so the moment this work moves out of the drive
and into the kernel you reap rewards on these drives with larger
physical sectors.

-Matt
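To make the read-before-write cost concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: a drive with 4096-byte physical sectors, and two hypothetical helpers (`rmw_sectors` and `pad_block`) that are not part of ZFS, gnop, or any real driver. It counts how many physical sectors the drive would have to read back before it could complete a write, and shows how zero-padding a block whose tail is known to be garbage removes that read.

```python
PHYS_SECTOR = 4096  # assumed physical sector size in bytes


def rmw_sectors(offset, length, phys=PHYS_SECTOR):
    """Count physical sectors the drive must read before this write.

    A physical sector needs a read-before-write only when the write
    covers part of it; fully covered sectors are simply overwritten.
    """
    if length <= 0:
        return 0
    end = offset + length
    first = offset // phys          # first physical sector touched
    last = (end - 1) // phys        # last physical sector touched
    head_partial = offset % phys != 0
    tail_partial = end % phys != 0
    if first == last:
        # The whole write lands in one sector: one read if it is partial.
        return 1 if (head_partial or tail_partial) else 0
    return int(head_partial) + int(tail_partial)


def pad_block(data, block=PHYS_SECTOR):
    """Zero-fill data up to a multiple of the block size.

    Models a filesystem that knows the rest of the block is garbage
    and zeroes it rather than preserving the old contents.
    """
    remainder = len(data) % block
    if remainder == 0:
        return data
    return data + bytes(block - remainder)


if __name__ == "__main__":
    small = b"x" * 512
    print(rmw_sectors(0, len(small)))             # 512-byte write
    print(rmw_sectors(0, len(pad_block(small))))  # same write, zero-padded
    print(rmw_sectors(512, 4096))                 # misaligned 4K write
```

In this model a bare 512-byte write forces one sector read, the zero-padded version forces none, and a 4K write that starts 512 bytes into a sector forces two. The drive cannot apply the padding trick by itself, because it has no way to know that the rest of the sector is garbage.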