Date:      Sat, 1 May 1999 22:02:57 +0200
From:      Brad Knowles <brad@shub-internet.org>
To:        Greg Lehey <grog@lemis.com>, Matthew Jacob <mjacob@feral.com>
Cc:        FreeBSD current users <FreeBSD-current@FreeBSD.ORG>
Subject:   Re: Linux char devices (was: Porting Greg Lehey's rawio.c from FreeBSD to Linux...)
Message-ID:  <19990501220257.001997@relay.skynet.be>
In-Reply-To: <19990501152523.H80561@freebie.lemis.com>

On Sat, May 1, 1999, Greg Lehey <grog@lemis.com> wrote:

>On Friday, 30 April 1999 at 21:25:12 -0700, Matt Jacob wrote:

>> There are no raw devices in Linux. Linus is totally against them as
>> stupid. Linus has some good points about this, but it's still an, um,
>> interesting stance.
>
>It also makes it impossible for rawio to run accurately.  rawio
>measures device throughput, not system throughput.  Cache the data and
>you completely lose this ability (hey!  Under Vinum, an array of four
>floppies has a random read throughput of 50 MB/s!).

    It also makes it rather hard for programs like Oracle or Sybase to
run and use raw devices for their databases, so you instead wind up
polluting your buffer cache no matter what.  As I understand it, the
Oracle guys finally pounded into Linus's brain the importance of real raw
device access, and they are starting to take some steps towards fixing
this problem in 2.2.x.


    Anyway, a friend of mine just yesterday actually got rawio.c to work
under Linux.  He ended up having to make some changes to rawio, as well
as some changes to the way mmap() is implemented.  Let me quote his
entire message on this subject:

>On Fri, Apr 30, 1999 at 03:37:03PM +0200, Brad Knowles wrote:
>>     Anyway, if there's anyone out there with any experience in porting
>> programs that do low-level disk I/O, I'd appreciate it if you could take
>> a look at this program and give me some pointers on what it would take to
>> get it to compile and run under Linux (specifically, Debian Linux with
>> kernel 2.2.6).
>
><rant>
>
>Ok, I decided to work on this a bit and I had no idea the trouble I'd run
>into....  I ported rawio to Red Hat Linux 5.2 pretty easily--it took about
>fifteen minutes.  Here's a summary of changes:
>
>a few more #includes (sys/ioctl.h, among others)
>swap out the BSD disklabel.h for linux/genhd.h and rename struct disklabel 
>	to struct bsd_disklabel (from linux/genhd.h)
>add srandomdev() from FreeBSD sources
>delete MAP_INHERIT from mmap()--no exec's in the program, so not useful--
>	good thing, because linux doesn't support it
>
>This wasn't too bad.  Then I tried to run it and got the "Can't mmap shared
>memory: Invalid argument" message.  Aargh.  It works fine under OpenBSD.
>After perusing the mmap() man pages, I think that everything's ok. I run
>rawio with strace and find this:
>
>mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = \
>	0x40100000
>...
>mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0) = -1 \
>	EINVAL
>_exit(1)
>
>I think, "Huh.  mmap() with the MAP_PRIVATE flag works fine.  mmap() with
>the MAP_SHARED flag returns EINVAL.  That's dumb."  I check the man page
>again and look at the reasons for EINVAL--nothing different if it's private
>or shared.  So, I figure it is a kernel bug and check out the kernel source
>for mmap.  Sure enough, in some function do_mmap():
>
>if (file != NULL) {
>...
>} else if ((flags & MAP_TYPE) != MAP_PRIVATE)
>	return -EINVAL;
>
>So (on a development/test/guinea pig/sacrificial lamb machine) I remove this
>offensive bit of code, recompile the kernel, reboot, and voila, rawio can
>actually communicate between the parent and the child processes,  just like
>it should have been able to in the first place.
>
>Of course, now I wonder why this code was there in the first place....  I'm
>not so expert a kernel hacker that I can definitively say that this is a
>ridiculous bit of non-logic, but it sure did interfere with rawio....
>
></rant>
>
>Anyway, I half ported it to Linux and half ported Linux to the rest of the
>world.  You can get the patches at:
>
>ftp://mvhs1.mbhs.edu/pub/linux/linux-2.2.5-mmap.patch	(kernel patch)
>ftp://mvhs1.mbhs.edu/pub/linux/rawio-linux.patch	(patch to author src)
>ftp://mvhs1.mbhs.edu/pub/linux/rawio-linux.tar.gz	(patched srcs)
>ftp://mvhs1.mbhs.edu/pub/linux/rawio.tar.gz		(author src)
>
>Now you can run your benchmark tests against buffered device files to your
>heart's content.  <sniff>  :)
>
>P.S.  Keep in mind that you'll have to specify the device size manually.  On
>BSD, rawio can read the disklabel and get the size from there.  I didn't go
>to the trouble of reimplementing the device size for ext2fs labels.  So,
>unless you are running rawio under Linux against a BSD disk, the disklabel
>code won't help you much.  ;)
>-- 
>Matt Shibla  mshibla@mbhs.edu   (301) 649-2880 (vox)  (301) 649-2830 (fax)
>       Maryland Virtual High School   Montgomery Blair High School
>          51 University Boulevard East Silver Spring, MD  20901
>PGP Key:  http://pgp5.ai.mit.edu:11371/pks/lookup?op=get&search=0x8918CBCD


    I don't suppose anyone here knows why this piece of do_mmap() is
written this particular way, and if there's any harm in removing that code?
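
    For anyone who wants to check a given kernel without building
rawio, the failing call can be reduced to a small test.  This is only a
sketch of the pattern described in the quoted message (an anonymous
MAP_SHARED mapping used to pass data from a forked child back to its
parent); shared_anon_roundtrip() is a hypothetical name and is not taken
from the rawio sources:

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of rawio: map one int as shared
 * anonymous memory, let a forked child store `value` in it, and return
 * what the parent reads back afterwards.
 * Returns -1 if mmap(), fork(), or waitpid() fails.
 */
int shared_anon_roundtrip(int value)
{
    assert(value != -1);    /* -1 is reserved for the error return */

    /* Same flag combination that returned EINVAL in the strace output. */
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = value;    /* child stores into the shared page */
        _exit(0);
    }

    int status;
    int seen = -1;
    /* The parent observes the child's store only if the mapping is
     * really shared across fork(). */
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

    On a kernel that rejects the mapping, the function returns -1 right
at the mmap() step, which should match the EINVAL seen under strace.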


    TIA!

-- 
Brad Knowles <brad@shub-internet.org> <http://www.shub-internet.org/brad/>
    <http://wwwkeys.pgp.net:11371/pks/lookup?op=get&search=0xE38CCEF1>

Your mouse has moved.   Windows NT must be restarted for the change to
take effect.   Reboot now?  [ OK ]






