Date:      Wed, 13 Dec 2006 14:15:40 -0500
From:      Bill Moran <wmoran@collaborativefusion.com>
To:        Peter Jeremy <peterjeremy@optushome.com.au>
Cc:        hackers@freebsd.org, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: shmmax tops out at 2G?
Message-ID:  <20061213141540.6909ec4f.wmoran@collaborativefusion.com>
In-Reply-To: <20061213183431.GC888@turion.vk2pj.dyndns.org>
References:  <20061212121714.a3fbb61b.wmoran@collaborativefusion.com> <20061213105021.c7d5b274.wmoran@collaborativefusion.com> <20061213183431.GC888@turion.vk2pj.dyndns.org>


[Kris -- are you interested in this or should I trim you from the CC?]

In response to Peter Jeremy <peterjeremy@optushome.com.au>:
> On Wed, 2006-Dec-13 10:50:21 -0500, Bill Moran wrote:
> >In response to Bill Moran <wmoran@collaborativefusion.com>:
> >> sysctl kern.ipc.shmmax=2200000000
> >> kern.ipc.shmmax: 2100000000 -> -2094967296
> >> 
> >> Looks like a signed 32-bit int.  That doesn't seem to scale as well as
> >> would be expected on a 64-bit arch (or PAE for that matter).
> >> 
> >> Is this a mistake, or intentional?  I'm working with some big memory
> >> systems, and I sure would like to allocate more than 2G for PostgreSQL
> >> to use ...
> 
> I thought POSIX specified 'int' but I may be mis-remembering.  Tru64
> uses int (and 2GB max) whilst Solaris allows 64-bit values.
> Logically, shm_segsz and shm{min,max} should be intptr_t, shmall is
> less clear but probably should be similar.

So, in your opinion:
struct shmid_ds {
        struct ipc_perm shm_perm;       /* operation permission structure */
        intptr_t        shm_segsz;      /* size of segment in bytes */
        pid_t           shm_lpid;       /* process ID of last shared memory op */
        pid_t           shm_cpid;       /* process ID of creator */
        short           shm_nattch;     /* number of current attaches */
        time_t          shm_atime;      /* time of last shmat() */
        time_t          shm_dtime;      /* time of last shmdt() */
        time_t          shm_ctime;      /* time of last change by shmctl() */
        void           *shm_internal;   /* sysv stupidity */
};

struct shminfo {
        intptr_t shmmax,        /* max shared memory segment size (bytes) */
                 shmmin;        /* min shared memory segment size (bytes) */
        int      shmmni,        /* max number of shared memory identifiers */
                 shmseg,        /* max shared memory segments per process */
                 shmall;        /* max amount of shared memory (pages) */
};
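
For reference, the wraparound in the sysctl output quoted above can be
reproduced in userland.  This is only a sketch: store_in_int32() is a
made-up helper for illustration, not anything in the kernel, and it
assumes the value is simply truncated to its low 32 bits and read back
as signed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): keep the low 32 bits of the
 * requested value and reinterpret them as a signed 32-bit integer.
 */
static int32_t store_in_int32(int64_t requested)
{
        uint32_t low = (uint32_t)requested;     /* well defined: modulo 2^32 */
        int64_t wide = low;

        /* Values above INT32_MAX land in the negative half of the range. */
        if (low > (uint32_t)INT32_MAX)
                wide -= 4294967296LL;           /* subtract 2^32 */

        assert(wide >= INT32_MIN && wide <= INT32_MAX);
        return (int32_t)wide;
}
```

Feeding it 2200000000 gives -2094967296, the same value sysctl reported,
while 2100000000 passes through unchanged because it is still below
2^31 - 1.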

> >int shmget(key_t, size_t, int);
> >
> >It appears as if those values should have been size_t all along.  I'm
> >_assuming_ that the return value is an identifier and not a memory
> >address, which is what the docs seem to imply.
> 
> shmget() returns an "id" that uniquely refers to a shared memory
> segment (stupidly designed SysV IPC namespace) and shmat() takes
> the "id" and returns the address.
> 
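
To illustrate the id-versus-address split described above, here is a
rough userland sketch.  shm_roundtrip() is a hypothetical helper name,
and using IPC_PRIVATE with mode 0600 is my own choice for the example;
only shmget(), shmat(), shmdt() and shmctl() are the real interfaces.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: create a private segment, attach it, touch every
 * byte, then detach and remove it.  Returns 0 on success, -1 on failure.
 */
static int shm_roundtrip(size_t size)
{
        /* shmget() hands back an identifier, not a memory address. */
        int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
        if (id == -1)
                return -1;

        /* shmat() maps the segment named by the id and returns the address. */
        void *addr = shmat(id, NULL, 0);
        int result = -1;
        if (addr != (void *)-1) {
                memset(addr, 0x5a, size);
                if (((unsigned char *)addr)[size - 1] == 0x5a)
                        result = 0;
                shmdt(addr);
        }

        /* Always remove the segment so it does not outlive the process. */
        shmctl(id, IPC_RMID, NULL);
        return result;
}
```

Note that the size argument is already a size_t here, so on a 64-bit
system the interface itself can express a request above 2G; whether it
succeeds depends on the configured limits such as kern.ipc.shmmax.
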
> >So, my first thought is that all the int values in those structures
> >should be changed to size_t.  If I understand the use of that type
> >correctly, it should always be the native word size on the architecture,
> 
> I believe intptr_t is more logical - an integer size that is the
> same size as a pointer.  Unfortunately, as I mentioned above, some
> of this is specified in "standards" and logic is usually only present
> by accident in such documents.

Well, I guess there are a few questions if I want to make changes that
will end up back in the tree:
1) Can anyone quote the standards so we know what they expect?  I got
   the impression that you weren't sure about the standards.
2) If the standards attempt to lock us in to the 2G limit, is FreeBSD
   willing to move forward, thus breaking standards compliance?

> >but will that make this work for PAE as well, or should those be
> >changed to uint64_t so they're 8 bytes wide on all archs?
> 
> PAE is kernel only - userland still sees only 32 bits.  (You can
> fit more RAM into the box, but each process is still limited to
> 4GB - KVM size).  Don't unnecessarily use [u]int64_t as it is
> comparatively inefficient on 32-bit architectures.

So intptr_t makes the most sense here, as it will Do the Right Thing
on 64-bit arch, 32-bit arch, and PAE.
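
A quick way to sanity-check that on a given platform is sketched below.
fits_in_shm_field() is a hypothetical helper, and the static assertion
encodes the assumption made in this thread that intptr_t is at least as
wide as a pointer (true on the usual FreeBSD targets, but intptr_t is
formally an optional type in C99).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption from the discussion: intptr_t is at least pointer sized. */
static_assert(sizeof(intptr_t) >= sizeof(void *),
    "intptr_t is narrower than a pointer on this platform");

/*
 * Hypothetical helper: would a segment size of `bytes` be representable
 * if shmmax and shm_segsz were declared as intptr_t?
 */
static int fits_in_shm_field(uint64_t bytes)
{
        return bytes <= (uint64_t)INTPTR_MAX;
}
```

On a 64-bit build, fits_in_shm_field(2200000000) is true.  On a 32-bit
build INTPTR_MAX is typically 2^31 - 1, so the same request still would
not fit, which is consistent with each process being limited to a 32-bit
address space there anyway.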

-- 
Bill Moran
Collaborative Fusion Inc.


