Date:      Tue, 16 Apr 2002 21:50:22 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        callum.gibson@db.com
Cc:        hackers@FreeBSD.ORG
Subject:   Re: ipcrm/shmctl failure (fix NOT found)
Message-ID:  <3CBCFF0E.56972E35@mindspring.com>
References:  <20020417030844.14060.qmail@merton.aus.deuba.com>

callum.gibson@db.com wrote:
> I didn't know if you were talking about "not incrementing" when the
> process exits or when it rforked. If you rfork(RFMEM), you'd want to
> increment the vm_refcnt I'm pretty sure (and it does).

No, you really don't.

You have a number of references on the vm (one per RFMEM process).
The correct translation of these references is for the vm to hold a
*single* reference on the shared memory segment itself, rather than
incrementing the segment reference count, shmseg->shm_nattch, once
per process.

If the VM reference counting on normal segments weren't working,
then there'd be a huge-and-obvious-to-everyone problem.  I think
that incrementing the shmseg->shm_nattch on the vfork is definitely
the wrong thing to do.

The reference to the shared memory segment is by the VM... not by
the process that references the VM.
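To make the layering concrete, here is a minimal userland sketch of
the model described above. It is not kernel code: the toy_* names and
the struct layouts are hypothetical stand-ins, and only the two
counters are modelled. Processes reference the address space; the
address space holds the one reference on the segment.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the counters exist. */
struct toy_shmseg {
	int shm_nattch;			/* attaches to the segment */
};

struct toy_vmspace {
	int vm_refcnt;			/* processes sharing this vm */
	struct toy_shmseg *seg;		/* attached segment, or NULL */
};

struct toy_proc {
	struct toy_vmspace *p_vmspace;
};

/* A new address space starts with one reference, held by its creator. */
static void
toy_vmspace_init(struct toy_vmspace *vm, struct toy_proc *p)
{
	vm->vm_refcnt = 1;
	vm->seg = NULL;
	p->p_vmspace = vm;
}

/* Attaching maps the segment into the vm: one segment reference. */
static void
toy_attach(struct toy_proc *p, struct toy_shmseg *seg)
{
	p->p_vmspace->seg = seg;
	seg->shm_nattch++;
}

/* Sharing the vm (the RFMEM case) bumps vm_refcnt only. */
static void
toy_rfork_mem(struct toy_proc *parent, struct toy_proc *child)
{
	child->p_vmspace = parent->p_vmspace;
	parent->p_vmspace->vm_refcnt++;
	/* shm_nattch is deliberately left alone here. */
}

/* On exit, only the last process out drops the segment reference. */
static void
toy_exit(struct toy_proc *p)
{
	struct toy_vmspace *vm = p->p_vmspace;

	p->p_vmspace = NULL;
	if (--vm->vm_refcnt == 0 && vm->seg != NULL) {
		vm->seg->shm_nattch--;
		vm->seg = NULL;
	}
}
```

In this model, two processes sharing one vm leave shm_nattch at 1
until the second of them exits, at which point it drops to 0.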


Since your problem is a symptom of an increment of shmseg->shm_nattch
without a corresponding decrement, the *only* code that can be
involved is shmat() and shmfork() for the increment, and, for the
decrement, shm_delete_mapping(), which is called from shmexit() and
shmdt().

That basically implies that RFMEM is not set when vm_fork() is called
from the Linux ABI code, since that's the only place that calls the
shmfork() code.
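The accounting argument can be sketched in userland as follows. This
is a toy model under stated assumptions, not the kernel source: the
toy_* functions only mirror the names of the increment and decrement
sites listed above, TOY_RFMEM is a placeholder bit rather than the
real flag value, and the fork path is assumed to reach shmfork() only
when RFMEM is clear.

```c
#include <assert.h>

#define TOY_RFMEM	0x1	/* placeholder bit, not the real RFMEM */

struct toy_shmseg {
	int shm_nattch;
};

/* The two increment sites. */
static void
toy_shmat(struct toy_shmseg *seg)
{
	seg->shm_nattch++;
}

static void
toy_shmfork(struct toy_shmseg *seg)
{
	seg->shm_nattch++;
}

/* The single decrement site, reached from shmdt() and shmexit(). */
static void
toy_shm_delete_mapping(struct toy_shmseg *seg)
{
	seg->shm_nattch--;
}

static void
toy_shmdt(struct toy_shmseg *seg)
{
	toy_shm_delete_mapping(seg);
}

static void
toy_shmexit(struct toy_shmseg *seg)
{
	toy_shm_delete_mapping(seg);
}

/*
 * Assumed shape of the fork path: shmfork() runs only for a child that
 * gets its own copy of the address space, i.e. when RFMEM is clear.
 */
static void
toy_vm_fork(struct toy_shmseg *seg, int flags)
{
	if ((flags & TOY_RFMEM) == 0)
		toy_shmfork(seg);
}
```

With this model, a fork with TOY_RFMEM set leaves shm_nattch alone,
while a fork without it adds one attach that must later be balanced
by a shmdt() or shmexit(). A child that takes the second path and
never reaches either decrement leaves the count stuck above zero.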

> The whole bug is
> the point that vm_refcnt is never decremented and the shm_nattch is
> therefore only decremented if you explicitly detach from memory (which
> will call shm_delete_mapping). So if an rfork'd program uses shared mem
> and crashes, the vm_refcnt stays > 1, the shared mem is never freed
> because shmexit -> shm_delete_mapping is never called.  Hopefully this
> only affects shared mem, as there is more stuff inside the if statement
> you include below other than the shmexit.

It should not be incremented in the first place.  It is erroneously
incremented, IMO.


> }...in other words, the resource track exit does not occur until
> }the reference count is about to go from 1->0.  Note that there
> }is an implicit race here, actually, between the reference and
> }the detach, in which another instance could conceivably be
> }created.  8-(.
> 
> Don't know about the race, although one is mentioned in the cvs logs on
> the current branch. I presume you're talking SMP only though?
> As a side note, in current this reads:
>         if (--vmspace->vm_refcnt == 0) {


Yes.  This doesn't have the race, because there isn't a window between
the time of the compare and the decrement.
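As an illustration of the window, here is a single-threaded userland
sketch comparing the two forms. The toy_* names, the teardowns
counter, and the in_window hook are all hypothetical; the hook just
simulates another reference being taken between the compare and the
decrement. The sketch says nothing about what locking the real code
needs.

```c
#include <assert.h>
#include <stddef.h>

struct toy_vmspace {
	int vm_refcnt;
	int teardowns;		/* how many times cleanup ran */
};

typedef void (*toy_hook)(struct toy_vmspace *);

/* Simulates another instance taking a reference. */
static void
toy_take_ref(struct toy_vmspace *vm)
{
	vm->vm_refcnt++;
}

/* Two-step form: compare first, decrement later. */
static void
toy_release_two_step(struct toy_vmspace *vm, toy_hook in_window)
{
	if (vm->vm_refcnt == 1)
		vm->teardowns++;	/* cleanup decided here ... */
	if (in_window != NULL)
		in_window(vm);		/* ... a new reference slips in ... */
	vm->vm_refcnt--;		/* ... before the count drops */
}

/* Combined form, shaped like the quoted line: no gap to slip into. */
static void
toy_release_combined(struct toy_vmspace *vm)
{
	if (--vm->vm_refcnt == 0)
		vm->teardowns++;
}
```

Starting from a count of 1, the two-step form with toy_take_ref as
the hook runs the cleanup and still ends with a count of 1, i.e. a
torn-down object that is still referenced. The combined form only
runs the cleanup on the call that actually takes the count to zero.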


> However, I can't find the spot where the ref count _actually_ goes to zero
> in 4.5 - I suspect it does, but the only decrement of vm_refcnt in the
> code is in vmspace_free and I traced all calls to that. I suspect it
> just frees all the memory associated with the process on exit
> without doing the final decrement to zero. There is a comment just above
> cpu_exit which says:
> 
>          * The address space is released by "vmspace_free(p->p_vmspace)";
> 
> but I don't know who calls that unless it somehow happens from cpu_exit.

The reference is initialized to 1 when it is created.  See vmspace_alloc()
in vm_map.c.


> }At this point, I think it would be wise to instrument rfork, fork,
> }vm_fork, shmfork, and shmexit to see what's going on with your
> }particular program.
> }
> }It may be that your program is reattaching an already attached
> }shared memory segment, and expecting the behaviour to be sane.
> }
> }Really, the place to look for that would be in the Linux kernel
> }sources, to see how it handled shared memory segments with Linux
> }threads... it may be that it doesn't expect them to be attached,
> }and that each thread is expected to do the attach.  The above
> }instrumentation points should tell you this.
> 
> This is not limited to linux threads, it should affect anything which
> increments vm_refcnt and allocates shared mem. It's obvious what should
> happen, just not obvious how to implement it without causing a side effect.
> Not sure that seeing how linux does it would help in this regard.

I think it is Linux specific.  I think it is related to RFMEM not
being set in flags when the vm_fork() is called.

-- Terry
