Date: Tue, 16 Apr 2002 21:50:22 -0700
From: Terry Lambert <tlambert2@mindspring.com>
To: callum.gibson@db.com
Cc: hackers@FreeBSD.ORG
Subject: Re: ipcrm/shmctl failure (fix NOT found)
Message-ID: <3CBCFF0E.56972E35@mindspring.com>
References: <20020417030844.14060.qmail@merton.aus.deuba.com>
callum.gibson@db.com wrote:
> I didn't know if you were talking about "not incrementing" when the
> process exits or when it rforked. If you rfork(RFMEM), you'd want to
> increment the vm_refcnt I'm pretty sure (and it does).

No, you really don't.

You have a number of references on the vm (one per RFMEM process).
The correct translation of these references is to have a *single*
reference count instance to the shared memory segment itself,
rather than incrementing the segment references, shmseg->shm_nattch.

If the VM reference counting on normal segments weren't working,
then there'd be a huge-and-obvious-to-everyone problem.

I think that incrementing the shmseg->shm_nattch on the vfork is
definitely the wrong thing to do. The reference to the shared
memory segment is by the VM... not by the process that references
the VM.

Since your problem is a symptom of increment of shmseg->shm_nattch
without a corresponding decrement, then the *only* code that can
be involved is shmat() and shmfork() for the increment, and for
the delete, shm_delete_mapping(), which is called from shmexit()
and shmdt().

That basically implies that RFMEM is not set when vm_fork() is
called from the Linux ABI code, since that's the only place that
calls the shmfork() code.

> The whole bug is
> the point that vm_refcnt is never decremented and the shm_nattch is
> therefore only decremented if you explicitly detach from memory (which
> will call shm_delete_mapping). So if an rfork'd program uses shared mem
> and crashes, the vm_refcnt stays > 1, the shared mem is never freed
> because shmexit -> shm_delete_mapping is never called. Hopefully this
> only affects shared mem, as there is more stuff inside the if statement
> you include below other than the shmexit.

It should not be incremented in the first place. It is erroneously
incremented, IMO.

> }...in other words, the resource track exit does not occur until
> }the reference count is about to go from 1->0.
> }Note that there is an implicit race here, actually, between the
> }reference and the detach, in which another instance could
> }conceivably be created. 8-(.
>
> Don't know about the race, although one is mentioned in the cvs logs on
> the current branch. I presume you're talking SMP only though?
> As a side note, in current this reads:
> if (--vmspace->vm_refcnt == 0) {

Yes. This doesn't have the race, because there isn't a window
between the time of the compare and the decrement.

> However, I can't find the spot where the ref count _actually_ goes to zero
> in 4.5 - I suspect it does, but the only decrement of vm_refcnt in the
> code is in vmspace_free and I traced all calls to that. I suspect it
> just frees all the memory associated with the process on exit
> without doing the final decrement to zero. There is a comment just above
> cpu_exit which says:
>
> * The address space is released by "vmspace_free(p->p_vmspace)";
>
> but I don't know who calls that unless it somehow happens from cpu_exit.

The reference is initialized to 1 when it is created. See
vmspace_alloc() in vm_map.c.

> }At this point, I think it would be wise to instrument rfork, fork,
> }vm_fork, shmfork, and shmexit to see what's going on with your
> }particular program.
> }
> }It may be that your program is reattaching an already attached
> }shared memory segment, and expecting the behaviour to be sane.
> }
> }Really, the place to look for that would be in the Linux kernel
> }sources, to see how it handled shares memory segments with Linux
> }threads... it may be that it doesn't expect them to be attached,
> }and that each thread is expected to do the attach. The above
> }instrumentation points should tell you this.
>
> This is not limited to linux threads, it should affect anything which
> increments vm_refcnt and allocates shared mem. It's obvious what should
> happen, just not obvious how to implement it without causing a side effect.
> Not sure that seeing how linux does it would help in this regard.

I think it is Linux specific. I think it is related to RFMEM
not being set in flags when the vm_fork() is called.

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
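
A minimal sketch of the reference model argued for in this message, written as a user-space toy in C rather than kernel code. Every toy_* name is a hypothetical stand-in, not the real FreeBSD structure or function. The assumptions are: at most one attachment per vmspace, vm_refcnt starts at 1 (as the message says vmspace_alloc() does), sharing the VM bumps only vm_refcnt, and the segment's shm_nattch is dropped only when vm_refcnt goes from 1 to 0.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; NOT the real definitions. */
struct toy_shmseg {
    int shm_nattch;              /* number of attachments to the segment */
};

struct toy_vmspace {
    int vm_refcnt;               /* number of processes sharing this VM */
    struct toy_shmseg *seg;      /* assumed: at most one attached segment */
};

/* Models the "initialized to 1 when it is created" behaviour. */
void toy_vmspace_init(struct toy_vmspace *vm)
{
    vm->vm_refcnt = 1;
    vm->seg = NULL;
}

/* Attach: the VM, not the process, takes the single segment reference. */
void toy_shmat(struct toy_vmspace *vm, struct toy_shmseg *seg)
{
    vm->seg = seg;
    seg->shm_nattch++;
}

/* Shared-VM fork (the RFMEM case): only the VM reference is bumped. */
void toy_rfork_rfmem(struct toy_vmspace *vm)
{
    vm->vm_refcnt++;
}

/*
 * Process exit: decrement and compare in one expression, so there is
 * no window between the test and the decrement. Only the last holder
 * releases the segment attachment.
 */
void toy_exit(struct toy_vmspace *vm)
{
    assert(vm->vm_refcnt > 0);
    if (--vm->vm_refcnt == 0 && vm->seg != NULL) {
        vm->seg->shm_nattch--;
        vm->seg = NULL;
    }
}
```

In this toy, one attach followed by one shared-VM fork leaves shm_nattch at 1 until both holders have exited, at which point it drops to 0.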