Date: Wed, 17 Apr 2002 15:17:50 +1000 (EST)
From: callum.gibson@db.com
To: tlambert2@mindspring.com
Cc: hackers@FreeBSD.ORG
Subject: Re: ipcrm/shmctl failure (fix NOT found)
Message-ID: <20020417051750.14714.qmail@merton.aus.deuba.com>
In-Reply-To: <3CBCFF0E.56972E35@mindspring.com> from "tlambert2@mindspring.com" at Apr 16, 2002 09:50:22 PM

tlambert2@mindspring.com writes:
}> I didn't know if you were talking about "not incrementing" when the
}> process exits or when it rforked. If you rfork(RFMEM), you'd want to
}> increment the vm_refcnt I'm pretty sure (and it does).
}
}No, you really don't.
Do you mean that I don't know, or that we don't want to increment
vm_refcnt when rforking?
}You have a number of references on the vm (one per RFMEM) process.
}The correct translation of these references is to have a *single*
}reference count instance to the shared memory segment itself,
}rather than incrementing the segment references, shmseg->shm_nattch.
OK - so shmfork should not increment shm_nattch. But you still want
to increment vm_refcnt when you rfork, or your second sentence is a
contradiction (one ref per RFMEM). So you are saying there is a single
vm (albeit with multiple references to it), and because it's only one vm
there is in effect a _single_ reference to the shmseg from it.
Do I understand you correctly?
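To make sure we are talking about the same thing, here is a toy
user-space model of how I read your description. It is only a sketch:
the toy_* names and structs are made up for illustration and are not the
kernel's vmspace or shm code. The one vm takes a single shm_nattch
reference at attach time, rfork(RFMEM) only bumps vm_refcnt, and the
segment reference is dropped when the last sharer exits.

```c
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT the kernel structures. */
struct toy_shmseg {
    int shm_nattch;             /* attach count on the segment */
};

struct toy_vmspace {
    int vm_refcnt;              /* processes sharing this address space */
    struct toy_shmseg *seg;     /* segment mapped into it, if any */
};

/* Attach: the address space takes exactly one reference on the segment. */
void toy_shmat(struct toy_vmspace *vm, struct toy_shmseg *seg)
{
    vm->seg = seg;
    seg->shm_nattch++;
}

/* rfork(RFMEM): the child shares the same vm, so only vm_refcnt grows. */
struct toy_vmspace *toy_rfork_rfmem(struct toy_vmspace *vm)
{
    vm->vm_refcnt++;
    return vm;
}

/* Exit: only the last sharer of the vm drops the segment reference. */
void toy_exit(struct toy_vmspace *vm)
{
    if (--vm->vm_refcnt == 0 && vm->seg != NULL) {
        vm->seg->shm_nattch--;
        vm->seg = NULL;
    }
}
```

With one attach and two RFMEM children, vm_refcnt reaches 3 while
shm_nattch stays at 1, and only the third toy_exit() takes shm_nattch
back to 0.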
}If the VM reference counting on normal segments weren't working,
}then there'd be a huge-and-obvious-to-everyone problem. I think
}that incrementing the shmseg->shm_nattch on the vfork is definitely
}the wrong thing to do.
It's surprising what people don't notice.
}Since your problem is a symptom of increment of shmseg->shm_nattch
}without a corresponding decrement, then the *only* code that can be
}involved is shmat() and shmfork() for the increment, and for the
}delete, shm_delete_mapping(), which is called from shmexit() and
}shmdt().
No, I don't think I said that - all I know is that shmexit never gets
called and that seems to be because vm_refcnt is incremented.
}That basically implies that RFMEM is not set when vm_fork() is called
}from the Linux ABI code, since that's the only place that calls the
}shmfork() code.
Nah, I checked that. It does a clone(CLONE_VM) in the linux threads lib
which translates to an rfork(RFMEM) in i386/linux/linux_machdep.c .
}> The whole bug is
}> the point that vm_refcnt is never decremented and the shm_nattch is
}> therefore only decremented if you explicitly detach from memory (which
}> will call shm_delete_mapping). So if an rfork'd program uses shared mem
}> and crashes, the vm_refcnt stays > 1, the shared mem is never freed
}> because shmexit -> shm_delete_mapping is never called. Hopefully this
}> only affects shared mem, as there is more stuff inside the if statement
}> you include below other than the shmexit.
}It should not be incremented in the first place. It is erroneously
}incremented, IMO.
You mean shm_nattch is erroneously incremented, not vm_refcnt, I think?
}> }...in other words, the resource track exit does not occur until
}> }the reference count is about to go from 1->0. Note that there
}> }is an implicit race here, actually, between the reference and
}> }the detach, in which another instance could conceivably be
}> }created. 8-(.
}>
}> Don't know about the race, although one is mentioned in the cvs logs on
}> the current branch. I presume you're talking SMP only though?
}> As a side note, in current this reads:
}> if (--vmspace->vm_refcnt == 0) {
}
}
}Yes. This doesn't have the race, because there isn't a window between
}the time of the compare and the decrement.
Perhaps what I'm really seeing is the race then? I do have a single vm
with a single ref to a shmseg, but when the process crashes all the
rforked processes exit and clobber the vm_refcnt so that shmexit never
gets called to decrement shm_nattch to zero? A new theory...
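A toy illustration of that theory, in user-space C. This is a sketch
under my own assumptions, not the 4.5 exit code: the toy_* names are
hypothetical, the "cleanup" counter stands in for the shmexit() call,
and the check and the decrement are split into two functions so that an
unlucky interleaving of two exiting sharers can be replayed by hand.

```c
/* Toy model for illustration only, NOT kernel code. */
struct toy_vm {
    int refcnt;         /* stands in for vm_refcnt */
    int cleanup_calls;  /* how many times the shmexit() stand-in ran */
};

/* Step 1 of the split pattern: test whether we look like the last user. */
int toy_check_is_last(const struct toy_vm *vm)
{
    return vm->refcnt == 1;
}

/* Step 2 of the split pattern: act on the earlier test, then decrement. */
void toy_finish_exit(struct toy_vm *vm, int was_last)
{
    if (was_last)
        vm->cleanup_calls++;
    vm->refcnt--;
}

/* Combined pattern: decrement and test in one expression, no window. */
void toy_exit_combined(struct toy_vm *vm)
{
    if (--vm->refcnt == 0)
        vm->cleanup_calls++;
}
```

If two sharers both run toy_check_is_last() while refcnt is still 2 and
only then call toy_finish_exit(), the count ends at 0 with the cleanup
never run, which is the symptom I see. With toy_exit_combined() the
second exit always runs the cleanup. On a real SMP kernel the combined
decrement-and-test would still have to be atomic or done under a lock.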
}> without doing the final decrement to zero. There is a comment just above
}> cpu_exit which says:
}>
}> * The address space is released by "vmspace_free(p->p_vmspace)";
}>
}> but I don't know who calls that unless it somehow happens from cpu_exit.
}The reference is initialized to 1 when it is created. See vmspace_alloc()
}in vm_map.c.
But where does vm_refcnt go to zero (in 4.5)?
}> This is not limited to linux threads, it should affect anything which
}> increments vm_refcnt and allocates shared mem. It's obvious what should
}> happen, just not obvious how to implement it without causing a side
}> effect.
}> Not sure that seeing how linux does it would help in this regard.
}I think it is Linux specific. I think it is related to RFMEM not
}being set in flags when the vm_fork() is called.
As best I could tell, RFMEM is, in fact, set by the library and by the
kernel.
Callum Gibson callum.gibson@db.com
Global Markets IT, Deutsche Bank, Australia 61 2 9258 1620
### The opinions in this message are mine and not Deutsche's ###
