Date: Wed, 26 Apr 2006 13:22:11 -0700 From: Jason Evans <jasone@FreeBSD.org> To: Paul Allen <nospam@ugcs.caltech.edu> Cc: Attila Nagy <bra@fsn.hu>, freebsd-current@freebsd.org Subject: Re: malloc problems with MySQL? Message-ID: <444FD673.3070004@FreeBSD.org> In-Reply-To: <20060426182837.GA29737@barf.ugcs.caltech.edu> References: <444F71F3.6030901@fsn.hu> <444F83AD.9040207@FreeBSD.org> <20060426182837.GA29737@barf.ugcs.caltech.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
Paul Allen wrote: >>From Jason Evans <jasone@freebsd.org>, Wed, Apr 26, 2006 at 07:29:01AM -0700: > >>Allocations that are larger than the chunk size (2 MB by default) are >>allocated using mmap(2), rather than sbrk(2). Most likely, your >>problems will go away if you reduce the heap size, so that mmap has more >>address space to work with. > > Which raises the question of what this code is checking for... > 1167 #ifdef USE_BRK > 1168 else if ((uintptr_t)ret >= (uintptr_t)brk_base > 1169 && (uintptr_t)ret < (uintptr_t)brk_max) { > 1170 /* > 1171 * We succeeded in mapping memory, but at a location that could > 1172 * be confused with brk. Leave the mapping intact so that this > 1173 * won't ever happen again, then try again. > 1174 */ > 1175 assert(addr == NULL); > 1176 goto AGAIN; > 1177 } > 1178 #endif This code was based on my misunderstanding of MAXDSIZ. I thought it was an upper bound on how large the data segment could be, but it's actually only the default value. I will remove this code, and modify the semantics of brk_max to deal with a variable data segment size limit. > 1212 /* > 1213 * Check for address ranges that were previously chunks and try > 1214 * to use them. > 1215 */ > 1216 > 1217 tchunk = RB_MIN(chunk_tree_s, &old_chunks); > 1218 while (tchunk != NULL) { > 1219 /* Found an address range. Try to recycle it. */ > 1220 > 1221 chunk = tchunk->chunk; > 1222 delchunk = tchunk; > 1223 tchunk = RB_NEXT(chunk_tree_s, &old_chunks, delchunk); > 1224 > 1225 /* Remove delchunk from the tree. */ > 1226 RB_REMOVE(chunk_tree_s, &old_chunks, delchunk); > 1227 base_chunk_node_dealloc(delchunk); > 1228 > 1229 #ifdef USE_BRK > 1230 if ((uintptr_t)chunk >= (uintptr_t)brk_base > 1231 && (uintptr_t)chunk < (uintptr_t)brk_max) { > 1232 /* Re-use a previously freed brk chunk. */ > 1233 ret = chunk; > 1234 goto RETURN; > 1235 } > 1236 #endif > 1237 if ((ret = pages_map(chunk, size)) != NULL) { > 1238 /* Success. */ > 1239 goto RETURN; > 1240 } > 1241 } > 1242 > There is something rather scary going on in this part of the code. > Perhaps it just needs better commenting... You are scanning the tree > headed by old_chunks looking for an address range. If this is in the > mmap region you attempt to mmap it again... to "reuse address space" > > To which I have to say: huh? In chunk_dealloc you explicitly call munmap > on the address. Therefore, what is the point of intentionally "reusing" > it by insisting the kernel return an address you had before. Perhaps > this would make sense if you had some perchunk structures allocated that > you wanted to reuse but if this is the case then I question whether you > really mean to just RB_REMOVE the chunk. Surely some other cleanup would > be appropriate... This code behaves as intended. Note that in order to get chunk-aligned memory via mmap(), it is necessary to over-allocate, then trim. By keeping a cache of unmapped chunks, we can avoid the extra system calls most of the time, and we are also able to reduce memory fragmentation in many cases. (Over-allocation and trimming fails to utilize chunk-sized holes in the memory map.) > For that matter the brk allocation code should be changed to permit an > allocation greater than chunk_size. i.e., by assigning incr from size > and pulling it out of the if block. If you do this, of course you can > only be strictly better off than you are now because a program that > mainly allocs without freeing will get the benefit of the larger address > space. Whereas given that you allocate chunk_size from brk now, > when you go to reuse brk space you can decide to only do so to fullfil > chunk_size requests without losing anything relative to the current > implementation. There is one issue with this change that you don't mention: the current implementation never returns sbrk()ed memory. I did this in order to avoid races in multi-threaded programs that use brk() or sbrk(). However, in hindsight, such obscure (not to mention poorly designed) programs aren't worth supporting at the expense of all other programs. As such, I don't have a problem with making this change (along with a change that attempts to shrink the data segment during chunk deallocation). > That you do not permit kegs larger than chunk_size though suggests a > defect in your implementation. one solution to this would be to adjust > chunk_size before giving up on the allocation... I don't understand what you are saying here. Perhaps you were being confused by the chunk_size variable masking in huge_malloc() that you mention below. > Also it is quite confusing that you have a global variable "chunk_size" > that you mask with a local variable in huge_malloc. Indeed, it is. I'll fix this. Thanks, Jason
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?444FD673.3070004>