From owner-freebsd-current@FreeBSD.ORG Wed Apr 26 18:28:38 2006 Return-Path: X-Original-To: freebsd-current@freebsd.org Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1E9F616A402; Wed, 26 Apr 2006 18:28:38 +0000 (UTC) (envelope-from jd@ugcs.caltech.edu) Received: from barf.ugcs.caltech.edu (barf.ugcs.caltech.edu [131.215.176.99]) by mx1.FreeBSD.org (Postfix) with ESMTP id CC31D43D4C; Wed, 26 Apr 2006 18:28:37 +0000 (GMT) (envelope-from jd@ugcs.caltech.edu) Received: by barf.ugcs.caltech.edu (Postfix, from userid 3640) id 204061F00C2; Wed, 26 Apr 2006 11:28:37 -0700 (PDT) Date: Wed, 26 Apr 2006 11:28:37 -0700 From: Paul Allen To: Jason Evans Message-ID: <20060426182837.GA29737@barf.ugcs.caltech.edu> References: <444F71F3.6030901@fsn.hu> <444F83AD.9040207@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <444F83AD.9040207@FreeBSD.org> Sender: jd@ugcs.caltech.edu Cc: Attila Nagy , freebsd-current@freebsd.org Subject: Re: malloc problems with MySQL? X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Apr 2006 18:28:38 -0000 >From Jason Evans , Wed, Apr 26, 2006 at 07:29:01AM -0700: > Allocations that are larger than the chunk size (2 MB by default) are > allocated using mmap(2), rather than sbrk(2). Most likely, your > problems will go away if you reduce the heap size, so that mmap has more > address space to work with. Which raises the question of what this code is checking for... 1167 #ifdef USE_BRK 1168 else if ((uintptr_t)ret >= (uintptr_t)brk_base 1169 && (uintptr_t)ret < (uintptr_t)brk_max) { 1170 /* 1171 * We succeeded in mapping memory, but at a location that could 1172 * be confused with brk. Leave the mapping intact so that this 1173 * won't ever happen again, then try again. 1174 */ 1175 assert(addr == NULL); 1176 goto AGAIN; 1177 } 1178 #endif imo, mmap should be changed to allocate below the limit of maxdsiz but always above the current brk. It does not currently do so. Thus these headaches. Of course your code is littered with checks effectively against maxdsiz so changing this behavior would break your malloc implementation :o( 1212 /* 1213 * Check for address ranges that were previously chunks and try 1214 * to use them. 1215 */ 1216 1217 tchunk = RB_MIN(chunk_tree_s, &old_chunks); 1218 while (tchunk != NULL) { 1219 /* Found an address range. Try to recycle it. */ 1220 1221 chunk = tchunk->chunk; 1222 delchunk = tchunk; 1223 tchunk = RB_NEXT(chunk_tree_s, &old_chunks, delchunk); 1224 1225 /* Remove delchunk from the tree. */ 1226 RB_REMOVE(chunk_tree_s, &old_chunks, delchunk); 1227 base_chunk_node_dealloc(delchunk); 1228 1229 #ifdef USE_BRK 1230 if ((uintptr_t)chunk >= (uintptr_t)brk_base 1231 && (uintptr_t)chunk < (uintptr_t)brk_max) { 1232 /* Re-use a previously freed brk chunk. */ 1233 ret = chunk; 1234 goto RETURN; 1235 } 1236 #endif 1237 if ((ret = pages_map(chunk, size)) != NULL) { 1238 /* Success. */ 1239 goto RETURN; 1240 } 1241 } 1242 There is something rather scary going on in this part of the code. Perhaps it just needs better commenting... You are scanning the tree headed by old_chunks looking for an address range. If this is in the mmap region you attempt to mmap it again... to "reuse address space" To which I have to say: huh? In chunk_dealloc you explicitly call munmap on the address. Therefore, what is the point of intentionally "reusing" it by insisting the kernel return an address you had before. Perhaps this would make sense if you had some perchunk structures allocated that you wanted to reuse but if this is the case then I question whether you really mean to just RB_REMOVE the chunk. Surely some other cleanup would be appropriate... imo, after you call munmap on a chunk you should not be placing it into oldchunks. and you should remove the call to page_map on line 1237 and place it after? the attempt to make a new brk allocation. For that matter the brk allocation code should be changed to permit an allocation greater than chunk_size. i.e., by assigning incr from size and pulling it out of the if block. If you do this, of course you can only be strictly better off than you are now because a program that mainly allocs without freeing will get the benefit of the larger address space. Whereas given that you allocate chunk_size from brk now, when you go to reuse brk space you can decide to only do so to fullfil chunk_size requests without losing anything relative to the current implementation. That you do not permit kegs larger than chunk_size though suggests a defect in your implementation. one solution to this would be to adjust chunk_size before giving up on the allocation... Also it is quite confusing that you have a global variable "chunk_size" that you mask with a local variable in huge_malloc. In summary: I think the appropriate logic of chunk_alloc would be 1) if size == chunk_size attempt to reuse an entry from brk given in old_chunks--old_chunks should only contain brk allocations 1b) if that fails attempt to extend brk 1c) if that fails attempt an mmap 2) if size != chunk_size attempt to mmap the region 2b) if that fails allocate from brk I think chunk_dealloc is already smart enough such that when deallocating the huge sbrk that it will actually make multiple chunk_size entries in old_chunks. > > Jason > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"