Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 26 Apr 2006 11:28:37 -0700
From:      Paul Allen <nospam@ugcs.caltech.edu>
To:        Jason Evans <jasone@freebsd.org>
Cc:        Attila Nagy <bra@fsn.hu>, freebsd-current@freebsd.org
Subject:   Re: malloc problems with MySQL?
Message-ID:  <20060426182837.GA29737@barf.ugcs.caltech.edu>
In-Reply-To: <444F83AD.9040207@FreeBSD.org>
References:  <444F71F3.6030901@fsn.hu> <444F83AD.9040207@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
>From Jason Evans <jasone@freebsd.org>, Wed, Apr 26, 2006 at 07:29:01AM -0700:
> Allocations that are larger than the chunk size (2 MB by default) are 
> allocated using mmap(2), rather than sbrk(2).  Most likely, your 
> problems will go away if you reduce the heap size, so that mmap has more 
> address space to work with.
Which raises the question of what this code is checking for...
1167 #ifdef USE_BRK
1168         else if ((uintptr_t)ret >= (uintptr_t)brk_base
1169             && (uintptr_t)ret < (uintptr_t)brk_max) {
1170                 /*
1171                  * We succeeded in mapping memory, but at a location that could
1172                  * be confused with brk.  Leave the mapping intact so that this
1173                  * won't ever happen again, then try again.
1174                  */
1175                 assert(addr == NULL);
1176                 goto AGAIN;
1177         }
1178 #endif

imo, mmap should be changed to allocate below the limit of maxdsiz but always
above the current brk.  It does not currently do so.  Thus these headaches.
Of course your code is littered with checks effectively against maxdsiz so
changing this behavior would break your malloc implementation :o(

1212                 /*
1213                  * Check for address ranges that were previously chunks and try
1214                  * to use them.
1215                  */
1216 
1217                 tchunk = RB_MIN(chunk_tree_s, &old_chunks);
1218                 while (tchunk != NULL) {
1219                         /* Found an address range.  Try to recycle it. */
1220 
1221                         chunk = tchunk->chunk;
1222                         delchunk = tchunk;
1223                         tchunk = RB_NEXT(chunk_tree_s, &old_chunks, delchunk);
1224 
1225                         /* Remove delchunk from the tree. */
1226                         RB_REMOVE(chunk_tree_s, &old_chunks, delchunk);
1227                         base_chunk_node_dealloc(delchunk);
1228 
1229 #ifdef USE_BRK
1230                         if ((uintptr_t)chunk >= (uintptr_t)brk_base
1231                             && (uintptr_t)chunk < (uintptr_t)brk_max) {
1232                                 /* Re-use a previously freed brk chunk. */
1233                                 ret = chunk;
1234                                 goto RETURN;
1235                         }
1236 #endif
1237                         if ((ret = pages_map(chunk, size)) != NULL) {
1238                                 /* Success. */
1239                                 goto RETURN;
1240                         }
1241                 }
1242 
There is something rather scary going on in this part of the code.
Perhaps it just needs better commenting... You are scanning the tree
headed by old_chunks looking for an address range.  If this is in the
mmap region you attempt to mmap it again... to "reuse address space"

To which I have to say: huh? In chunk_dealloc you explicitly call munmap 
on the address.  Therefore, what is the point of intentionally "reusing" 
it by insisting the kernel return an address you had before.  Perhaps
this would make sense if you had some perchunk structures allocated that
you wanted to reuse but if this is the case then I question whether you
really mean to just RB_REMOVE the chunk.  Surely some other cleanup would
be appropriate... 

imo, after you call munmap on a chunk you should not be placing it into 
oldchunks.  and you should remove the call to page_map on line 1237 and
place it after? the attempt to make a new brk allocation.

For that matter the brk allocation code should be changed to permit an
allocation greater than chunk_size.  i.e., by assigning incr from size
and pulling it out of the if block.  If you do this, of course you can
only be strictly better off than you are now because a program that
mainly allocs without freeing will get the benefit of the larger address
space.  Whereas given that you allocate chunk_size from brk now,
when you go to reuse brk space you can decide to only do so to fullfil
chunk_size requests without losing anything relative to the current 
implementation.

That you do not permit kegs larger than chunk_size though suggests a
defect in your implementation.  one solution to this would be to adjust
chunk_size before giving up on the allocation...

Also it is quite confusing that you have a global variable "chunk_size"
that you mask with a local variable in huge_malloc.

In summary: I think the appropriate logic of chunk_alloc would be
1) if size == chunk_size attempt to reuse an entry from brk given in
   old_chunks--old_chunks should only contain brk allocations
1b) if that fails attempt to extend brk
1c) if that fails attempt an mmap
2) if size != chunk_size attempt to mmap the region
2b) if that fails allocate from brk

I think chunk_dealloc is already smart enough such that when deallocating 
the huge sbrk that it will actually make multiple chunk_size entries in
 old_chunks.

> 
> Jason
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060426182837.GA29737>