From: mdf@FreeBSD.org
Date: Thu, 22 Jul 2010 10:59:14 -0700
To: Kostik Belousov
In-Reply-To: <20100722174120.GR2381@deviant.kiev.zoral.com.ua>
Cc: freebsd-arch@freebsd.org
Subject: Re: Multi-zone malloc(9)

On Thu, Jul 22, 2010 at 10:41 AM, Kostik Belousov wrote:
> On Thu, Jul 22, 2010 at 09:54:51AM -0700, mdf@freebsd.org wrote:
>> Occasionally we run into use-after-free and malloc'd buffer overrun
>> scenarios.  When this happens it can be rather difficult to determine
>> what code is at fault, since e.g. every 64 byte allocation, regardless
>> of malloc type, comes from the same UMA zone.  This means that an
>> overflow in M_TEMP will affect M_DEVBUF, etc.  Adding multiple uma
>> zones for each bucket size means that we can hash on the malloc type's
>> shortdesc field so that there are fewer collisions and misused memory
>> from one malloc type only affects a subset of other malloc types.
>> Varying the hash means that, with several crashes due to memory stomp,
>> a single malloc type can usually be determined as the culprit.  If the
>> bug isn't obvious from inspection at this point, MemGuard will help
>> catch the offender.
>>
>> The patch at:
>>
>>    http://people.freebsd.org/~mdf/multizone_malloc.patch
>>
>> implements an optional multi-zone malloc(9).  By default there is a
>> single zone, and MALLOC_DEBUG_MAXZONES can be specified in the kernel
>> configuration file.  A ddb function will print all the malloc types
>> that have a hash collision with the specified type.
>>
>> A few questions for -arch@:
>>
>>  - We found this very useful at Isilon.  Should this go into CURRENT?
>>
>>  - Should this be on by default for GENERIC?  The memory overhead of 8
>> uma zones per malloc allocation size shouldn't be very large.
>>
>>  - would a __FreeBSD_version bump be needed since the malloc_internal
>> type is known by user-space?
>
> Can you quantify the overhead, both in CPU time and memory usage terms
> ? I would much prefer to have debug and non-debug kernels to run
> similar code, in other words, can the multizone allocation be enabled
> unconditionally ?

CPU usage should be lost in the noise, since the subzone hash class is
computed once at boot.

Memory overhead is a bit hard to quantify, since with multiple zones
there are more partially allocated pages in each malloc size.  Worst
case would be, I think, 1 page per CPU per multi-zone per allocation
size, so for 4 CPUs with 4k pages, 8 zones and 9 allocation sizes, that
would be 4*8*9 = 288 pages, or a little over 1MB.

The actual number of multi-zones is also a TUNABLE, so the same code
could be built for debug and non-debug, and a loader.conf change would
enable the actual use of multiple zones during boot.

Thanks,
matthew