Date: Fri, 13 Mar 2015 19:48:38 -0400
Subject: What parts of UMA are part of the stable ABI?
From: Ryan Stone
To: FreeBSD Current

In this freebsd-hackers thread[1], a user reported that 10.1-RELEASE crashes during boot on a system with 3TB of RAM. As it turns out, when you have that much RAM, ZFS autotunes itself to allocate a 6GB hash table. This triggers a nasty 32-bit integer truncation bug in malloc(9). malloc() calls uma_large_malloc(), but uma_large_malloc() accepts an int instead of a size_t, and all kinds of hilarity can ensue from there.

The user has confirmed that the patch in [2] stopped the kernel from panicking as soon as zfs.ko was loaded. I'm a bit concerned about whether the patch as written is an MFC candidate, though. uma_large_malloc() calls page_alloc() to actually allocate the memory, and page_alloc() also accepts an int size parameter. This is where things get tricky. The signature of page_alloc() is governed by the uma_alloc() typedef, as UMA also uses it internally for allocating memory for uma_zones. There is even a uma_zone_set_allocf() API for overriding the default allocation function. So there's definitely an argument to be made that the signature of page_alloc() is part of the stable ABI.

I have no hesitation in saying that uma_large_malloc() is not a stable API and that changing it is fair game. If uma_alloc() is a part of the stable API, then it's simple enough to commit a 64-bit-safe allocation function for uma_large_malloc() to call, and to change page_alloc() to call it as well. That commit can be MFC'ed, and a follow-up commit could convert the UMA APIs to use size_t everywhere.

While I'm at it, I'd also like to change the UMA init/fini/ctor/dtor callbacks to use size_t.
I'm a little torn on this because it will definitely cause a lot of churn, both in the tree and for downstream consumers, and there's not necessarily going to be a big benefit to it. However, I suppose that the existence of machines where 4GB is less than 1% of system memory may mean that allocating 4GB at a time is not that outlandish. I can definitely be talked out of this though.

[1] https://lists.freebsd.org/pipermail/freebsd-hardware/2015-March/007602.html
[2] http://people.freebsd.org/~rstone/patches/vm_64bit_malloc.diff
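
For anyone who wants to see the shape of the problem outside the kernel, here is a rough userland sketch in C. It is not the patch in [2] and it does not use the real UMA code. Every function name in it (low32, demo_alloc_bytes, demo_alloc_legacy, demo_large_malloc, demo_selftest) is made up for illustration, and malloc(3) stands in for the real page allocator. It shows two things: what is left of a 6GB request once it is squeezed through a 32-bit parameter, and the general idea of keeping an int-sized entry point unchanged while forwarding to a size_t backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * What a 32-bit size parameter retains of a 64-bit request: only the
 * low 32 bits.  The conversion to uint32_t is well defined in C; a
 * further conversion to int is implementation-defined.
 */
static uint32_t
low32(uint64_t size)
{
    return ((uint32_t)size);
}

/* Hypothetical 64-bit-safe backend; malloc(3) is only a stand-in. */
static void *
demo_alloc_bytes(size_t bytes)
{
    return (malloc(bytes));
}

/*
 * Hypothetical legacy entry point.  It keeps its int signature so that
 * existing callers are unaffected, and forwards to the size_t backend.
 */
static void *
demo_alloc_legacy(int size)
{
    if (size < 0)
        return (NULL);
    return (demo_alloc_bytes((size_t)size));
}

/* Hypothetical large-allocation path that uses size_t end to end. */
static void *
demo_large_malloc(size_t bytes)
{
    return (demo_alloc_bytes(bytes));
}

/* Small self-check exercising the functions above. */
static void
demo_selftest(void)
{
    void *p;

    /* 6GB is 0x180000000; only 0x80000000 survives in 32 bits. */
    assert(low32(6ULL << 30) == 0x80000000u);
    /* Exactly 4GB loses every bit and looks like a zero-byte request. */
    assert(low32(4ULL << 30) == 0);

    p = demo_alloc_legacy(64);
    assert(p != NULL);
    free(p);

    p = demo_large_malloc(128);
    assert(p != NULL);
    free(p);
}
```

Calling demo_selftest() from a main() is enough to run it. The surviving value 0x80000000 does not fit in a 32-bit signed int, so on common platforms it typically shows up as a negative size, although the C standard leaves that conversion implementation-defined.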