From owner-freebsd-hackers@FreeBSD.ORG Sun Feb 24 21:16:26 2008 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 032E816A401; Sun, 24 Feb 2008 21:16:26 +0000 (UTC) (envelope-from rea-fbsd@codelabs.ru) Received: from 0.mx.codelabs.ru (0.mx.codelabs.ru [144.206.177.45]) by mx1.freebsd.org (Postfix) with ESMTP id 6885713C45A; Sun, 24 Feb 2008 21:16:25 +0000 (UTC) (envelope-from rea-fbsd@codelabs.ru) DomainKey-Signature: a=rsa-sha1; q=dns; c=simple; s=one; d=codelabs.ru; h=Received:Date:From:To:Cc:Message-ID:References:MIME-Version:Content-Type:Content-Disposition:Content-Transfer-Encoding:In-Reply-To:Sender:X-Spam-Status:Subject; b=F0H29h0qRw4T7lGqf4pYcQMbuvysVtd85dcLPMADQI2av+rjCtWiaIOndCaBwNAGmuuPiEZfc0sAJIae0ukS3S3EmRoXEIEBS18yIR6BXQeDnkjgq3+GwooMzGlVn1eASx9gSIiQgFq0dH58o8Tot+6DFpTkLVCnD+zmjArD13A=; Received: from amnesiac.at.no.dns (ppp85-141-160-161.pppoe.mtu-net.ru [85.141.160.161]) by 0.mx.codelabs.ru with esmtpsa (TLSv1:AES256-SHA:256) id 1JTOCt-000LDX-K2; Mon, 25 Feb 2008 00:16:24 +0300 Date: Mon, 25 Feb 2008 00:16:22 +0300 From: Eygene Ryabinkin To: Jeremy Chadwick Message-ID: References: <20080223010856.7244.qmail@smasher.org> <47C068B5.2090000@thedarkside.nl> <20080223185620.GA98105@eos.sc1.parodius.com> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Sender: rea-fbsd@codelabs.ru X-Spam-Status: No, score=-2.2 required=4.0 tests=ALL_TRUSTED,AWL,BAYES_05 Cc: hackers@freebsd.org, security@freebsd.org Subject: Re: Zeroing sensitive memory chunks [Was: Security Flaw in Popular Disk Encryption Technologies] X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Feb 2008 21:16:26 -0000 Good day. I am posting the follow-up to the -hackers and CC'ing to the -security, because some more-or-less nasty points were found. Sat, Feb 23, 2008 at 10:32:02PM +0300, Eygene Ryabinkin wrote: > But there is another concern with bzero(): it is well-known function. > Especially for compilers. And it is bad: some arrays inside g_eli, > that hold decryption keys are the local variables. And they are > not used after the final bzero() call, so optimizing compiler can > eliminate the bzero() completely, as the "not relevant". I don't > know if GCC does it -- I should check and will do this tomorrow, > because it is already too late in Russia for this day ;)) Generally speaking, there is nothing special in this finding: "Secure Programming Cookbook for C and C++" from O'Reilly warns the reader about it (page 705 and below, citing by the current edition, http://www.oreilly.com/catalog/secureprgckbk/). OK, gcc 4.2.1 does the dead code removal and sometimes bzero/memset can be omitted (gcc 3.4.6, the one I have at hand from the 3.x branch, is never omitting these functions). It really depends on the size of the local array. The test program is: ----- #include #include #include #define bar(n) \ void bar ## n(void) \ { \ char b[n]; \ scanf("%" #n "s", b); \ memset(b, '\0', sizeof(b)); \ } #define foo(n) \ void foo ## n(void) \ { \ char b[n]; \ scanf("%" #n "s", b); \ bzero(b, sizeof(b)); \ } foo(16) foo(24) foo(32) foo(1024) bar(16) bar(24) bar(28) bar(30) bar(31) bar(32) bar(1024) int main(void) { foo16(); foo24(); foo28(); foo32(); foo1024(); bar16(); bar24(); bar28(); bar30(); bar31(); bar32(); bar1024(); return 0; } ----- Compiled with '-O' switch, it eliminates bzero/memset for all functions with the local array size < 32. For example, this is the assembler code of bar31 (taken from 'gcc -O -S -o test.s test.c'): ----- .globl bar31 .type bar31, @function bar31: pushl %ebp movl %esp, %ebp subl $40, %esp leal -31(%ebp), %eax movl %eax, 4(%esp) movl $.LC2, (%esp) call scanf leave ret .size bar31, .-bar31 .section .rodata.str1.1 .LC3: .string "%30s" .text .p2align 4,,15 ----- The simple PoC session transcript follows: ----- $ cat poc.c #include #include #include void pass(void) { char buffer[31]; printf("Password: "); scanf("%30s", buffer); memset(buffer, 0, sizeof(buffer)); } int main(void) { pass(); return 0; } $ gcc -O -g -o poc poc.c $ gdb poc GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd"... (gdb) b main Breakpoint 1 at 0x8048450: file poc.c, line 15. (gdb) run Starting program: /some/path/poc Breakpoint 1, main () at poc.c:15 15 { (gdb) step main () at poc.c:16 16 pass(); (gdb) step pass () at poc.c:9 9 printf("Password: "); (gdb) print &buffer $1 = (char (*)[31]) 0xbfbfec69 (gdb) step 10 scanf("%30s", buffer); (gdb) Password: ASDFGHJKLQWERTYUIO 12 } (gdb) main () at poc.c:18 18 } (gdb) 0x080483af in _start () (gdb) x/35c 0xbfbfec69 0xbfbfec69: 65 'A' 83 'S' 68 'D' 70 'F' 71 'G' 72 'H' 74 'J' 75 'K' 0xbfbfec71: 76 'L' 81 'Q' 87 'W' 69 'E' 82 'R' 84 'T' 89 'Y' 85 'U' 0xbfbfec79: 73 'I' 79 'O' 0 '\0' 1 '\001' 0 '\0' 0 '\0' 0 '\0' -36 'Ü' 0xbfbfec81: -20 'ì' -65 '¿' -65 '¿' 0 '\0' 0 '\0' 0 '\0' 0 '\0' -104 '\230' 0xbfbfec89: -20 'ì' -65 '¿' -65 '¿' (gdb) ----- Currently, I was not able to identify gcc's code that removes the memset in question, so if anyone has ideas -- it will be appreciated. Just now there should be no flaws in the geli code if not dead code elimination occurs for array sizes >= 32: all local buffers that are cleared by bzero() are at least 64 bytes. But there can be other parts of a kernel that needs to be verified and gcc's conditions for the memset elimination should be analyzed. I will continue to investigate. And still, if some old/new gcc's (or other compilers, although they are not suspported for the kernel builds) will change their mind about elimination conditions for the tail memset(), we can be in trouble. Sure, stack-based variables can be rewritten by further calls to other functions, but it is better to be safe, then sorry ;-/ And it seems not to matter that kernel library has its own memset implementation, because a) gcc changes bzero -> memset in the single pass and the internal memset is tried first (gcc's builtins.c, lines 3529 and down); b) gcc tries to eliminate memset() even if there is such local function. Here is the test example: ----- $ cat poc1.c #include #include void * memset(void *b, int c, size_t len) { char *bb; for (bb = (char *)b; len--; ) *bb++ = c; return (b); } int foo31() { char buffer[31]; scanf("%30s", buffer); memset(buffer, 0, sizeof(buffer)); } int bar31() { char buffer[31]; scanf("%30s", buffer); bzero(buffer, sizeof(buffer)); } int main() { bar31(); foo31(); return 0; } ----- As one can verify with 'gcc -O -S -o poc1.s poc1.c', there will be no cleaning in both foo31() and bar31(). > For example, OpenSSL has the OPENSSL_cleanse() function whose purpose > is two-fold (from http://cvs.openssl.org/chngview?cn=9301): > ----- > *) New function OPENSSL_cleanse(), which is used to cleanse a section of > memory from it's contents. This is done with a counter that will > place alternating values in each byte. This can be used to solve > two issues: 1) the removal of calls to memset() by highly optimizing > compilers, and 2) cleansing with other values than 0, since those can > be read through on certain media, for example a swap space on disk. > [Geoff Thorpe] > ----- > > The '1)' is what I was talking about. '2)' is not very clear to > me now, I should research what Geoff meant. If anyone has an idea, > please comment. > > May be the crypto(4,9) framework should receive the function, that > will be simular to the OPENSSL_cleanse. And that function should > be used for wiping of the sensitive data. I rethinked this issue: the function should go (if it should be ever added) to the general kernel library. The reason is that there can be other code that will need data sanitization and it should not rely on the crypto framework. -- Eygene