From owner-svn-src-all@FreeBSD.ORG Tue May 6 04:46:47 2014 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6BFEC914; Tue, 6 May 2014 04:46:47 +0000 (UTC) Received: from mail107.syd.optusnet.com.au (mail107.syd.optusnet.com.au [211.29.132.53]) by mx1.freebsd.org (Postfix) with ESMTP id 189B2F47; Tue, 6 May 2014 04:46:46 +0000 (UTC) Received: from c122-106-147-133.carlnfd1.nsw.optusnet.com.au (c122-106-147-133.carlnfd1.nsw.optusnet.com.au [122.106.147.133]) by mail107.syd.optusnet.com.au (Postfix) with ESMTPS id C0573D44AFE; Tue, 6 May 2014 14:46:43 +1000 (EST) Date: Tue, 6 May 2014 14:46:43 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Andrey Chernov Subject: Re: svn commit: r265367 - head/lib/libc/regex In-Reply-To: <5368162A.9000002@freebsd.org> Message-ID: <20140506135706.T1596@besplex.bde.org> References: <201405051641.s45GfFje086423@svn.freebsd.org> <5367CD77.40909@freebsd.org> <5367EB54.1080109@FreeBSD.org> <3C7CFFB7-5C84-4AC1-9A81-C718D184E87B@FreeBSD.org> <7D7A417E-17C3-4001-8E79-0B57636A70E1@gmail.com> <536807D8.9000005@freebsd.org> <9349EAA9-F92C-4170-A1C0-2200FD490E5F@FreeBSD.org> <5368162A.9000002@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=eojmkOZX c=1 sm=1 tr=0 a=7NqvjVvQucbO2RlWB8PEog==:117 a=PO7r1zJSAAAA:8 a=mAlE_CWfnZ8A:10 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=6I5d2MoRAAAA:8 a=tdNqX4nCuuD4MD8TwAQA:9 a=qahZVF4bb32W-GKS:21 a=-Li8O5LmJa_Gh2pa:21 a=CjuIK1q_8ugA:10 a=SV7veod9ZcQA:10 Cc: src-committers , svn-src-all@freebsd.org, David Chisnall , Pedro Giffuni , svn-src-head@freebsd.org, Warner Losh X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 May 2014 04:46:47 -0000 On Tue, 6 May 2014, Andrey Chernov wrote: > On 06.05.2014 2:12, David Chisnall wrote: >> On 5 May 2014, at 22:51, Andrey Chernov wrote: >> >>> For standard malloc/realloc interface it is up to the caller to check >>> n*size not overflows. You must trust caller already does such check. >> >> Do a search of the CVE database sometime to see how well placed that trust generally is. Or even look at the code in question, where none of the realloc() or malloc() calls does overflow checking. > > I know current situation and disagree with OpenBSD way to fix it. Public > interface assumes that caller should be trusted. Period. How well it is > really trusted is up to the caller and should be fixed in it clearly, > allowing human to trace the logic. > >>> Using calloc() to enforce it instead of caller is semantically wrong, >> >> Relying on a standard function to behave according to the standard is semantically wrong? The standard behaviour is undefined. It cannot be relied on. From C99 (n869.txt): % 7.20.3.1 The calloc function % % Synopsis % % [#1] % % #include % void *calloc(size_t nmemb, size_t size); % % Description % % [#2] The calloc function allocates space for an array of % nmemb objects, each of whose size is size. The space is % initialized to all bits zero.238) Oops, there is no object to begin with, so perhaps the behaviour is defined after all. This is unclear. It is also unclear if objects can have size too large to represent as a size_t. C99 says that sizeof(object) is the size in bytes of an object, but it also says that the value of sizeof() is implementation-defined. If the multiplication overflows, then it can be argued that the behaviour is undefined (because the object cannot exist since sizeof() is required to actually return the size), and it can be argued that the behaviour is defined in some cases even when the multiplication overflows (because sizeof() is only required to return an implementation-defined value like the actual size modulo SIZE_MAX; then the object might exist). calloc() may have been actually useful orginally to handle the weird second case. In K&R1, size_t didn't exist and whether sizeof() worked was even less clear than now. The type of sizeof() was "an integer". malloc() took an int arg IIRC. malloc() is not even in the index in K&R1. But objects of size larger than INT_MAX were useful, and it would be reasonable to ask for calloc() to allocate one. K&R1 has calloc() in the index and documents it as calloc(n, sizeof(object)), where n and sizeof() are apparently implicit-int. So calloc(16368, 2) should give an object of size 32768 if possible. sizeof(this) is then unrepresentable as a 16-bit int. I used arrays larger than 32768 quite often on 16-bit systems, but only with 16-bit unsigned size_t. > Yes. Generally it is using a function outside of its purpose. I.e. you > can use calloc() just to check n*size and nothing else (free() result > immediately afterwards) instead of writing just single check by > yourself. It will be legal usage but semantically wrong and misleading. > >>> and especially strange when the caller is standard C library under your >>> control. >> >> I don't follow this. If libc can't rely on standards conformance from itself then other code stands no chance. calloc() in FreeBSD is controlled too, but in 4.4BSD it just did the multiplication blindly. This was fixed (if it is a bug) in FreeBSD in 2002 (by tjr). The errno was the nondescript ENOMEM. Now, calloc() is sophisticated but the errno still seems to be ENOMEM. I think calloc() should check for overflow but callers shouldn't depend on this. In practice, the multiplication is less likely to overflow than malloc() is to fail, which "can't happen". You could limit the number of elements to something reasonable like 2**28 to ensure that the multiplication can't overflow with an element size of 8. The rare program that needs to support allocating more than 2**28 elements on 32-bit systems according to user input can be more careful. On 64-bit systems, you can use a less modest limit. Bruce