From owner-svn-src-all@FreeBSD.ORG  Tue May  6 04:46:47 2014
Return-Path: <owner-svn-src-all@FreeBSD.ORG>
Delivered-To: svn-src-all@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 6BFEC914;
 Tue,  6 May 2014 04:46:47 +0000 (UTC)
Received: from mail107.syd.optusnet.com.au (mail107.syd.optusnet.com.au
 [211.29.132.53]) by mx1.freebsd.org (Postfix) with ESMTP id 189B2F47;
 Tue,  6 May 2014 04:46:46 +0000 (UTC)
Received: from c122-106-147-133.carlnfd1.nsw.optusnet.com.au
 (c122-106-147-133.carlnfd1.nsw.optusnet.com.au [122.106.147.133])
 by mail107.syd.optusnet.com.au (Postfix) with ESMTPS id C0573D44AFE;
 Tue,  6 May 2014 14:46:43 +1000 (EST)
Date: Tue, 6 May 2014 14:46:43 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Andrey Chernov <ache@freebsd.org>
Subject: Re: svn commit: r265367 - head/lib/libc/regex
In-Reply-To: <5368162A.9000002@freebsd.org>
Message-ID: <20140506135706.T1596@besplex.bde.org>
References: <201405051641.s45GfFje086423@svn.freebsd.org>
 <5367CD77.40909@freebsd.org>
 <B11B5B25-8E05-4225-93D5-3A607332F19A@FreeBSD.org>
 <5367EB54.1080109@FreeBSD.org>
 <3C7CFFB7-5C84-4AC1-9A81-C718D184E87B@FreeBSD.org>
 <7D7A417E-17C3-4001-8E79-0B57636A70E1@gmail.com>
 <A4B5E0E8-93CB-4E80-9065-5D25A007B726@FreeBSD.org>
 <536807D8.9000005@freebsd.org>
 <9349EAA9-F92C-4170-A1C0-2200FD490E5F@FreeBSD.org>
 <5368162A.9000002@freebsd.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.1 cv=eojmkOZX c=1 sm=1 tr=0
 a=7NqvjVvQucbO2RlWB8PEog==:117 a=PO7r1zJSAAAA:8 a=mAlE_CWfnZ8A:10
 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=6I5d2MoRAAAA:8
 a=tdNqX4nCuuD4MD8TwAQA:9 a=qahZVF4bb32W-GKS:21 a=-Li8O5LmJa_Gh2pa:21
 a=CjuIK1q_8ugA:10 a=SV7veod9ZcQA:10
Cc: src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org,
 David Chisnall <theraven@freebsd.org>, Pedro Giffuni <pfg@freebsd.org>,
 svn-src-head@freebsd.org, Warner Losh <imp@bsdimp.com>
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for &quot;
 user&quot; and &quot; projects&quot; \)" <svn-src-all.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-all/>
List-Post: <mailto:svn-src-all@freebsd.org>
List-Help: <mailto:svn-src-all-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 06 May 2014 04:46:47 -0000

On Tue, 6 May 2014, Andrey Chernov wrote:

> On 06.05.2014 2:12, David Chisnall wrote:
>> On 5 May 2014, at 22:51, Andrey Chernov <ache@freebsd.org> wrote:
>>
>>> For standard malloc/realloc interface it is up to the caller to check
>>> n*size not overflows. You must trust caller already does such check.
>>
>> Do a search of the CVE database sometime to see how well placed that trust generally is.  Or even look at the code in question, where none of the realloc() or malloc() calls does overflow checking.
>
> I know current situation and disagree with OpenBSD way to fix it. Public
> interface assumes that caller should be trusted. Period. How well it is
> really trusted is up to the caller and should be fixed in it clearly,
> allowing human to trace the logic.
>
>>> Using calloc() to enforce it instead of caller is semantically wrong,
>>
>> Relying on a standard function to behave according to the standard is semantically wrong?

The standard behaviour is undefined.  It cannot be relied on.  From C99
(n869.txt):

%        7.20.3.1  The calloc function
% 
%        Synopsis
% 
%        [#1]
% 
%                #include <stdlib.h>
%                void *calloc(size_t nmemb, size_t size);
% 
%        Description
% 
%        [#2] The calloc function allocates space  for  an  array  of
%        nmemb  objects,  each  of  whose size is size.  The space is
%        initialized to all bits zero.238)

Oops, there is no object to begin with, so perhaps the behaviour is
defined after all.  This is unclear.  It is also unclear if objects
can have size too large to represent as a size_t.  C99 says that
sizeof(object) is the size in bytes of an object, but it also says
that the value of sizeof() is implementation-defined.  If the multiplication
overflows, then it can be argued that the behaviour is undefined (because
the object cannot exist since sizeof() is required to actually return the
size), and it can be argued that the behaviour is defined in some cases
even when the multiplication overflows (because sizeof() is only required
to return an implementation-defined value like the actual size modulo
SIZE_MAX; then the object might exist).

calloc() may have been actually useful orginally to handle the weird
second case.  In K&R1, size_t didn't exist and whether sizeof() worked
was even less clear than now.  The type of sizeof() was "an integer".
malloc() took an int arg IIRC.  malloc() is not even in the index in
K&R1.  But objects of size larger than INT_MAX were useful, and it
would be reasonable to ask for calloc() to allocate one.  K&R1 has
calloc() in the index and documents it as calloc(n, sizeof(object)),
where n and sizeof() are apparently implicit-int.  So calloc(16368, 2)
should give an object of size 32768 if possible.  sizeof(this) is
then unrepresentable as a 16-bit int.  I used arrays larger than
32768 quite often on 16-bit systems, but only with 16-bit unsigned
size_t.

> Yes. Generally it is using a function outside of its purpose. I.e. you
> can use calloc() just to check n*size and nothing else (free() result
> immediately afterwards) instead of writing just single check by
> yourself. It will be legal usage but semantically wrong and misleading.
>
>>> and especially strange when the caller is standard C library under your
>>> control.
>>
>> I don't follow this.  If libc can't rely on standards conformance from itself then other code stands no chance.

calloc() in FreeBSD is controlled too, but in 4.4BSD it just did the
multiplication blindly.  This was fixed (if it is a bug) in FreeBSD
in 2002 (by tjr).  The errno was the nondescript ENOMEM.  Now, calloc()
is sophisticated but the errno still seems to be ENOMEM.

I think calloc() should check for overflow but callers shouldn't depend
on this.

In practice, the multiplication is less likely to overflow than malloc()
is to fail, which "can't happen".  You could limit the number of
elements to something reasonable like 2**28 to ensure that the
multiplication can't overflow with an element size of 8.  The rare
program that needs to support allocating more than 2**28 elements on
32-bit systems according to user input can be more careful.  On 64-bit
systems, you can use a less modest limit.

Bruce