From: Mariusz Gromada
Date: Mon, 24 Sep 2012 23:56:51 +0200
To: Pawel Jakub Dawidek
Cc: Ben Laurie, freebsd-security@freebsd.org, RW, Jonathan Anderson,
    John Baldwin
Subject: Re: Collecting entropy from device_attach() times.
Message-ID: <5060D723.6020305@gmail.com>
In-Reply-To: <20120923151706.GN1454@garage.freebsd.pl>

On 2012-09-23 17:17, Pawel Jakub Dawidek wrote:
> On Sun, Sep 23, 2012 at 02:37:48AM +0200, Mariusz Gromada wrote:
>> On 2012-09-22 21:53, Pawel Jakub Dawidek wrote:
>>> Mariusz, can you confirm my findings?
>>
>> Pawel,
>>
>> Your conclusions can be easily confirmed by shape analysis of the EDF.
>> Usually the maximum quantile difference (called the D-statistic) gives
>> you a kind of overview, the function shape gives you a strong feeling,
>> and the p-value gives you a formal proof.
>>
>> D-statistic values (your data):
>>
>>  6bit:  0.33%
>>  7bit:  0.29%
>>  8bit:  0.27%
>>  9bit:  0.21%
>> 10bit:  6.34%
>> 11bit: 19.07%
>> 12bit: 54.80%
>>
>> What I would say: increasing the number of bits from 6 to 9 does not
>> affect distribution "uniformity"; reaching the tenth bit results in a
>> sudden increase in the difference measure - the more bits, the more
>> difference is observed. Distribution shape analysis for the 10th bit
>> shows a non-linear function. There is no "randomness" in the quantile
>> difference curve - the chart shows a complete lack of noise (pure
>> functional relation). These are very strong indicators that starting
>> from the 10th bit the distribution changed and is no longer uniform.
>>
>> To formally confirm the above conclusion for, e.g., a 5% significance
>> level, which means that the confidence level is 95%, I need some extra
>> data regarding sample sizes. Please pass me the number of collected
>> observations in each 6-12 bit experiment.
>
> Total number of observations was 162833.
>

Ok, finally I have some formal results.

To be completely honest I need to point out that, in fact, we have
discrete data (for example the integers 0, 1, ..., 63, not continuous
numbers spread between 0 and 63). That is why I am going to use the
two-sample Kolmogorov-Smirnov test.

The methodology is simple:
- Pawel's data will be called the empirical data.
- The theoretical data will be generated as a sequence of unique
  integers from 0 to 2**n - 1, where n is the number of bits.
  Assumption: each number appears in the theoretical data exactly once,
  representing an ideal uniform distribution.
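
For illustration only, here is a minimal R sketch of what the
D-statistic measures under this methodology: the largest vertical gap
between the EDF of a sample and the EDF of the ideal sequence. The
helper name ks_d_vs_ideal and the synthetic samples are assumptions
made up for the example; they are not Pawel's data.

```r
# Hypothetical helper: two-sample KS statistic D between an observed
# sample and the ideal sequence 0 .. 2^n - 1 (each value exactly once).
ks_d_vs_ideal <- function(observed, n_bits) {
  ideal <- 0:(2^n_bits - 1)
  f_obs <- ecdf(observed)    # EDF of the observed sample
  f_ideal <- ecdf(ideal)     # rises by 1/2^n at every integer
  # Both EDFs are step functions that jump only at pooled sample points,
  # so the largest gap is attained at one of those points.
  grid <- sort(unique(c(observed, ideal)))
  max(abs(f_obs(grid) - f_ideal(grid)))
}

set.seed(1)  # synthetic stand-in data, NOT the measured attach times
uniform_like <- sample(0:63, 10000, replace = TRUE)
skewed <- pmin(uniform_like, 31)  # every value above 31 is clamped to 31

d_uniform <- ks_d_vs_ideal(uniform_like, 6)  # small gap expected
d_skewed <- ks_d_vs_ideal(skewed, 6)         # gap of 0.5 at x = 31
```

The clamped sample puts all of its mass at or below 31, where the ideal
EDF has reached only 32/64, so its D is exactly 0.5. For the same
inputs the value should agree with the statistic reported by ks.test().
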

Calculations will be done in R.

Loading empirical data from files:

> e6 = read.table("E:\\pawel\\dhr2_6bit_sorted.txt")
> e7 = read.table("E:\\pawel\\dhr2_7bit_sorted.txt")
> e8 = read.table("E:\\pawel\\dhr2_8bit_sorted.txt")
> e9 = read.table("E:\\pawel\\dhr2_9bit_sorted.txt")
> e10 = read.table("E:\\pawel\\dhr2_10bit_sorted.txt")
> e11 = read.table("E:\\pawel\\dhr2_11bit_sorted.txt")
> e12 = read.table("E:\\pawel\\dhr2_12bit_sorted.txt")

Generating ideal theoretical data:

> t6 = c(0:(2**6-1))
> t7 = c(0:(2**7-1))
> t8 = c(0:(2**8-1))
> t9 = c(0:(2**9-1))
> t10 = c(0:(2**10-1))
> t11 = c(0:(2**11-1))
> t12 = c(0:(2**12-1))

Performing KS tests:

> ks.test(e6, t6)
D = 0.0032, p-value = 1

> ks.test(e7, t7)
D = 0.0029, p-value = 1

> ks.test(e8, t8)
D = 0.0027, p-value = 1

> ks.test(e9, t9)
D = 0.0022, p-value = 1

> ks.test(e10, t10)
D = 0.0634, p-value = 0.0005562

> ks.test(e11, t11)
D = 0.1907, p-value < 2.2e-16

> ks.test(e12, t12)
D = 0.5479, p-value < 2.2e-16

As you can see, the D-statistics are almost the same as those
calculated by Pawel (allowing for rounding). The p-values are very
interesting because of the very high number of observations generated
by Pawel. Between 6 and 9 bits the estimated p-values are equal to 1,
which means that the null hypothesis stating that the compared
distributions are equal cannot be rejected at any significance level.
Conclusion: the data are fully consistent with a uniform distribution,
so they look random.

Additionally, starting from 10 bits we can observe a dramatic decrease
of the p-value (from 100% to ca. 0.06%, and much less for 11-12 bits).
Such a low p-value means that the null hypothesis stating that the
compared distributions are equal has to be rejected. Conclusion: the
distribution is not uniform, so the data are not random.

I did the same comparison for the previous real device attach data
(2081 obs.).
R code and the results are below:

> e16 = read.table("E:\\pawel\\device_attach_16bit.log")
> t16 = c(0:(2**16-1))
> ks.test(e16, t16)
D = 0.0178, p-value = 0.5422

Again, the D-statistic and p-value are almost the same as previously
calculated "manually". The p-value is high (not as high as in the 6-12
bit tests, but consider the much lower number of observations: 2081 vs
162833), which strongly suggests that you have captured real 16-bit
entropy!

Regards,
Mariusz