From owner-freebsd-current@freebsd.org  Thu Aug 23 03:22:05 2018
Return-Path: <owner-freebsd-current@freebsd.org>
Delivered-To: freebsd-current@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 76D30109F694
 for <freebsd-current@mailman.ysv.freebsd.org>;
 Thu, 23 Aug 2018 03:22:05 +0000 (UTC)
 (envelope-from tcaputi@datto.com)
Received: from mail-oi0-x232.google.com (mail-oi0-x232.google.com
 [IPv6:2607:f8b0:4003:c06::232])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G3" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 064677BF4C
 for <freebsd-current@freebsd.org>; Thu, 23 Aug 2018 03:22:04 +0000 (UTC)
 (envelope-from tcaputi@datto.com)
Received: by mail-oi0-x232.google.com with SMTP id q204-v6so6857721oig.9
 for <freebsd-current@freebsd.org>; Wed, 22 Aug 2018 20:22:04 -0700 (PDT)
MIME-Version: 1.0
References: <CAPrugNomNQQUZZNgngYRjDEVEU=_KbE2pgG4ajO1Jr4+Gov2gQ@mail.gmail.com>
 <CAPrugNpKOYe9VS6Q-Q43t4i51qsxrP0SKW76208rtX-ENWxS5g@mail.gmail.com>
 <CAOtMX2jGQWm9ZFM_0kqvEt41xrm+FTpq6JVK4iK-c20NQjisRg@mail.gmail.com>
 <AD1101E9-9A3E-41CB-B313-1723123C607B@ixsystems.com>
 <CAOtMX2gvtzKg=DJChZdcYCiuADNVm9JvhgLNJ7bmwCLArgigjw@mail.gmail.com>
 <9FDF249A-E320-4652-834E-7EEC5C4FB7CA@ixsystems.com>
 <CAOtMX2iMuLWEQV68MTcvpURacXB5wZMT8yAYySisOfnmCNn=SA@mail.gmail.com>
 <E415D5A9-DBEE-45DC-9AE2-7E50A74B8C2D@ixsystems.com>
 <CAOtMX2jaPZj1pQj2f_pzBFXCo6G2ksZ0=mQxCX0MxXnSJpEVuA@mail.gmail.com>
 <CAPrugNpstMxFJcFUyVnQOdS9EzBJMqBJ17oJdZ8px_aek4ghEg@mail.gmail.com>
In-Reply-To: <CAPrugNpstMxFJcFUyVnQOdS9EzBJMqBJ17oJdZ8px_aek4ghEg@mail.gmail.com>
From: Thomas Caputi <tcaputi@datto.com>
Date: Wed, 22 Aug 2018 23:21:53 -0400
Message-ID: <CAF2oFs-68Js-F0=r+OCDZWQgYThSJkk1yAj+OT=dyq_LjgZjbg@mail.gmail.com>
Subject: Re: Native Encryption for ZFS on FreeBSD CFT
To: mmacy@freebsd.org
Cc: Sean Fagan <sef@ixsystems.com>, asomers@freebsd.org,
 freebsd-current@freebsd.org, freebsd-fs@freebsd.org
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Mailman-Approved-At: Thu, 23 Aug 2018 10:42:01 +0000
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.27
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
 <freebsd-current.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current/>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
 <mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 23 Aug 2018 03:22:05 -0000

> That doesn't answer the question about what happens when dedup is
> turned off.  In that case, is the HMAC still used as the IV?  If so,
> then watermarking attacks are still possible.

Quoting the comment from the code above: "For non-dedup blocks we
derive the IV randomly". When dedup is enabled, we do leak which
blocks are identical, but the dedup table already leaks that
information anyway. The dedup table needs to be in plaintext so that
we can repair it even when keys are not loaded. This is a known and
documented trade-off of using encryption + dedup.
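
As a rough illustration of that distinction (a sketch only, not the
zio_crypt.c logic): the HMAC construction, the key handling, the
truncation to 12 bytes and the byte order used for packing are all
assumptions, and derive_iv / pack_blk_fill are hypothetical names.

```python
import hashlib
import hmac
import os

IV_LEN = 12  # 96-bit IV, as stated in the quoted code comment


def derive_iv(plaintext, dedup, hmac_key=b""):
    """Return a 96-bit IV for one block (illustration only)."""
    if dedup:
        # Assumption: a deterministic IV cut from a keyed HMAC of the
        # plaintext, so equal blocks under one key encrypt identically.
        digest = hmac.new(hmac_key, plaintext, hashlib.sha512).digest()
        return digest[:IV_LEN]
    # Non-dedup blocks: the IV is random.
    return os.urandom(IV_LEN)


def pack_blk_fill(iv, fill_count):
    """Place the last 32 bits of the IV above a 32-bit fill count."""
    assert len(iv) == IV_LEN and 0 <= fill_count < 2**32
    iv_tail = int.from_bytes(iv[8:], "big")  # byte order is an assumption
    return (iv_tail << 32) | fill_count
```

With dedup, equal plaintext under the same key gives an equal IV, which
is what lets blocks dedup and is also the leak discussed above; without
dedup, every write gets a fresh IV.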

> Only encrypting L0 blocks also leaks a lot of information.  That means
> that, if encryption is set to anything but "off", watermarking attacks
> will still be possible based on the size and sparsity of a file.
> Because I believe that with any encryption mode, ZFS turns continuous
> runs of zeros into holes

First of all, with encryption=off, watermarking attacks are really
quite easy :). The information that can be gained about a file from
ZFS by looking at the raw disk is:

1) The size of the file (rounded up to the nearest sector size):
Almost all applications that encrypt data will leak the approximate
size of the protected payload.

2) The locations of holes within a file: ZFS does not turn runs of
zeros into holes if you have compression off. However, data that is
never written is maintained as a hole (i.e. if you never write any data
to block 3 of a file). You are correct that technically this is a
small leak of information, but we decided while designing the
encryption scheme that the performance and space savings are worth it
here. Is this enough information to be an attack vector? I would argue
not, but if you are paranoid you could always turn compression off and
fill in all the holes of your files with zeros.

3) If dedup is on, you can see which blocks have deduped against other
blocks within a clone family. Encrypted dedup only works within
applications that share the same master encryption key, which is
essentially just snapshots and clones of snapshots. You cannot write
data to one encrypted dataset and analyze the dedup tables to see if
the data you wrote deduped against another dataset's data.

4) If compression + encryption is on, a CRIME attack is possible, but
in almost every scenario this attack is impractical. It requires the
filesystem to have the key loaded and an application that appends a
secret to data controlled by the attacker. The attacker also needs
root access to the running system (to read the raw disk without
rebooting and unloading the encryption key) and must be able to do
many iterations of writing this attacker + secret data to disk and
checking the size of the resulting ciphertext.
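
The compression side channel behind item 4 can be shown with a toy
example. This is a sketch using Python's zlib rather than anything
ZFS-specific; SECRET, the two guesses and stored_length are made up
for illustration. Encryption hides the bytes but not the length of the
compressed record, so a guess that matches the secret compresses
better.

```python
import zlib

SECRET = b"session_token=4f9a1c7e2b"  # hypothetical application secret


def stored_length(attacker_data):
    """Length of the compressed record; encryption would not hide it."""
    return len(zlib.compress(attacker_data + SECRET))


right = stored_length(b"session_token=4f9a1c7e2b")  # guess matches
wrong = stored_length(b"session_token=zq8x5kw3mp")  # guess does not
print(right, wrong)
```

The matching guess yields the shorter record. A real attack needs many
such write-and-measure rounds, which is why the preconditions listed in
item 4 matter.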

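Items 1 and 2 amount to the following view for someone reading the raw
disk. A minimal sketch: the 4096-byte sector size, the function name
and the example file are assumptions, not ZFS values.

```python
SECTOR = 4096  # hypothetical sector size in bytes


def observable_layout(file_size, block_size, written_blocks):
    """Return what can be inferred without the key: size and holes."""
    # Item 1: the size is visible, rounded up to a whole sector.
    rounded = -(-file_size // SECTOR) * SECTOR
    # Item 2: blocks that were never written show up as holes.
    n_blocks = -(-file_size // block_size)
    holes = [i for i in range(n_blocks) if i not in written_blocks]
    return rounded, holes


# A file just under five blocks long where block 3 was never written.
print(observable_layout(20380, 4096, {0, 1, 2, 4}))
```

Nothing in this view depends on the contents of the written blocks.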

During the implementation of native ZFS encryption we evaluated these
leaks and came to the conclusion that the security risks here are
easily outweighed by the usability and performance benefits. If you
have any further questions about the design, feel free to email me
again or take a look at the (largely diagram-based) docs on the
implementation:
https://docs.google.com/presentation/d/1km-z3MVNHYwlQLY6yEC3iq-TD05eredH9Ih4umGdkJw/edit?usp=sharing

On Wed, Aug 22, 2018 at 6:39 PM Matthew Macy <mmacy@freebsd.org> wrote:
>
> Hi Thomas,
>
> Alan believes that, even with dedup disabled, the ZFS native encryption
> support is vulnerable to watermarking attacks. I don't have enough
> exposure to crypto to pass any judgement and was hoping that you'd
> share your point of view. Thanks in advance.
>
> -M
>
>
>
> On Wed, Aug 22, 2018 at 12:42 PM Alan Somers <asomers@freebsd.org> wrote:
>>
>> Only encrypting L0 blocks also leaks a lot of information.  That means
>> that, if encryption is set to anything but "off", watermarking attacks
>> will still be possible based on the size and sparsity of a file.
>> Because I believe that with any encryption mode, ZFS turns continuous
>> runs of zeros into holes.  And I don't see anything in zio_crypt.c
>> that addresses that.
>> -Alan
>>
>> On Wed, Aug 22, 2018 at 1:23 PM Sean Fagan <sef@ixsystems.com> wrote:
>>>
>>> On Aug 22, 2018, at 12:20 PM, Alan Somers <asomers@freebsd.org> wrote:
>>> > That doesn't answer the question about what happens when dedup is
>>> > turned off.  In that case, is the HMAC still used as the IV?  If so,
>>> > then watermarking attacks are still possible.  If ZFS switches to a
>>> > random IV when dedup is off, then it would probably be ok.
>>>
>>> From the same file:
>>>
>>>  * Initialization Vector (IV):
>>>  * An initialization vector for the encryption algorithms. This is used to
>>>  * "tweak" the encryption algorithms so that two blocks of the same data are
>>>  * encrypted into different ciphertext outputs, thus obfuscating block patterns.
>>>  * The supported encryption modes (AES-GCM and AES-CCM) require that an IV is
>>>  * never reused with the same encryption key. This value is stored unencrypted
>>>  * and must simply be provided to the decryption function. We use a 96 bit IV
>>>  * (as recommended by NIST) for all block encryption. For non-dedup blocks we
>>>  * derive the IV randomly. The first 64 bits of the IV are stored in the second
>>>  * word of DVA[2] and the remaining 32 bits are stored in the upper 32 bits of
>>>  * blk_fill. This is safe because encrypted blocks can't use the upper 32 bits
>>>  * of blk_fill. We only encrypt level 0 blocks, which normally have a fill count
>>>  * of 1. The only exception is for DMU_OT_DNODE objects, where the fill count of
>>>  * level 0 blocks is the number of allocated dnodes in that block. The on-disk
>>>  * format supports at most 2^15 slots per L0 dnode block, because the maximum
>>>  * block size is 16MB (2^24). In either case, for level 0 blocks this number
>>>  * will still be smaller than UINT32_MAX so it is safe to store the IV in the
>>>  * top 32 bits of blk_fill, while leaving the bottom 32 bits of the fill count
>>>  * for the dnode code.
>>>
>>> Sean
>>>
>>>