From: Thomas Caputi <tcaputi@datto.com>
Date: Wed, 22 Aug 2018 23:21:53 -0400
Subject: Re: Native Encryption for ZFS on FreeBSD CFT
To: mmacy@freebsd.org
Cc: Sean Fagan, asomers@freebsd.org, freebsd-current@freebsd.org, freebsd-fs@freebsd.org

> That doesn't answer the question about what happens when dedup is turned off. In that case, is the HMAC still used as the IV? If so, then watermarking attacks are still possible.

Quoting the comment from the code above: "For non-dedup blocks we derive the IV randomly". When dedup is enabled, we do leak this information, but the dedup table already leaks that information anyway. The dedup table needs to be in plaintext so that we can repair it even when keys are not loaded. This is a known and documented trade-off of using encryption + dedup.

> Only encrypting L0 blocks also leaks a lot of information. That means that, if encryption is set to anything but "off", watermarking attacks will still be possible based on the size and sparsity of a file.
> Because I believe that with any encryption mode, ZFS turns continuous runs of zeros into holes

First of all, with encryption=off, watermarking attacks are really quite easy :). The information that can be gained about a file from ZFS by looking at the raw disk is:

1) The size of the file (rounded up to the nearest sector size): Almost all applications that encrypt data will leak the approximate size of the protected payload.

2) The locations of holes within a file: ZFS does not turn runs of zeros into holes if you have compression off. However, data that is never written is maintained as a hole (i.e. if you never write any data to block 3 of a file). You are correct that technically this is a small leak of information, but we decided while designing the encryption scheme that the performance and space savings are worth it here. Is this enough information to be an attack vector? I would argue not, but if you are paranoid you could always turn compression off and fill in all the holes of your files with zeros.

3) If dedup is on, you can see which blocks have deduped against other blocks within a clone family. Encrypted dedup only works within applications that share the same master encryption key, which is essentially just snapshots and clones of snapshots. You cannot write data to one encrypted dataset and analyze the dedup tables to see if the data you wrote deduped against another dataset's data.

4) If compression + encryption is on, a CRIME attack is possible, but in almost every scenario this attack is impractical. It requires the filesystem to have the key loaded and an application that appends a secret to data controlled by the attacker. The attacker also needs root access to the running system (to read the raw disk without rebooting and unloading the encryption key) and must be able to do many iterations of writing this attacker + secret data to disk and checking the resulting plaintext.
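To make item 1 concrete, here is a minimal sketch of the rounding described there: an observer of the raw disk sees the file size only up to the next sector boundary. The function name `toy_visible_size` is made up for this illustration and is not part of ZFS. The sketch assumes a power-of-two sector size and ignores record size and other allocation details.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a ZFS function: round a logical file size up
 * to the next multiple of the sector size. This is the granularity at
 * which item 1 says the size is visible. Assumes sector_size is a
 * non-zero power of two.
 */
uint64_t
toy_visible_size(uint64_t logical_size, uint64_t sector_size)
{
	assert(sector_size != 0 && (sector_size & (sector_size - 1)) == 0);
	return ((logical_size + sector_size - 1) & ~(sector_size - 1));
}
```

Under these assumptions, a 4097-byte file and an 8192-byte file on 4096-byte sectors both appear as 8192 bytes, so only the approximate size is exposed.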

During the implementation of native ZFS encryption we evaluated these and came to the conclusion that the security risks here are easily outweighed by the usability and performance benefits. If you have any further questions about the design, feel free to email me again or take a look at the (largely diagram-based) docs on the implementation:

https://docs.google.com/presentation/d/1km-z3MVNHYwlQLY6yEC3iq-TD05eredH9Ih4umGdkJw/edit?usp=sharing

On Wed, Aug 22, 2018 at 6:39 PM Matthew Macy wrote:
>
> Hi Thomas,
>
> Alan believes that, even with dedup disabled, the ZFS native encryption support is vulnerable to watermarking attacks. I don't have enough exposure to crypto to pass any judgement and was hoping that you'd share your point of view. Thanks in advance.
>
> -M
>
> On Wed, Aug 22, 2018 at 12:42 PM Alan Somers wrote:
>>
>> Only encrypting L0 blocks also leaks a lot of information. That means that, if encryption is set to anything but "off", watermarking attacks will still be possible based on the size and sparsity of a file. Because I believe that with any encryption mode, ZFS turns continuous runs of zeros into holes. And I don't see anything in zio_crypt.c that addresses that.
>> -Alan
>>
>> On Wed, Aug 22, 2018 at 1:23 PM Sean Fagan wrote:
>>>
>>> On Aug 22, 2018, at 12:20 PM, Alan Somers wrote:
>>> > That doesn't answer the question about what happens when dedup is turned off. In that case, is the HMAC still used as the IV? If so, then watermarking attacks are still possible. If ZFS switches to a random IV when dedup is off, then it would probably be ok.
>>>
>>> From the same file:
>>>
>>> * Initialization Vector (IV):
>>> * An initialization vector for the encryption algorithms. This is used to
>>> * "tweak" the encryption algorithms so that two blocks of the same data are
>>> * encrypted into different ciphertext outputs, thus obfuscating block patterns.
>>> * The supported encryption modes (AES-GCM and AES-CCM) require that an IV is
>>> * never reused with the same encryption key. This value is stored unencrypted
>>> * and must simply be provided to the decryption function. We use a 96 bit IV
>>> * (as recommended by NIST) for all block encryption. For non-dedup blocks we
>>> * derive the IV randomly. The first 64 bits of the IV are stored in the second
>>> * word of DVA[2] and the remaining 32 bits are stored in the upper 32 bits of
>>> * blk_fill. This is safe because encrypted blocks can't use the upper 32 bits
>>> * of blk_fill. We only encrypt level 0 blocks, which normally have a fill count
>>> * of 1. The only exception is for DMU_OT_DNODE objects, where the fill count of
>>> * level 0 blocks is the number of allocated dnodes in that block. The on-disk
>>> * format supports at most 2^15 slots per L0 dnode block, because the maximum
>>> * block size is 16MB (2^24). In either case, for level 0 blocks this number
>>> * will still be smaller than UINT32_MAX so it is safe to store the IV in the
>>> * top 32 bits of blk_fill, while leaving the bottom 32 bits of the fill count
>>> * for the dnode code.
>>>
>>> Sean
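
The IV layout described in the quoted comment can be sketched roughly as follows. This is only an illustration, not the code from zio_crypt.c. The struct `toy_blkptr` and the `toy_*` functions are hypothetical stand-ins for the two block pointer words the comment mentions, and the sketch uses host byte order and ignores any byte swapping the real code may need.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	TOY_IV_LEN	12	/* 96-bit IV, as in the quoted comment */

/* Hypothetical stand-in for the two block pointer words involved. */
struct toy_blkptr {
	uint64_t dva2_word1;	/* second word of DVA[2]: IV bits 0..63 */
	uint64_t fill;		/* low 32 bits: fill count, high 32: IV tail */
};

/* Store a 96-bit IV without disturbing the low 32 bits of the fill count. */
void
toy_set_iv(struct toy_blkptr *bp, const uint8_t iv[TOY_IV_LEN])
{
	uint64_t head;
	uint32_t tail;

	assert(bp != NULL && iv != NULL);
	memcpy(&head, iv, sizeof (head));
	memcpy(&tail, iv + sizeof (head), sizeof (tail));
	bp->dva2_word1 = head;
	bp->fill = (bp->fill & 0xFFFFFFFFULL) | ((uint64_t)tail << 32);
}

/* Recover the 96-bit IV from the two words. */
void
toy_get_iv(const struct toy_blkptr *bp, uint8_t iv[TOY_IV_LEN])
{
	uint64_t head = bp->dva2_word1;
	uint32_t tail = (uint32_t)(bp->fill >> 32);

	memcpy(iv, &head, sizeof (head));
	memcpy(iv + sizeof (head), &tail, sizeof (tail));
}

/* The fill count as the dnode code would see it: the bottom 32 bits only. */
uint32_t
toy_get_fill(const struct toy_blkptr *bp)
{
	return ((uint32_t)(bp->fill & 0xFFFFFFFFULL));
}
```

The largest fill count the comment allows for an L0 dnode block is 2^15 = 32768, which is 2^24 / 2^9 if one assumes 512-byte dnode slots. That fits in the low 32 bits with room to spare, so the IV tail in the high 32 bits never collides with it.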