From: Conrad Meyer
Reply-To: cem@freebsd.org
Date: Tue, 13 Nov 2018 14:51:36 -0800
Subject: Re: Hole-punching, TRIM, etc
To: Alan Somers
Cc: freebsd-arch@freebsd.org, freebsd-fs, freebsd-current

Hi Alan,

On Tue, Nov 13, 2018 at 2:10 PM Alan Somers wrote:
>
> Hole-punching has been discussed on these lists before[1].  It basically
> means to turn a dense file into a sparse file by deallocating storage for
> some of the blocks in the middle.  There's no standard API for it.  Linux
> uses fallocate(2); Solaris and OSX add a new opcode to fcntl(2).
>
> A related concept is telling a block device that some blocks are no longer
> used.  SATA calls this "TRIM", SCSI calls it "UNMAP", NVMe calls it
> "Deallocate", ZBC and ZAC call it "Reset Write Pointer".  They all do
> basically the same thing, and it's analogous to hole-punching for regular
> files.  They are also all inaccessible from FreeBSD's userland except by
> using pass(4), which is inconvenient and protocol-specific.

GEOM devices have the DIOCGDELETE ioctl, which translates into
BIO_DELETE (which is TRIM, as I understand it).  It's available in
libgeom as g_delete() and used by hastd, newfs_nandfs, and nandtool.

> Linux has a BLKDISCARD ioctl for issuing TRIM-like commands from userland,
> but it's totally undocumented and doesn't work on regular files.
>
> I propose adding support for all of these things using the fcntl(2) API.
> Using the same syntax that Solaris defined, you would be able to punch a
> hole in a regular file or TRIM blocks from an SSD.  ZFS already supports it
> (though FreeBSD's port never did, and the code was deleted in r303763).
> Here's what I would do:
>
> 1) Add the F_FREESP command to fcntl(2).
> 2) Add a .fo_space field for struct fileops
> 3) Add a devfs_space method that implements .fo_space
> 4) Add a .d_space field to struct cdevsw
> 5) Add a g_dev_space method for GEOM that implements .d_space using
> BIO_DELETE.
> 6) Add a VOP_SPACE vop
> 7) Implement VOP_SPACE for tmpfs
> 8) Add aio_freesp(2), an asynchronous version of fcntl(F_FREESP).

Why not just add DIOCGDELETE support to the various VOP_IOCTL
implementations?  Ioctls on file objects already forward correctly
through vn_ioctl() to VOP_IOCTL for both regular files and devfs VCHR
nodes.

We can emulate the Linux API if we want to be compatible there, but I
wouldn't bother with Solaris.

Best,
Conrad
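
For reference, a minimal sketch of what a userland caller of DIOCGDELETE
could look like.  The ioctl takes an off_t[2] holding {offset, length}.
The helper names delete_range_init() and delete_range() are made up for
illustration, and the sector-alignment check reflects my understanding of
what GEOM expects rather than a documented contract.  On systems without
<sys/disk.h> the sketch just returns ENOTSUP.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#ifdef __FreeBSD__
#include <sys/disk.h>	/* DIOCGDELETE */
#endif

/*
 * Hypothetical helper: fill range[] with the {offset, length} pair that
 * DIOCGDELETE takes.  Returns 0 on success, or EINVAL if the request is
 * negative, empty, or not a multiple of sectorsize (assumed requirement).
 */
static int
delete_range_init(off_t range[2], off_t offset, off_t length, off_t sectorsize)
{
	assert(range != NULL);
	if (sectorsize <= 0 || offset < 0 || length <= 0)
		return (EINVAL);
	if (offset % sectorsize != 0 || length % sectorsize != 0)
		return (EINVAL);
	range[0] = offset;
	range[1] = length;
	return (0);
}

/*
 * Hypothetical helper: ask whatever is behind fd to discard the range.
 * Returns 0 on success or an errno value on failure.
 */
static int
delete_range(int fd, off_t offset, off_t length, off_t sectorsize)
{
	off_t range[2];
	int error;

	error = delete_range_init(range, offset, length, sectorsize);
	if (error != 0)
		return (error);
#ifdef DIOCGDELETE
	if (ioctl(fd, DIOCGDELETE, range) == -1)
		return (errno);
	return (0);
#else
	(void)fd;		/* no DIOCGDELETE on this platform */
	return (ENOTSUP);
#endif
}
```

Today this only works when fd refers to a GEOM device node.  With
DIOCGDELETE handled in the relevant VOP_IOCTL implementations, the same
call on a regular file's descriptor would be the hole-punching case.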