From owner-freebsd-arch@freebsd.org Tue Nov 13 22:09:46 2018
From: Alan Somers
Date: Tue, 13 Nov 2018 15:09:24 -0700
Subject: Hole-punching, TRIM, etc
To: freebsd-arch@freebsd.org, freebsd-fs, FreeBSD CURRENT

Hole-punching has been discussed on these lists before[1]. It basically
means to turn a dense file into a sparse file by deallocating storage for
some of the blocks in the middle. There's no standard API for it. Linux
uses fallocate(2); Solaris and OSX add a new opcode to fcntl(2).

A related concept is telling a block device that some blocks are no longer
used. SATA calls this "TRIM", SCSI calls it "UNMAP", NVMe calls it
"Deallocate", ZBC and ZAC call it "Reset Write Pointer". They all do
basically the same thing, and it's analogous to hole-punching for regular
files. They are also all inaccessible from FreeBSD's userland except by
using pass(4), which is inconvenient and protocol-specific.

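As an illustrative aside (not part of the original message): the difference
between a dense and a sparse file can be observed from userland by comparing
a file's logical size with the space allocated to it. The sketch below is
only a rough illustration. The helper names create_sparse_tmpfile() and
file_usage() are hypothetical, the /tmp location is an assumption, and
st_blocks is assumed to count 512-byte units as it does on FreeBSD and Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create an unlinked temporary file whose logical size is `size` bytes
 * without writing any data. Most filesystems leave such a file sparse.
 * The /tmp location is an assumption made for this sketch.
 */
static int
create_sparse_tmpfile(off_t size)
{
    char path[] = "/tmp/sparse.XXXXXX";
    int fd;

    fd = mkstemp(path);
    if (fd == -1)
        return (-1);
    (void)unlink(path);
    if (ftruncate(fd, size) == -1) {
        (void)close(fd);
        return (-1);
    }
    return (fd);
}

/*
 * Report the logical size of a file and the bytes allocated to it.
 * Assumes st_blocks counts 512-byte units, as on FreeBSD and Linux.
 */
static int
file_usage(int fd, off_t *logical, off_t *allocated)
{
    struct stat sb;

    assert(logical != NULL && allocated != NULL);
    if (fstat(fd, &sb) == -1)
        return (-1);
    *logical = sb.st_size;
    *allocated = (off_t)sb.st_blocks * 512;
    return (0);
}
```

On a filesystem that supports holes, the allocated figure would be expected
to drop after a successful hole-punch while the logical size stays the same;
how much is reclaimed depends on the filesystem.
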
Linux has a BLKDISCARD ioctl for issuing TRIM-like commands from userland,
but it's totally undocumented and doesn't work on regular files.

I propose adding support for all of these things using the fcntl(2) API.
Using the same syntax that Solaris defined, you would be able to punch a
hole in a regular file or TRIM blocks from an SSD. ZFS already supports it
(though FreeBSD's port never did, and the code was deleted in r303763).
Here's what I would do:

1) Add the F_FREESP command to fcntl(2).
2) Add a .fo_space field for struct fileops
3) Add a devfs_space method that implements .fo_space
4) Add a .d_space field to struct cdevsw
5) Add a g_dev_space method for GEOM that implements .d_space using
   BIO_DELETE.
6) Add a VOP_SPACE vop
7) Implement VOP_SPACE for tmpfs
8) Add aio_freesp(2), an asynchronous version of fcntl(F_FREESP).

The greatest beneficiaries of this work would be type 2 hypervisors like
QEMU and VirtualBox with guests that use TRIM, and userland filesystems
such as fusefs-ext2 and fusefs-exfat. High-performance storage systems
using SPDK would also benefit. The last item, aio_freesp(2), may seem
unnecessary but it would really benefit my application.

Questions, objections, flames?

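As an illustrative aside (not part of the original message): the Solaris
syntax referred to above describes the range to free with a struct flock
passed to fcntl(2). The sketch below shows roughly what a caller might look
like. It is an assumption about how the proposed interface would be used,
not an existing FreeBSD API; the helper names freesp_range() and
punch_hole() are hypothetical, and on systems whose headers do not define
F_FREESP the call simply fails with ENOTSUP.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Describe the byte range [offset, offset + length) as a struct flock,
 * which is the argument type Solaris uses for F_FREESP.
 */
static struct flock
freesp_range(off_t offset, off_t length)
{
    struct flock fl;

    assert(offset >= 0 && length >= 0);
    memset(&fl, 0, sizeof(fl));
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = length;  /* on Solaris, 0 means "to end of file" */
    return (fl);
}

/*
 * Ask the kernel to deallocate the given range of an open file.
 * Only does real work where the headers provide F_FREESP.
 */
static int
punch_hole(int fd, off_t offset, off_t length)
{
#ifdef F_FREESP
    struct flock fl = freesp_range(offset, length);

    return (fcntl(fd, F_FREESP, &fl));
#else
    (void)fd;
    (void)offset;
    (void)length;
    errno = ENOTSUP;
    return (-1);
#endif
}
```

Under the proposal, the same call on a descriptor for a disk device would
presumably be turned into BIO_DELETE rather than a filesystem hole.
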
-Alan

[1] https://lists.freebsd.org/pipermail/freebsd-fs/2011-March/010881.html

From owner-freebsd-arch@freebsd.org Tue Nov 13 22:50:49 2018
From: Warner Losh
Date: Tue, 13 Nov 2018 15:50:35 -0700
Subject: Re: Hole-punching, TRIM, etc
To: Alan Somers
Cc: "freebsd-arch@freebsd.org", FreeBSD FS, FreeBSD Current

On Tue, Nov 13, 2018 at 3:10 PM Alan Somers wrote:

> Hole-punching has been discussed on these lists before[1]. It basically
> means to turn a dense file into a sparse file by deallocating storage for
> some of the blocks in the middle. There's no standard API for it. Linux
> uses fallocate(2); Solaris and OSX add a new opcode to fcntl(2).
>
> A related concept is telling a block device that some blocks are no longer
> used. SATA calls this "TRIM", SCSI calls it "UNMAP", NVMe calls it
> "Deallocate", ZBC and ZAC call it "Reset Write Pointer". They all do
> basically the same thing, and it's analogous to hole-punching for regular
> files. They are also all inaccessible from FreeBSD's userland except by
> using pass(4), which is inconvenient and protocol-specific.
>
> Linux has a BLKDISCARD ioctl for issuing TRIM-like commands from userland,
> but it's totally undocumented and doesn't work on regular files.
>
> I propose adding support for all of these things using the fcntl(2) API.
> Using the same syntax that Solaris defined, you would be able to punch a
> hole in a regular file or TRIM blocks from an SSD. ZFS already supports it
> (though FreeBSD's port never did, and the code was deleted in r303763).
> Here's what I would do:
>
> 1) Add the F_FREESP command to fcntl(2).
> 2) Add a .fo_space field for struct fileops
> 3) Add a devfs_space method that implements .fo_space
> 4) Add a .d_space field to struct cdevsw
> 5) Add a g_dev_space method for GEOM that implements .d_space using
> BIO_DELETE.
> 6) Add a VOP_SPACE vop
> 7) Implement VOP_SPACE for tmpfs
> 8) Add aio_freesp(2), an asynchronous version of fcntl(F_FREESP).
>
> The greatest beneficiaries of this work would be type 2 hypervisors like
> QEMU and VirtualBox with guests that use TRIM, and userland filesystems
> such as fusefs-ext2 and fusefs-exfat. High-performance storage systems
> using SPDK would also benefit. The last item, aio_freesp(2), may seem
> unnecessary but it would really benefit my application.
>
> Questions, objections, flames?
>

So the fcntl would deallocate blocks from a filesystem only. The
filesystem may issue BIO_DELETE as a result, but that's up to the
filesystem, correct?

On a raw device it would be translated into a BIO_DELETE command directly,
correct?

Warner

From owner-freebsd-arch@freebsd.org Tue Nov 13 22:51:55 2018
From: Conrad Meyer
Reply-To: cem@freebsd.org
Date: Tue, 13 Nov 2018 14:51:36 -0800
Subject: Re: Hole-punching, TRIM, etc
To: Alan Somers
Cc: "freebsd-arch@freebsd.org", freebsd-fs, freebsd-current

Hi Alan,

On Tue, Nov 13, 2018 at 2:10 PM Alan Somers wrote:
>
> Hole-punching has been discussed on these lists before[1]. It basically
> means to turn a dense file into a sparse file by deallocating storage for
> some of the blocks in the middle. There's no standard API for it. Linux
> uses fallocate(2); Solaris and OSX add a new opcode to fcntl(2).
>
> A related concept is telling a block device that some blocks are no longer
> used. SATA calls this "TRIM", SCSI calls it "UNMAP", NVMe calls it
> "Deallocate", ZBC and ZAC call it "Reset Write Pointer". They all do
> basically the same thing, and it's analogous to hole-punching for regular
> files. They are also all inaccessible from FreeBSD's userland except by
> using pass(4), which is inconvenient and protocol-specific.

Geom devices have the DIOCGDELETE ioctl, which translates into
BIO_DELETE (which is TRIM, as I understand it). It's available in
libgeom as g_delete() and used by hastd, newfs_nandfs, and nandtool.

> Linux has a BLKDISCARD ioctl for issuing TRIM-like commands from userland,
> but it's totally undocumented and doesn't work on regular files.
>
> I propose adding support for all of these things using the fcntl(2) API.
> Using the same syntax that Solaris defined, you would be able to punch a
> hole in a regular file or TRIM blocks from an SSD.
> ZFS already supports it
> (though FreeBSD's port never did, and the code was deleted in r303763).
> Here's what I would do:
>
> 1) Add the F_FREESP command to fcntl(2).
> 2) Add a .fo_space field for struct fileops
> 3) Add a devfs_space method that implements .fo_space
> 4) Add a .d_space field to struct cdevsw
> 5) Add a g_dev_space method for GEOM that implements .d_space using
> BIO_DELETE.
> 6) Add a VOP_SPACE vop
> 7) Implement VOP_SPACE for tmpfs
> 8) Add aio_freesp(2), an asynchronous version of fcntl(F_FREESP).

Why not just add DIOCGDELETE support to various VOP_IOCTL
implementations? The file objects forward correctly through vn_ioctl
to VOP_IOCTL for both regular files and devfs VCHR nodes.

We can emulate the Linux API if we want to be compatible there, but I
wouldn't bother with Solaris.

Best,
Conrad

From owner-freebsd-arch@freebsd.org Tue Nov 13 22:52:59 2018
From: Alan Somers
Date: Tue, 13 Nov 2018 15:52:39 -0700
Subject: Re: Hole-punching, TRIM, etc
To: Warner Losh
Cc: freebsd-arch@freebsd.org, freebsd-fs, FreeBSD CURRENT

On Tue, Nov 13, 2018 at 3:51 PM Warner Losh wrote:

>
>
> On Tue, Nov 13, 2018 at 3:10 PM Alan Somers wrote:
>
>> Hole-punching has been discussed on these lists before[1]. It basically
>> means to turn a dense file into a sparse file by deallocating storage for
>> some of the blocks in the middle. There's no standard API for it. Linux
>> uses fallocate(2); Solaris and OSX add a new opcode to fcntl(2).
>>
>> A related concept is telling a block device that some blocks are no longer
>> used. SATA calls this "TRIM", SCSI calls it "UNMAP", NVMe calls it
>> "Deallocate", ZBC and ZAC call it "Reset Write Pointer". They all do
>> basically the same thing, and it's analogous to hole-punching for regular
>> files. They are also all inaccessible from FreeBSD's userland except by
>> using pass(4), which is inconvenient and protocol-specific.
>>
>> Linux has a BLKDISCARD ioctl for issuing TRIM-like commands from userland,
>> but it's totally undocumented and doesn't work on regular files.
>>
>> I propose adding support for all of these things using the fcntl(2) API.
>> Using the same syntax that Solaris defined, you would be able to punch a
>> hole in a regular file or TRIM blocks from an SSD. ZFS already supports it
>> (though FreeBSD's port never did, and the code was deleted in r303763).
>> Here's what I would do:
>>
>> 1) Add the F_FREESP command to fcntl(2).
>> 2) Add a .fo_space field for struct fileops
>> 3) Add a devfs_space method that implements .fo_space
>> 4) Add a .d_space field to struct cdevsw
>> 5) Add a g_dev_space method for GEOM that implements .d_space using
>> BIO_DELETE.
>> 6) Add a VOP_SPACE vop
>> 7) Implement VOP_SPACE for tmpfs
>> 8) Add aio_freesp(2), an asynchronous version of fcntl(F_FREESP).
>>
>> The greatest beneficiaries of this work would be type 2 hypervisors like
>> QEMU and VirtualBox with guests that use TRIM, and userland filesystems
>> such as fusefs-ext2 and fusefs-exfat. High-performance storage systems
>> using SPDK would also benefit. The last item, aio_freesp(2), may seem
>> unnecessary but it would really benefit my application.
>>
>> Questions, objections, flames?
>>
>
> So the fcntl would deallocate blocks from a filesystem only. The
> filesystem may issue BIO_DELETE as a result, but that's up to the
> filesystem, correct?
>

Correct.

>
> On a raw device it would be translated into a BIO_DELETE command directly,
> correct?
>

Correct, modulo edge cases.

>
> Warner
>

From owner-freebsd-arch@freebsd.org Tue Nov 13 22:58:48 2018
From: Alan Somers
Date: Tue, 13 Nov 2018 15:58:26 -0700
Subject: Re: Hole-punching, TRIM, etc
To: "Conrad E. Meyer"
Cc: freebsd-arch@freebsd.org, freebsd-fs, FreeBSD CURRENT

On Tue, Nov 13, 2018 at 3:51 PM Conrad Meyer wrote:

> Hi Alan,
>
> On Tue, Nov 13, 2018 at 2:10 PM Alan Somers wrote:
> >
> > Hole-punching has been discussed on these lists before[1]. It basically
> > means to turn a dense file into a sparse file by deallocating storage for
> > some of the blocks in the middle. There's no standard API for it. Linux
> > uses fallocate(2); Solaris and OSX add a new opcode to fcntl(2).
> >
> > A related concept is telling a block device that some blocks are no longer
> > used. SATA calls this "TRIM", SCSI calls it "UNMAP", NVMe calls it
> > "Deallocate", ZBC and ZAC call it "Reset Write Pointer". They all do
> > basically the same thing, and it's analogous to hole-punching for regular
> > files. They are also all inaccessible from FreeBSD's userland except by
> > using pass(4), which is inconvenient and protocol-specific.
>
> Geom devices have the DIOCGDELETE ioctl, which translates into
> BIO_DELETE (which is TRIM, as I understand it). It's available in
> libgeom as g_delete() and used by hastd, newfs_nandfs, and nandtool.
>

Ahh, I thought there must be such a thing, but I couldn't find it.

>
> > Linux has a BLKDISCARD ioctl for issuing TRIM-like commands from userland,
> > but it's totally undocumented and doesn't work on regular files.
> >
> > I propose adding support for all of these things using the fcntl(2) API.
> > Using the same syntax that Solaris defined, you would be able to punch a
> > hole in a regular file or TRIM blocks from an SSD. ZFS already supports it
> > (though FreeBSD's port never did, and the code was deleted in r303763).
> > Here's what I would do:
> >
> > 1) Add the F_FREESP command to fcntl(2).
> > 2) Add a .fo_space field for struct fileops
> > 3) Add a devfs_space method that implements .fo_space
> > 4) Add a .d_space field to struct cdevsw
> > 5) Add a g_dev_space method for GEOM that implements .d_space using
> > BIO_DELETE.
> > 6) Add a VOP_SPACE vop
> > 7) Implement VOP_SPACE for tmpfs
> > 8) Add aio_freesp(2), an asynchronous version of fcntl(F_FREESP).
>
> Why not just add DIOCGDELETE support to various VOP_IOCTL
> implementations? The file objects forward correctly through vn_ioctl
> to VOP_IOCTL for both regular files and devfs VCHR nodes.
>
> We can emulate the Linux API if we want to be compatible there, but I
> wouldn't bother with Solaris.
>

I prefer the Solaris API only because it doesn't require adding another
syscall, and because Linux's fallocate(2) does a whole bunch of other
things besides hole-punching.

What about an asynchronous version? ioctl(2) is still synchronous. Do you
see any better way to hole-punch/TRIM asynchronously than with aio?

>
> Best,
> Conrad
>

From owner-freebsd-arch@freebsd.org Tue Nov 13 22:58:57 2018
From: Warner Losh
Date: Tue, 13 Nov 2018 15:58:45 -0700
Subject: Re: Hole-punching, TRIM, etc
To: "Conrad E. Meyer"
Cc: Alan Somers, FreeBSD FS, FreeBSD Current, "freebsd-arch@freebsd.org"

On Tue, Nov 13, 2018 at 3:52 PM Conrad Meyer wrote:

> Geom devices have the DIOCGDELETE ioctl, which translates into
> BIO_DELETE (which is TRIM, as I understand it).
>

Correct. TRIM is both the catch-all term people use and the name of a
specific DSM (data set management) command in the ATA command set. All
FLASH technologies have it (though what it means under the covers varies a
bit). Thin provisioned resources like in VMs also have it.

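As an illustrative aside (not part of the original message): from userland,
DIOCGDELETE takes a two-element off_t array holding the byte offset and the
length, and GEOM is expected to reject ranges that are not multiples of the
provider's sector size. The sketch below only illustrates that calling
convention. The helper names fill_delete_range() and delete_range() are
hypothetical, and on systems without <sys/disk.h> the call simply fails
with ENOTSUP.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#if defined(__FreeBSD__)
#include <sys/disk.h>   /* DIOCGDELETE, DIOCGSECTORSIZE */
#endif

/*
 * Validate a byte range and store it in the {offset, length} array that
 * DIOCGDELETE expects. Both values must be multiples of the sector size.
 */
static int
fill_delete_range(off_t offset, off_t length, off_t sectorsize,
    off_t range[2])
{
    assert(range != NULL);
    if (sectorsize <= 0 || offset < 0 || length <= 0 ||
        offset % sectorsize != 0 || length % sectorsize != 0) {
        errno = EINVAL;
        return (-1);
    }
    range[0] = offset;
    range[1] = length;
    return (0);
}

/*
 * Ask the device behind fd to discard the given byte range.
 * Only does real work where DIOCGDELETE is available.
 */
static int
delete_range(int fd, off_t offset, off_t length)
{
#if defined(DIOCGDELETE) && defined(DIOCGSECTORSIZE)
    unsigned int sectorsize;
    off_t range[2];

    if (ioctl(fd, DIOCGSECTORSIZE, &sectorsize) == -1)
        return (-1);
    if (fill_delete_range(offset, length, (off_t)sectorsize, range) == -1)
        return (-1);
    return (ioctl(fd, DIOCGDELETE, range));
#else
    (void)fd;
    (void)offset;
    (void)length;
    errno = ENOTSUP;
    return (-1);
#endif
}
```

Discarding destroys the data in the range, so a caller would normally only
issue this against blocks it knows to be unused.
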
Warner

From owner-freebsd-arch@freebsd.org Tue Nov 13 22:59:46 2018
To: Warner Losh
cc: Alan Somers, FreeBSD FS, FreeBSD Current, "freebsd-arch@freebsd.org"
Subject: Re: Hole-punching, TRIM, etc
From: "Poul-Henning Kamp"
Date: Tue, 13 Nov 2018 22:59:44 +0000
Message-ID: <4235.1542149984@critter.freebsd.dk>

--------
In message , Warner Losh writes:

>On a raw device it would be translated into a BIO_DELETE command directly,
>correct?

We already have ioctl(DIOCGDELETE) for that. newfs(8) uses it.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-arch@freebsd.org Tue Nov 13 23:09:05 2018
From: Conrad Meyer
Reply-To: cem@freebsd.org
Date: Tue, 13 Nov 2018 15:08:51 -0800
Subject: Re: Hole-punching, TRIM, etc
To: Alan Somers
Cc: "freebsd-arch@freebsd.org", freebsd-fs, freebsd-current

On Tue, Nov 13, 2018 at 2:59 PM Alan Somers wrote:
>
> On Tue, Nov 13, 2018 at 3:51 PM Conrad Meyer wrote:
>>
>> On Tue, Nov 13, 2018 at 2:10 PM Alan Somers wrote:
>> > ...
>> > 8) Add aio_freesp(2), an asynchronous version of fcntl(F_FREESP).
>>
>> Why not just add DIOCGDELETE support to various VOP_IOCTL
>> implementations? The file objects forward correctly through vn_ioctl
>> to VOP_IOCTL for both regular files and devfs VCHR nodes.
>>
>> We can emulate the Linux API if we want to be compatible there, but I
>> wouldn't bother with Solaris.
>
> The only reason that I prefer the Solaris API is because it doesn't require adding another syscall, and because Linux's fallocate(2) does a whole bunch of other things besides hole-punching.

I am imagining that if we went this route, we would implement Linux fallocate as a library shim around the native FreeBSD ioctl (or whatever) rather than an independent system call. This would be for API compatibility, not ABI compatibility.
But Linux compat can be set aside for now, I think -- it's a secondary concern.

> What about an asynchronous version? ioctl(2) is still synchronous. Do you see any better way to hole-punch/TRIM asynchronously than with aio?

Yeah, this is a good consideration. No, I don't have any better suggestion for an asynchronous API. In general our VOPs tend to be synchronous. Aio does seem like the logical home for a new asynchronous API.

Best regards,
Conrad
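
To make the library-shim idea above concrete, here is a rough sketch of a fallocate-style wrapper layered on a native deallocation ioctl. Everything about it is an assumption for illustration: shim_fallocate(), native_dealloc(), and the SHIM_FALLOC_FL_* names are hypothetical, and the sketch presumes that DIOCGDELETE would be accepted on regular files through VOP_IOCTL, which is the change proposed in this message rather than existing behaviour. Only the two flag values and the rule that hole-punching must be combined with keep-size are taken from Linux's fallocate(2). The synchronous shape of the call is also why an asynchronous variant would need a separate home such as aio.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#ifdef __FreeBSD__
#include <sys/disk.h>		/* DIOCGDELETE */
#endif

/* Same numeric values as Linux's FALLOC_FL_* flags; the names are local. */
#define	SHIM_FALLOC_FL_KEEP_SIZE	0x01
#define	SHIM_FALLOC_FL_PUNCH_HOLE	0x02

/*
 * Native deallocation primitive.  ASSUMPTION: the kernel accepts
 * DIOCGDELETE on regular files via VOP_IOCTL, as proposed in the thread.
 */
static int
native_dealloc(int fd, off_t offset, off_t len)
{
#ifdef __FreeBSD__
	off_t range[2] = { offset, len };

	return (ioctl(fd, DIOCGDELETE, range));
#else
	(void)fd;
	(void)offset;
	(void)len;
	errno = ENOTSUP;	/* no native primitive on this platform */
	return (-1);
#endif
}

/*
 * Library-level stand-in for Linux fallocate(2): API compatible only,
 * with no new system call.  shim_fallocate() is a hypothetical name.
 */
static int
shim_fallocate(int fd, int mode, off_t offset, off_t len)
{
	int error;

	if (offset < 0 || len <= 0) {
		errno = EINVAL;
		return (-1);
	}
	if (mode == 0) {
		/* Plain preallocation maps onto posix_fallocate(). */
		error = posix_fallocate(fd, offset, len);
		if (error != 0) {
			errno = error;	/* it returns the error number */
			return (-1);
		}
		return (0);
	}
	/* Linux only allows PUNCH_HOLE together with KEEP_SIZE. */
	if (mode == (SHIM_FALLOC_FL_PUNCH_HOLE | SHIM_FALLOC_FL_KEEP_SIZE))
		return (native_dealloc(fd, offset, len));
	errno = EOPNOTSUPP;	/* every other mode is left out of the sketch */
	return (-1);
}
```

Keeping the shim in a library means the kernel only has to grow one primitive, and the Linux-specific mode handling stays out of the syscall table.
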