From: Warner Losh
Date: Thu, 2 Feb 2023 09:52:34 -0700
Subject: Re: Swap, ZFS & ARC
To: Bakul Shah
Cc: jbo@insane.engineer, "hackers@freebsd.org"
In-Reply-To: <3DEFA166-79E8-49AD-B0DD-7E524C8EAAE8@iitbombay.org>
References: <3DEFA166-79E8-49AD-B0DD-7E524C8EAAE8@iitbombay.org>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

On Thu, Feb 2, 2023 at 9:17 AM Bakul Shah wrote:

> On Feb 2, 2023, at 6:28 AM, jbo@insane.engineer wrote:
>
> > Hello folks,
> >
> > Based on a discussion on the forums not so long ago I tried to
> > figure out how swap usage on a ZFS system plays together with ARC.
> > However, I could find very little to no information on this which
> > leads me to believe that there is some "core concept" I might be
> > oblivious to.
> >
> > The main question is basically this: Your system starts to swap out
> > data from RAM to your swap partition. This swap data on disk
> > ultimately resides somewhere in a ZFS pool. If this data then gets
> > accessed, it might be cached by ARC, essentially eating up memory
> > again, which seems counterproductive.
> > Is there any magic which prevents swap partitions from being loaded
> > into ARC? Or is this a non-issue for some other reason?
>
> I suspect this bug affects FreeBSD as well:
>
> https://github.com/openzfs/zfs/issues/7734
>
> From https://github.com/openzfs/zfs/issues/7734#issuecomment-422082279
>
> I'm not an expert in this area of the code, but I think that swap on
> ZVOL is inherently unreliable due to writes to the swap ZVOL having to
> go through the normal TXG sync and ZIO write paths, which can require
> lots of memory allocations by design (and these memory allocations can
> stall due to a low memory situation). I believe this to be true for
> swap on ZVOL for illumos, as well as Linux, and presumably FreeBSD too
> (although I have no experience using it on FreeBSD, so I could be
> wrong).
>
> FYI. I don't know enough about zfs internals so can't say if this
> poster is right or not but I too have just used a disk partition as
> opposed to a zvolume for swap.

They have it right. FreeBSD tries hard (though isn't perfect in this
regard) to not allocate memory in the I/O path, especially if it knows
the request is for swap. It's possible to swap to zvols, and it actually
works when memory pressure is light (since the stalls usually aren't
too long and usually complete). When memory pressure is high, though,
these stalls can turn into deadlocks.

Warner
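
For anyone who wants to check whether a box is swapping to a zvol, here
is a rough sketch. It assumes swapinfo(8)-style output (a "Device"
header row, one row per swap device, an optional "Total" row) and that
zvols show up under /dev/zvol/, as they do on FreeBSD. The function name
and the sample device names (ada0p3, zroot/swap) are made up for
illustration.

```python
"""Flag swap devices that appear to be backed by a ZFS zvol."""

ZVOL_PREFIX = "/dev/zvol/"  # FreeBSD exposes zvols under this path

# Hypothetical sample in the layout swapinfo(8) prints; not real data.
SAMPLE = """\
Device          1K-blocks     Used    Avail Capacity
/dev/ada0p3       4194304        0  4194304     0%
/dev/zvol/zroot/swap 2097152     0  2097152     0%
Total             6291456        0  6291456     0%
"""


def zvol_swap_devices(swapinfo_text):
    """Return the device paths in swapinfo-style text that are zvols."""
    devices = []
    for line in swapinfo_text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Device" header and the "Total" row.
        if not fields or not fields[0].startswith("/dev/"):
            continue
        if fields[0].startswith(ZVOL_PREFIX):
            devices.append(fields[0])
    return devices


for dev in zvol_swap_devices(SAMPLE):
    print(f"swap on zvol: {dev} (can stall or deadlock under memory pressure)")
```

On a real system you would feed the function the text printed by
swapinfo instead of SAMPLE. An empty result only means no zvol-backed
swap device is currently active.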