From: Scott Long
Date: Fri, 13 Nov 2020 18:23:29 -0700
To: Warner Losh
Cc: freebsd-arch@freebsd.org
Subject: Re: MAXPHYS bump for FreeBSD 13
Message-Id: <926C3A98-03BF-46FD-9B22-9EFBDC0F44A4@samsco.org>

I have mixed feelings on this. The Netflix workload isn’t typical, and this
change represents a fairly substantial increase in memory usage for bufs.
It’s also a config tunable, so it’s not like this represents a meaningful
diff reduction for Netflix.

The upside is that it will likely help benchmarks out of the box. Is that
enough of an upside for the downsides of memory pressure on small memory
and high iops systems? I’m not convinced. I really would like to see the
years of talk about fixing this correctly put into action.

Scott

> On Nov 13, 2020, at 11:33 AM, Warner Losh wrote:
>
> Greetings,
>
> We currently have a MAXPHYS of 128k. This is the maximum size of I/Os that
> we normally use (though there are exceptions).
>
> I'd like to propose that we bump MAXPHYS to 1MB, as well as bumping
> DFLTPHYS to 1MB.
>
> 128k was good back in the 90s/2000s when memory was smaller, drives did
> smaller I/Os, etc.
> Now, however, it doesn't make much sense. Modern I/O
> devices can easily do 1MB or more and there's performance benefits from
> scheduling larger I/Os.
>
> Bumping this will mean larger struct buf and struct bio. Without some
> concerted effort, it's hard to make this be a sysctl tunable. While that's
> desirable, perhaps, it shouldn't gate this bump. The increase in size for
> 1MB is modest enough.
>
> The NVMe driver currently is limited to 1MB transfers due to limitations in
> the NVMe scatter gather lists and a desire to preallocate as much as
> possible up front. Most NVMe drivers have maximum transfer sizes between
> 128k and 1MB, with larger being the trend.
>
> The mp[rs] drivers can use larger MAXPHYS, though resource limitations on
> some cards hamper bumping it beyond about 2MB.
>
> The AHCI driver is happy with 1MB and larger sizes.
>
> Netflix has run MAXPHYS of 8MB for years, though that's likely 2x too large
> even for our needs due to limiting factors in the upper layers making it
> hard to schedule I/Os larger than 3-4MB reliably.
>
> So this should be a relatively low risk, and high benefit.
>
> I don't think other kernel tunables need to change, but I always run into
> trouble with runningbufs :)
>
> Comments? Anything I forgot?
>
> Warner
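
For a rough sense of the memory trade-off debated in this thread, here is a
back-of-the-envelope sketch. It assumes 4 KiB pages, 8-byte pointers, and that
the per-buffer growth is dominated by an array holding one page pointer per
page of a MAXPHYS-sized I/O. The function names and the buffer count of 10000
are hypothetical, chosen only for illustration; they are not taken from the
kernel or from either message.

```python
# Rough estimate only. Every constant here is an assumption, not a kernel value.
PAGE_SIZE = 4096      # assumed 4 KiB pages
POINTER_SIZE = 8      # assumed 64-bit pointers


def page_pointer_bytes(maxphys, page_size=PAGE_SIZE, pointer_size=POINTER_SIZE):
    """Bytes for one page pointer per page of a maxphys-sized I/O."""
    pages = -(-maxphys // page_size)  # ceiling division
    return pages * pointer_size


def total_bytes(maxphys, nbufs):
    """Page-pointer bytes summed over nbufs buffers (nbufs is hypothetical)."""
    return page_pointer_bytes(maxphys) * nbufs


if __name__ == "__main__":
    nbufs = 10000  # hypothetical buffer count, for illustration only
    for label, maxphys in (("128k", 128 * 1024),
                           ("1MB", 1024 * 1024),
                           ("8MB", 8 * 1024 * 1024)):
        per_buf = page_pointer_bytes(maxphys)
        print(f"MAXPHYS={label}: {per_buf} bytes per buffer, "
              f"{total_bytes(maxphys, nbufs)} bytes across {nbufs} buffers")
```

Under these assumptions, going from 128k to 1MB grows the per-buffer pointer
array eightfold. Whether that counts as modest or substantial depends on how
many buffers a given system keeps, which is the disagreement in the thread.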