From: bugzilla-noreply@freebsd.org
To: net@FreeBSD.org
Subject: [Bug 254333] [tcp] sysctl net.inet.tcp.hostcache.list hangs
Date: Tue, 30 Mar 2021 14:33:01 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.4-STABLE
X-Bugzilla-Severity: Affects Many People
X-Bugzilla-Who: rscheff@freebsd.org
X-Bugzilla-Status: In Progress
X-Bugzilla-Assigned-To: rscheff@freebsd.org
X-Bugzilla-Flags: mfc-stable13? mfc-stable12? mfc-stable11?
List-Id: Networking and TCP/IP with FreeBSD

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254333

--- Comment #13 from Richard Scheffenegger ---
Thanks.

I cannot comment on why hostcache.count counts up and beyond the limit.

But the other data confirms that what you are seeing is the
(unsuccessful) attempt by the sbuf_new function to allocate a huge
amount of kernel memory.
When trying to dump all the entries of the hostcache, hostcache.list
tries to grab hostcache.cachelimit * 128 bytes, or
1966080 * 128 ~= 250 MB of contiguous kernel memory - twice (!).

(Also, if the count really is > cachelimit, hostcache.list may
eventually simply fail due to insufficient memory...)

The following diffs are under review and should address these
particular issues:

o) immediately respond with the required buffer size, without actually
   allocating/preparing that buffer
o) allocate only one bucket's worth of output and move the output
   bucket-by-bucket to userspace (reducing the memory footprint
   temporarily required from 2x 250 MB down to 1x 4kB).

See
https://reviews.freebsd.org/D29471
https://reviews.freebsd.org/D29481
https://reviews.freebsd.org/D29483

Patching with only D29471 should mostly address the issue, although
there remain issues around holding a lock for an extended period of
time while repeatedly moving the output to userspace. This may have
undesired side effects, so you probably want to go with all three.

To "unstick" the system in this state, refrain from issuing
hostcache.list multiple times, and try to free up the above-mentioned
chunk of in-kernel memory (at least temporarily). That should allow all
the processes stalled in malloc to succeed eventually and return
properly.

But on systems with a high uptime or kernel memory churn, and a very
large hostcache.cachelimit, it really is only a question of time until
such a huge malloc blocks "indefinitely" - thus the above patches try to
be much smarter about what kernel memory is really needed.

The downside is that there is a higher chance that hash buckets change
more than they would with that huge memory allocation (if successful).
But hostcache.list does not provide a "complete" snapshot of all the
entries in the hostcache at a very specific moment in time anyway - the
entries may already change between the evaluation of different buckets
(but the longer time for moving data to userspace may allow more changes
to happen).
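
For illustration only, here is a minimal Python sketch of the sizes
discussed above and of the bucket-by-bucket idea. It is a userspace
model, not the kernel code in the reviews: the function names, the
fixed-width 128-byte line format and the bucket limit of 30 entries are
assumptions made for this example.

```python
# Userspace model for illustration; this is not the FreeBSD kernel code.
LINE_BYTES = 128         # per-entry output size quoted above
CACHE_LIMIT = 1_966_080  # hostcache.cachelimit value quoted above
BUCKET_LIMIT = 30        # assumed entries per bucket (hypothetical here)


def full_dump_bytes(cache_limit, line_bytes=LINE_BYTES):
    """Size of one buffer big enough for every possible entry."""
    return cache_limit * line_bytes


def dump_per_bucket(buckets, emit, line_bytes=LINE_BYTES):
    """Format one bucket at a time, pass it to emit(), and return the
    largest temporary buffer that was needed along the way."""
    peak = 0
    for bucket in buckets:
        # Pad or truncate every entry to a fixed-width line of line_bytes.
        lines = [e[:line_bytes - 1].ljust(line_bytes - 1) + "\n" for e in bucket]
        chunk = "".join(lines).encode("ascii")
        peak = max(peak, len(chunk))
        emit(chunk)  # stands in for copying this chunk out to userspace
    return peak


if __name__ == "__main__":
    print(full_dump_bytes(CACHE_LIMIT))  # 251658240 bytes, roughly 250 MB
    print(BUCKET_LIMIT * LINE_BYTES)     # 3840 bytes, roughly 4 kB
```

In this model the temporary buffer never grows beyond one bucket's
worth of lines, however many buckets there are. The price is the one
noted above: buckets that have already been emitted can change before
the later ones are read.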