From owner-dev-commits-src-main@freebsd.org Sat Sep 25 04:53:27 2021 Return-Path: Delivered-To: dev-commits-src-main@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 6B76067B241; Sat, 25 Sep 2021 04:53:27 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4HGc3q18Y3z3JNh; Sat, 25 Sep 2021 04:53:27 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 08712242B2; Sat, 25 Sep 2021 04:53:27 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 18P4rQ0L016060; Sat, 25 Sep 2021 04:53:26 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 18P4rQ3G016059; Sat, 25 Sep 2021 04:53:26 GMT (envelope-from git) Date: Sat, 25 Sep 2021 04:53:26 GMT Message-Id: <202109250453.18P4rQ3G016059@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Marko Zec Subject: git: 43880c511cef - main - [fib_algo][dxr] Split unused range chunk list in multiple buckets MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: zec X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: 43880c511cef68098a9ca5a797f28e2c4d29cfad Auto-Submitted: auto-generated X-BeenThere: dev-commits-src-main@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Commit messages for the main branch of the src repository List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Sep 2021 04:53:27 -0000 The branch main has been updated by zec: URL: https://cgit.FreeBSD.org/src/commit/?id=43880c511cef68098a9ca5a797f28e2c4d29cfad commit 43880c511cef68098a9ca5a797f28e2c4d29cfad Author: Marko Zec AuthorDate: 2021-09-25 04:29:48 +0000 Commit: Marko Zec CommitDate: 2021-09-25 04:29:48 +0000 [fib_algo][dxr] Split unused range chunk list in multiple buckets Traversing a single list of unused range chunks in search for a block of optimal size was suboptimal. The experience with real-world BGP workloads has shown that on average unused range chunks are tiny, mostly in length from 1 to 4 or 5, when DXR is configured with K = 20 which is the current default (D16X4R). Therefore, introduce a limited amount of buckets to accomodate descriptors of empty blocks of fixed (small) size, so that those can be found in O(1) time. If no empty chunks of the requested size can be found in fixed-size buckets, the search continues in an unsorted list of empty chunks of variable lengths, which should only happen infrequently. This change should permit us to manage significantly more empty range chunks without sacrifying the speed of incremental range table updating. MFC after: 3 days --- sys/netinet/in_fib_dxr.c | 47 ++++++++++++++++++++++++++++++++--------------- 1 file changed, 32 insertions(+), 15 deletions(-) diff --git a/sys/netinet/in_fib_dxr.c b/sys/netinet/in_fib_dxr.c index c4f42abdda27..6a9f414c3ab0 100644 --- a/sys/netinet/in_fib_dxr.c +++ b/sys/netinet/in_fib_dxr.c @@ -115,6 +115,8 @@ CTASSERT(DXR_TRIE_BITS >= 16 && DXR_TRIE_BITS <= 24); #define XTBL_SIZE_INCR (DIRECT_TBL_SIZE / 16) +#define UNUSED_BUCKETS 8 + /* Lookup structure elements */ struct direct_entry { @@ -181,7 +183,7 @@ struct dxr_aux { struct trie_desc *trietbl[D_TBL_SIZE]; LIST_HEAD(, chunk_desc) chunk_hashtbl[CHUNK_HASH_SIZE]; LIST_HEAD(, chunk_desc) all_chunks; - LIST_HEAD(, chunk_desc) unused_chunks; /* abuses hash link entry */ + LIST_HEAD(, chunk_desc) unused_chunks[UNUSED_BUCKETS]; LIST_HEAD(, trie_desc) trie_hashtbl[TRIE_HASH_SIZE]; LIST_HEAD(, trie_desc) all_trie; LIST_HEAD(, trie_desc) unused_trie; /* abuses hash link entry */ @@ -387,6 +389,7 @@ chunk_ref(struct dxr_aux *da, uint32_t chunk) uint32_t base = fdesc->base; uint32_t size = chunk_size(da, fdesc); uint32_t hash = chunk_hash(da, fdesc); + int i; /* Find an existing descriptor */ LIST_FOREACH(cdp, &da->chunk_hashtbl[hash & CHUNK_HASH_MASK], @@ -401,15 +404,18 @@ chunk_ref(struct dxr_aux *da, uint32_t chunk) return (0); } - /* No matching chunks found. Recycle an empty or allocate a new one */ - cdp = NULL; - LIST_FOREACH(empty_cdp, &da->unused_chunks, cd_hash_le) - if (empty_cdp->cd_max_size >= size && (cdp == NULL || - empty_cdp->cd_max_size < cdp->cd_max_size)) { - cdp = empty_cdp; - if (empty_cdp->cd_max_size == size) - break; - } + /* No matching chunks found. Find an empty one to recycle. */ + for (cdp = NULL, i = size; cdp == NULL && i < UNUSED_BUCKETS; i++) + cdp = LIST_FIRST(&da->unused_chunks[i]); + + if (cdp == NULL) + LIST_FOREACH(empty_cdp, &da->unused_chunks[0], cd_hash_le) + if (empty_cdp->cd_max_size >= size && (cdp == NULL || + empty_cdp->cd_max_size < cdp->cd_max_size)) { + cdp = empty_cdp; + if (empty_cdp->cd_max_size == size) + break; + } if (cdp != NULL) { /* Copy from heap into the recycled chunk */ @@ -423,11 +429,17 @@ chunk_ref(struct dxr_aux *da, uint32_t chunk) empty_cdp = uma_zalloc(chunk_zone, M_NOWAIT); if (empty_cdp == NULL) return (1); + LIST_INSERT_BEFORE(cdp, empty_cdp, cd_all_le); + empty_cdp->cd_base = cdp->cd_base + size; empty_cdp->cd_cur_size = 0; empty_cdp->cd_max_size = cdp->cd_max_size - size; - empty_cdp->cd_base = cdp->cd_base + size; - LIST_INSERT_BEFORE(cdp, empty_cdp, cd_all_le); - LIST_INSERT_AFTER(cdp, empty_cdp, cd_hash_le); + + i = empty_cdp->cd_max_size; + if (i >= UNUSED_BUCKETS) + i = 0; + LIST_INSERT_HEAD(&da->unused_chunks[i], empty_cdp, + cd_hash_le); + da->all_chunks_cnt++; da->unused_chunks_cnt++; cdp->cd_max_size = size; @@ -480,6 +492,7 @@ chunk_unref(struct dxr_aux *da, uint32_t chunk) uint32_t base = fdesc->base; uint32_t size = chunk_size(da, fdesc); uint32_t hash = chunk_hash(da, fdesc); + int i; /* Find the corresponding descriptor */ LIST_FOREACH(cdp, &da->chunk_hashtbl[hash & CHUNK_HASH_MASK], @@ -538,7 +551,10 @@ chunk_unref(struct dxr_aux *da, uint32_t chunk) return; } - LIST_INSERT_HEAD(&da->unused_chunks, cdp, cd_hash_le); + i = cdp->cd_max_size; + if (i >= UNUSED_BUCKETS) + i = 0; + LIST_INSERT_HEAD(&da->unused_chunks[i], cdp, cd_hash_le); } #ifdef DXR2 @@ -891,7 +907,8 @@ dxr_build(struct dxr *dxr) LIST_REMOVE(cdp, cd_all_le); uma_zfree(chunk_zone, cdp); } - LIST_INIT(&da->unused_chunks); + for (i = 0; i < UNUSED_BUCKETS; i++) + LIST_INIT(&da->unused_chunks[i]); da->all_chunks_cnt = da->unused_chunks_cnt = 0; da->rtbl_top = 0; da->updates_low = 0;