From: Olivier Cochard-Labbé
Date: Fri, 14 Feb 2020 19:00:11 +0100
Subject: Re: Issue with BGP router / high interrupt / Chelsio / FreeBSD 12.1
To: Rudy
Cc: freebsd-net@freebsd.org

On Fri, Feb 14, 2020 at 6:25 PM Rudy wrote:

> On 2/12/20 7:21 PM, Rudy wrote:
>
> I'm having issues with a box that is acting as a BGP router for my
> network. 3 Chelsio cards, two T5 and one T6. It was working great
> until I turned up our first port on the T6. It seems like traffic
> passing in from a T5 card and out the T6 causes a really high load (and
> high interrupts).
>
>
> Looking better! I made some changes based on BSDRP which I hadn't known
> about -- I think ifqmaxlen was the tunable I overlooked.
>
> # https://github.com/ocochard/BSDRP/blob/master/BSDRP/Files/boot/loader.conf.local
> net.link.ifqmaxlen="16384"
>

This net.link.ifqmaxlen was set to help in case of lagg usage: I was not
aware it could improve your use case.

From your first post, it looks like your setup has 2 packages, 10 cores,
20 threads (hyper-threading disabled), and you have configured your Chelsio
to use 16 queues (hw.cxgbe.Xrx=16). It's a good thing to have a power-of-2
number of queues with Chelsio, but I'm not sure it's a good idea to spread
those queues across the 2 packages.

So perhaps you should try:
1. Reducing to 8 queues and binding them to the local domain
2. Or keeping 16 queues, but re-enabling HyperThreading and binding them to
   the local domain too (on -head with a recent CPU and
   machdep.hyperthreading_intr_allowed, using hyper-threading improves
   forwarding performance).

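The "bind them to the local domain" step in both options could be sketched
roughly as below. This is only a dry-run illustration, not a tested recipe:
the function name bind_plan, the IRQ numbers and the CPU list are all
hypothetical and have to be replaced with what the box actually reports.
The only real command it relies on is cpuset(1) with -l (CPU list) and
-x (IRQ).

```shell
#!/bin/sh
# Dry-run sketch: print cpuset(1) commands that pin each queue IRQ to one
# CPU of the NIC's local NUMA domain, round-robin. Nothing is executed.
#
# Assumptions (not from the thread): the IRQ numbers and the list of CPUs
# in the local domain are supplied by hand; both differ per machine.

# usage: bind_plan "<space-separated cpu list>" irq [irq ...]
bind_plan() {
    cpus=$1
    shift

    # Count the CPUs in the list.
    ncpu=0
    for c in $cpus; do
        ncpu=$(( ncpu + 1 ))
    done

    i=0
    for irq in "$@"; do
        # Select the (i mod ncpu)-th CPU of the list for this IRQ.
        want=$(( i % ncpu ))
        j=0
        for c in $cpus; do
            if [ "$j" -eq "$want" ]; then
                cpu=$c
            fi
            j=$(( j + 1 ))
        done
        echo "cpuset -l $cpu -x $irq"
        i=$(( i + 1 ))
    done
}

# Example with hypothetical values: 8 queue IRQs spread over CPUs 0-7.
bind_plan "0 1 2 3 4 5 6 7" 264 265 266 267 268 269 270 271
```

The printed commands can be reviewed first and then run as root. For
option 1 the queue count itself would be set at boot in /boot/loader.conf,
for example hw.cxgbe.nrxq="8" and hw.cxgbe.ntxq="8", assuming the
"hw.cxgbe.Xrx" shorthand above refers to those cxgbe(4) tunables.
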
But anyway, even with 16 queues spread over 2 domains, you should be getting
better performance than that:
https://github.com/ocochard/netbenches/blob/master/Xeon_E5-2650v4_2x12Cores-Chelsio_T520-CR/hw.cxgbe.nXxq/results/fbsd12-stable.r354440.BSDRP.1.96/README.md

Notice that I never monitored the CPU load during my benches.

Increasing the hw.cxgbe.holdoff_timer_idx was a good idea: I would expect
lower interrupt usage too.

Did you monitor the QPI link usage? (kldload cpuctl && pcm-numa.x)