From nobody Wed Jan 12 17:16:19 2022 X-Original-To: freebsd-net@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 8291417F10CF for ; Wed, 12 Jan 2022 17:16:28 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-qv1-xf2d.google.com (mail-qv1-xf2d.google.com [IPv6:2607:f8b0:4864:20::f2d]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4JYvNp3g4Qz4sL4; Wed, 12 Jan 2022 17:16:26 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: by mail-qv1-xf2d.google.com with SMTP id q3so3642366qvc.7; Wed, 12 Jan 2022 09:16:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=ZkBRSCsA+wWMj82LHvt06KLqaABUbPDn9C5ba9RbMSE=; b=jhh/slAGL2/BMdqxjZTUMnMienp6ORkpNOB2vkj5WPexHulmm4R1zA0/W/x2fCWLph HZFkRWGXdHLmmAEt2uKdeA+9ynqKgIoQK0/6Tpakh/c61qPpAryYfufK99H2AnqBCjSt 5fi/SFQLIeygfWlOiJ1u0jn+HBj4VfWeIvShV4ZfK4jR8jeck/c/kxrif3pgTyYfJmZD NCdMsKDgb1x27XdcjrBN+7sKEw2h8TIyakQTRbyp+E6g/7WMDuWoNGs9PrHGpWkkJqil auutlLss+5drolKwCcxd4ZHSqTAejsYLipueCsYs9hk9C8wqPFVcclc7mXV8ugaK59Py xcsQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:message-id:date:mime-version:user-agent :subject:content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=ZkBRSCsA+wWMj82LHvt06KLqaABUbPDn9C5ba9RbMSE=; b=VYAbsnIr4QWK4BsodymY7Xleml6UD4LRWaUmRcN/M5dVv/6Ml98MD2iXd0xkrjHjfd KfywrM4ryyKg09Nwz522YPiCi8O5OTdrhlFXDOZdSj8q0uQjAxZDf5Dr4pCC6MrK+sRT LLtteOqsSO/csVMQRhhCCfz2QSHuHohV238HQ9nzYyDRi0091ASQ4Gw7RGsBmlIkirI8 OETruIElRsaelE57by6NFQJklWKxsHf0cw0+NJu69EI/d+nKO/VqewWLEHld3G5/p3As ik+sNJPM28xrAH9AB7vArvXcXszIRG90TYXyy3HxQW+A2/g2RoaJ2b9QPqwy1ItdLVz5 vAXw== X-Gm-Message-State: AOAM533NpfqA3uYKfjSJknswT0/8JJ1SbNx4JbBJxkwUuzNb3K7bTenv pphNYst3CE3l3vAqlk0GvTs= X-Google-Smtp-Source: ABdhPJz6K4UxJy1qTyatfoyWccjw/CfF0GkBrTl0SRc1k+EWOLS4K3KyIdxrUDnpeq6pH/eNazmeyg== X-Received: by 2002:a05:6214:d43:: with SMTP id 3mr565258qvr.91.1642007780308; Wed, 12 Jan 2022 09:16:20 -0800 (PST) Received: from [10.231.1.66] ([38.32.73.2]) by smtp.gmail.com with ESMTPSA id u17sm211123qki.2.2022.01.12.09.16.19 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 12 Jan 2022 09:16:19 -0800 (PST) Message-ID: Date: Wed, 12 Jan 2022 12:16:19 -0500 List-Id: Networking and TCP/IP with FreeBSD List-Archive: https://lists.freebsd.org/archives/freebsd-net List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-net@freebsd.org MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:91.0) Gecko/20100101 Thunderbird/91.4.1 Subject: Re: iflib_timer() vs ixl_admin_timer() race Content-Language: en-US To: "Galazka, Krzysztof" , Eric Joyner Cc: "Joyner, Eric" , freebsd-net@freebsd.org References: <27d6f051-3108-a08a-6f1d-670dd8fdb9bb@FreeBSD.org> From: Alexander Motin In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 4JYvNp3g4Qz4sL4 X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20210112 header.b="jhh/slAG"; dmarc=none; spf=pass (mx1.freebsd.org: domain of mavbsd@gmail.com designates 2607:f8b0:4864:20::f2d as permitted sender) smtp.mailfrom=mavbsd@gmail.com X-Spamd-Result: default: False [0.51 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20210112]; MID_RHS_MATCH_FROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; R_SPF_ALLOW(-0.20)[+ip6:2607:f8b0:4000::/36:c]; NEURAL_SPAM_SHORT(1.00)[1.000]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[FreeBSD.org]; NEURAL_HAM_MEDIUM(-0.68)[-0.675]; TO_DN_SOME(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; NEURAL_SPAM_LONG(0.39)[0.390]; RCVD_IN_DNSWL_NONE(0.00)[2607:f8b0:4864:20::f2d:from]; FORGED_SENDER(0.30)[mav@FreeBSD.org,mavbsd@gmail.com]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; FROM_NEQ_ENVFROM(0.00)[mav@FreeBSD.org,mavbsd@gmail.com]; RCVD_TLS_ALL(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com:dkim] X-ThisMailContainsUnwantedMimeParts: N Thanks, Krzystof, Grepping now for iflib_admin_intr_deferred() through the sources I see the same issue in other Intel NIC drivers, plus bnxt(4). So the main controversy I see is this: either admin intr should not stop and restart the callouts (and then question is why it does that now), or if it should be so heavy-weight, it should not be called so regularly (and then question why so many drivers do it). In few drivers I've found it even more interesting: iflib_timer() calls IFDI_TIMER(), which calls ixgbe_if_timer(), which calls iflib_admin_intr_deferred(), which in its turn restart the iflib_timer(). Ouroboros. ;) On 12.01.2022 11:46, Galazka, Krzysztof wrote: > Hi Alexander, > > Thank you for pointing this out. ixl_admin_timer is used by a callout so > I don’t think we can acquire any locks there. IIRC it was added to let > tools for NVM update interact with FW while interface is down, so maybe > stopping it when interface goes UP would be an option. Let me think this > through. > > Thanks, > > Krzysiek > > *From:* Eric Joyner > *Sent:* Wednesday, January 12, 2022 8:22 AM > *To:* Alexander Motin > *Cc:* Joyner, Eric ; Galazka, Krzysztof > > *Subject:* Re: iflib_timer() vs ixl_admin_timer() race > > Well, I think the situation is more-or-less as you say -- it's just that > the intent of ixl_admin_timer() is supposed to have the adapter's admin > queue processed constantly, regardless of interrupt status or down/up > status. I think as a shorthand we just called _task_fn_admin because it > handles a bunch of other things as well as getting down to calling > ixl_if_update_admin_status() which does the admin queue processing. But > as you found, I don't think it was originally written to be called > periodically on a regular basis like iflib_timer(), so the callouts are > interacting badly. > > My first thought would be to replace the call to > iflib_admin_intr_deferred() in ixl_admin_timer() with > ixl_if_update_admin_status() while taking the CTX_LOCK(). I'm not sure > everything else in _task_fn_admin() needs to run periodically like that > in the driver. That would avoid the callouts getting stopped every 500 > ticks. > > I CC'd my coworker Krzysztof who currently owns the driver; he might > have better thoughts on this. > > - Eric > > On Tue, Jan 11, 2022 at 10:46 AM Alexander Motin > wrote: > > Yes.  I've reverted my patch for now to not break anything, but all > this > situation does not look right for me too, so I'd appreciate your look. > > On 11.01.2022 12:21, Eric Joyner wrote: > > Hi, > > > > I came back from vacation yesterday, but I'll look into this for you > > today. It's not a good situation when changing the period of the > iflib > > timer breaks something in the driver... > > > > - Eric > > > > On Sun, Jan 9, 2022 at 8:19 PM Alexander Motin > > >> wrote: > > > >     Hi Eric, > > > >     In 90bc1cf65778 I've done what looked like a trivial > optimization.  But > >     just after committing it I've noticed weird race it created > between > >     iflib_timer() and ixl_admin_timer() timers.  I see ixl(4) calls > >     iflib_admin_intr_deferred() every 0.5s, which calls > _task_fn_admin(), > >     which every time stops and restart txq->ift_timer callout (AKA > >     iflib_timer()), which actually has exactly the same period of > 0.5s.  It > >     seems before my change iflib_timer() managed to run once for > all TX > >     queues before being stopped, but after the change due to > additional > >     jitter many of callouts are getting stopped before firing. > > > >     Could you please sched some light for me on what is going on > there? > >     That race between two callouts looks like potential bug to > me, working > >     by some coinncedence, especially considering iflib_timer() > period is > >     tunable. > > > >     Thanks in advance. > > > >     -- > >     Alexander Motin > > > > -- > Alexander Motin > > ------------------------------------------------------------------------ > *Intel Technology Poland sp. z o.o. > * ul. Słowackiego 173 | 80-298 Gdańsk | Sąd Rejonowy Gdańsk Północ | VII > Wydział Gospodarczy Krajowego Rejestru Sądowego - KRS 101882 | NIP > 957-07-52-316 | Kapitał zakładowy 200.000 PLN. > > Ta wiadomość wraz z załącznikami jest przeznaczona dla określonego > adresata i może zawierać informacje poufne. W razie przypadkowego > otrzymania tej wiadomości, prosimy o powiadomienie nadawcy oraz trwałe > jej usunięcie; jakiekolwiek przeglądanie lub rozpowszechnianie jest > zabronione. > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). If you are not the intended > recipient, please contact the sender and delete all copies; any review > or distribution by others is strictly prohibited. > -- Alexander Motin