From: Mateusz Guzik
Date: Mon, 29 Mar 2021 03:11:05 +0200
Subject: Re: Strange behavior after running under high load
To: Stefan Esser
Cc: Andriy Gapon, FreeBSD CURRENT

This may be the problem fixed in
e9272225e6bed840b00eef1c817b188c172338ee ("vfs: fix vnlru marker
handling for filtered/unfiltered cases").

However, there is a long-standing performance bug where, if the vnode
limit is hit and there is nothing to reclaim, the code just goes to
sleep for one second.

On 3/28/21, Stefan Esser wrote:
> On 28.03.21 at 17:44, Andriy Gapon wrote:
>> On 28/03/2021 17:39, Stefan Esser wrote:
>>> After a period of high load, my now idle system needs 4 to 10 seconds to
>>> run any trivial command - even after 20 minutes of no load ...
>>>
>>> I have run some Monte-Carlo simulations for a few hours, with initially
>>> 35 processes running in parallel for some 10 seconds each.
>>
>> I saw somewhat similar symptoms with 13-CURRENT some time ago.
>> To me it looked like even small kernel memory allocations took a very long
>> time.
>> But it was hard to properly diagnose that as my favorite tool, dtrace, was
>> also affected by the same problem.
>
> That could have been the case - but I had to reboot to recover the system.
>
> I had let it sit idle for a few hours and the last "time uptime" before
> the reboot took 15 seconds of real time to complete.
>
> Response from within the shell (e.g. "echo *") was instantaneous, though.
>
> I tried to trace the program execution of "uptime" with truss and found
> that the loading of shared libraries proceeded at about one or two per
> second until all were attached and then the program quickly printed the
> expected results.
>
> I could probably recreate the issue by running the same set of programs
> that triggered it a few hours ago, but this is a production system and
> I need it to be operational through the week ...
>
> Regards, STefan
>

-- 
Mateusz Guzik