From: Andriy Gapon
To: Bryan Drewery, Peter Jeremy, Jeff Roberson
Cc: FreeBSD current
Subject: Re: Strange ARC/Swap/CPU on yesterday's -CURRENT
Date: Tue, 27 Mar 2018 17:00:09 +0300
Message-ID: <1fd2b47b-b559-69f8-7e39-665f0f599c8f@FreeBSD.org>
In-Reply-To: <2b3db2af-03c7-65ff-25e7-425cfd8815b6@FreeBSD.org>

On 24/03/2018 01:21, Bryan Drewery wrote:
> On 3/20/2018 12:07 AM, Peter Jeremy wrote:
>>
>> On 2018-Mar-11 10:43:58 -1000, Jeff Roberson wrote:
>>> Also, if you could try going back to r328953 or r326346 and let me know if
>>> the problem exists in either.  That would be very helpful.  If anyone is
>>> willing to debug this with me contact me directly and I will send some
>>> test patches or debugging info after you have done the above steps.
>>
>> I ran into this on 11-stable and tracked it to r326619 (MFC of r325851).
>> I initially got around the problem by reverting that commit but either
>> it or something very similar is still present in 11-stable r331053.
>>
>> I've seen it in my main server (32GB RAM) but haven't managed to reproduce
>> it in smaller VBox guests - one difficulty I faced was artificially filling
>> ARC.

First, it looks like several different issues are being discussed, and
possibly conflated, in this thread.  I see reports related to ZFS, and I see
reports where ZFS is not used at all.  Some people report problems that
appeared very recently while others chime in with "yes, yes, I've always had
this problem".  This does not help to differentiate between the problems or
to analyze them.

> Looking at the ARC change you referred to from r325851
> https://reviews.freebsd.org/D12163, I am convinced that ARC backpressure
> is completely broken.

Does your being convinced come from the code review or from experiments?
If the former, could you please share your analysis?

> On my 78GB RAM system with ARC limited to 40GB and
> doing a poudriere build of all LLVM and GCC packages at once in tmpfs I
> can get swap up near 50GB and yet the ARC remains at 40GB through it
> all.  It's always been slow to give up memory for package builds but it
> really seems broken right now.

Well, there are multiple angles.  Maybe it's ARC that does not react
properly, or maybe it's not being asked properly.

It would be useful to monitor the system during its transition to the state
that you reported.  There are some interesting DTrace probes in ARC;
arc-available_memory and arc-needfree are the first that come to mind.
Their parameters and how frequently they fire are of interest.
VM parameters could be of interest as well.

A rant.

Basically, posting some numbers and jumping to conclusions does not help at
all.  Monitoring, graphing, etc. does help.
ARC is a complex dynamic system.
VM (pagedaemon, UMA caches) is a complex dynamic system.
They interact in complex, dynamic ways.
Sometimes a change in ARC is incorrect and requires a fix.
Sometimes a change in VM is incorrect and requires a fix.
Sometimes a change in VM requires a change in ARC.
These three kinds of problems can all appear as a "problem with ARC".

For instance, when vm.lowmem_period was introduced you wouldn't find any
mention of ZFS/ARC.  But it does affect the period between arc_lowmem()
calls.

Also, pinpointing a specific commit requires proper bisecting and proper
testing to correctly attribute systemic behavior changes to code changes.

-- 
Andriy Gapon