From: Adrian Chadd
To: Davide Italiano
Cc: "freebsd-arch@freebsd.org"
Date: Sat, 25 Apr 2015 11:45:10 -0700
Subject: Re: RFC: setting performance_cx_lowest=C2 in -HEAD to avoid lock contention on many-CPU boxes

On 25 April 2015 at 11:18, Davide Italiano wrote:
> On Sat, Apr 25, 2015 at 9:31 AM, Adrian Chadd wrote:
>> Hi!
>>
>> I've been doing some NUMA testing on large boxes and I've found that
>> there's lock contention in the ACPI path. It's due to my change a
>> while ago to start using sleep states above ACPI C1 by default. The
>> ACPI C3 state involves a bunch of register fiddling in the ACPI sleep
>> path that grabs a serialiser lock, and on an 80 thread box this is
>> costly.
>>
>> I'd like to drop performance_cx_lowest to C2 in -HEAD. ACPI C2 state
>> doesn't require the same register fiddling (to disable bus mastering,
>> if I'm reading it right) and so it doesn't enter that particular
>> serialised path. I've verified on Westmere-EX, Sandybridge, Ivybridge
>> and Haswell boxes that ACPI C2 does let one drop down into a deeper
>> CPU sleep state (C6 on each of these). I think it is still a good
>> default for both servers and desktops.
>>
>> If no-one has a problem with this then I'll do it after the weekend.
>>
>
> This sounds to me just a way to hide a problem.
> Very few people nowadays run on NUMA and they can tune the machine as
> they like when they do testing.
> If there's a lock contention problem, it needs to be fixed and not
> hidden under another default.

The lock contention problem is inside ACPI and how it's
designed/implemented. We're not going to easily be able to make ACPI
locking "better" as we're constrained by how ACPI implements things in
the shared ACPICA code.

> Also, as already noted this is a problem on 80-core machines but
> probably not on a 2-core Atom. I think you need to understand factors
> better and come up with a more sensible relation. In other words, your
> bet needs to be proven before changing a default useful for few that
> can impact many.

I've just described the differences in behaviour. I've checked the C
states on all the Intel servers too - with power plugged in, ACPI C2
and ACPI C3 still result in entering CPU C6 state, not CPU C7 state -
so it's not going to result in worse behaviour.

For reference, "all" being the following list:

* westmere-EX
* nehalem
* sandybridge
* sandybridge mobile
* sandybridge xeon
* ivybridge mobile
* ivybridge xeon
* haswell mobile
* haswell
* haswell xeon
* haswell xeon v3

-adrian