From: Andriy Gapon <avg@FreeBSD.org>
To: FreeBSD Current <current@FreeBSD.org>
Date: Thu, 24 Oct 2013 13:47:31 +0300
Message-ID: <5268FAC3.5070803@FreeBSD.org>
Subject: some experience with a many core machine: event timer, hwpmc

I don't think I have seen observations like the following posted before.
I had some brief contact with a 48-core Opteron system (4 packages).

Observation #1.
The event timer subsystem picked an HPET timer as its source. This
resulted in a lot of inter-core / inter-package traffic to redistribute
timer interrupts. It also caused contention on a lock used internally by
the kern_et code for the single-global-timer case, because many CPUs
tried to grab it concurrently. Additionally, I saw some statistics
artifacts; for example, top(1) reported weird and unstable results.
I believe there should be some logic to prefer per-CPU timers over
global timers as the number of CPUs increases.

Observation #2.
hwpmc was quite unusable on that system. Attempts to use it resulted in
lockups or panics such as "waiting too long on spinlock". It appears
that hwpmc performs some actions on each CPU, driven by timer
interrupts, and that those actions use a single global lock for
arbitration. Contention on that lock seems to make hwpmc unusable.
For what it's worth, this was the case even after I switched the timer
to per-CPU LAPIC timers. HZ was the default 1000, so perhaps
1 ms / 48 (~21 us) was not enough for hwpmc to complete its per-tick,
per-CPU actions before the next tick. The contention appeared to be in
pmclog_reserve() (called from pmclog_process_callchain()).

Some details about the hardware, just in case:

CPU: AMD Opteron(tm) Processor 6172 (2100.07-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x100f91  Family = 0x10  Model = 0x9  Stepping = 1
  Features=0x178bfbff
  Features2=0x802009
  AMD Features=0xee500800
  AMD Features2=0x837ff
TSC: P-state invariant
FreeBSD/SMP: Multiprocessor System Detected: 48 CPUs
FreeBSD/SMP: 4 package(s) x 12 core(s)

--
Andriy Gapon
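
A note on Observation #1: the event timer in use can be inspected and
overridden through sysctl, so a per-CPU LAPIC timer can be forced by
hand even while the default selection heuristic prefers the HPET. The
commands below are only a sketch; the available timer names and their
priorities depend on the hardware (kern.eventtimer.choice lists them).

  # list the available event timers and their priorities
  sysctl kern.eventtimer.choice
  # show which timer currently drives event timer interrupts
  sysctl kern.eventtimer.timer
  # switch to the per-CPU LAPIC timer at run time
  sysctl kern.eventtimer.timer=LAPIC

The same choice can be made persistent from /boot/loader.conf, assuming
the tunable is honored at boot on the version in question:

  kern.eventtimer.timer="LAPIC"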
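
A note on Observation #2: hwpmc(4) documents a few loader tunables that
size its sampling and logging buffers and limit callchain depth;
together with a lower sampling rate they might reduce the pressure on
pmclog_reserve(). This is only a sketch of knobs that could be worth
experimenting with, not something verified on that machine; the tunable
names come from hwpmc(4) and the values are arbitrary examples.

  # /boot/loader.conf -- illustrative values only
  kern.hwpmc.nbuffers="256"        # number of log buffers
  kern.hwpmc.logbuffersize="16"    # size of each log buffer, in KB
  kern.hwpmc.nsamples="1024"       # PC samples kept per CPU
  kern.hwpmc.callchaindepth="8"    # capture shallower callchains

When sampling, a larger count between samples also means fewer
interrupts per CPU, e.g. (the event alias "instructions" is just an
example):

  pmcstat -n 65536 -S instructions -O /tmp/samples.out sleep 10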