From owner-freebsd-amd64@FreeBSD.ORG Wed Feb 12 21:04:57 2014
Delivered-To: freebsd-amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id E3E49203;
 Wed, 12 Feb 2014 21:04:57 +0000 (UTC)
Received: from ns1.invoca.ch (mx1.invoca.ch [157.161.91.34])
 by mx1.freebsd.org (Postfix) with ESMTP id 743CE1CF3;
 Wed, 12 Feb 2014 21:04:56 +0000 (UTC)
Received: from xxl.bi.corp.invoca.ch (cust.static.46-14-177-70.swisscomdata.ch [46.14.177.70])
 by ns1.invoca.ch (Postfix) with ESMTP id B4A88240B1;
 Wed, 12 Feb 2014 22:04:53 +0100 (CET)
Received: from webmail.bi.corp.invoca.ch (localhost [127.0.0.1])
 by xxl.bi.corp.invoca.ch (Postfix) with ESMTP id 51AC242E7F;
 Wed, 12 Feb 2014 22:04:53 +0100 (CET)
Received: from 192.168.10.25 (SquirrelMail authenticated user simix)
 by webmail.bi.corp.invoca.ch with HTTP;
 Wed, 12 Feb 2014 22:04:53 +0100
Message-ID: <18f55baf8a13457d3d6a89fc4f4ffc61.squirrel@webmail.bi.corp.invoca.ch>
In-Reply-To: <201402121423.48285.jhb@freebsd.org>
References: <201402120740.s1C7e1Mn005809@freefall.freebsd.org>
 <201402121423.48285.jhb@freebsd.org>
Date: Wed, 12 Feb 2014 22:04:53 +0100
Subject: Re: amd64/186061: FreeBSD 10 crashes as KVM guest on GNU/Linux on
 AMD family 10h CPUs
From: "Simon Matter"
To: "John Baldwin"
User-Agent: SquirrelMail/1.4.22
MIME-Version: 1.0
Content-Type: text/plain;charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
Importance: Normal
X-Mailman-Approved-At: Wed, 12 Feb 2014 22:06:05 +0000
Cc: freebsd-amd64@freebsd.org
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform
X-List-Received-Date: Wed, 12 Feb 2014 21:04:58 -0000

> On Wednesday, February 12, 2014 2:40:01 am Simon Matter wrote:
>> The following reply was made to PR amd64/186061; it has been noted by
>> GNATS.
>>
>> From: "Simon Matter"
>> To: bug-followup@FreeBSD.org
>> Cc: simon.matter@invoca.ch
>> Subject: Re: amd64/186061: FreeBSD 10 crashes as KVM guest on GNU/Linux
>> on AMD family 10h CPUs
>> Date: Wed, 12 Feb 2014 08:30:51 +0100
>>
>> ------=_20140212083051_97180
>> Content-Type: text/plain; charset="iso-8859-1"
>> Content-Transfer-Encoding: 8bit
>>
>> As noted by John Baldwin the change to mca.c is not needed. Attached
>> patch is what I'm using now with success.
>>
>> BTW: setting vm.pmap.pg_ps_enabled="0" in loader.conf does also
>> mitigate the issue but I guess it's not the optimal solution.
>
> Talking with Alan Cox, we do think the right fix is to change the test to
> enable the workaround. However, we'd rather not penalize VM's on other

I'm afraid that will not work in all situations, no matter how good the
tests are (see below for why I think so). So, as a last resort, I suggest
that it should be possible to enable the "AMD Erratum 383" workaround via
loader.conf.

> CPUs. What we would like to do instead is figure out a set of feature
> flags we can test that will assure us we are not on an AMD 10h CPU (e.g.
> features that are known to be Intel only, or are known to be on newer
> AMD CPUs).

Here is the problem. At least on KVM, the common configuration is such
that the guest doesn't see the real CPU flags but a common subset of the
flags available on all servers the VM could be migrated to. That makes it
possible to live-migrate VMs between different servers with different
CPUs. So detecting the real CPU hardware always goes wrong in such
setups, and only a sysctl switch can work reliably there.

The only KVM/QEMU CPU configuration that exposes the real CPU flags to
the guest is -cpu set to "host". That's usually only done if one wants
full speed out of the VM and won't move the VM to any other host at
runtime.

I hope I was able to explain what I mean.

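The argument above, that auto-detection cannot cover a guest which is shown a generic CPU model and that an administrator override is therefore needed, can be illustrated with a minimal sketch. This is not the FreeBSD kernel code or the attached patch; the function name, the `override` parameter, and the example family value 15 are all hypothetical and only model the decision being discussed.

```python
from typing import Optional

AMD_FAMILY_10H = 0x10  # shows up as "cpu family : 16" in /proc/cpuinfo


def want_erratum383_workaround(vendor: str, family: int,
                               override: Optional[bool] = None) -> bool:
    """Decide whether the Erratum 383 workaround should be enabled.

    `override` models a hypothetical administrator setting (for example a
    loader.conf tunable): True or False forces the result, None means
    "auto-detect from what the CPU reports".
    """
    if override is not None:
        # An explicit setting always wins over detection.
        return override
    # Auto-detection can only use what the (possibly virtual) CPU reports.
    return vendor == "AuthenticAMD" and family == AMD_FAMILY_10H


# Bare metal, or a guest started with -cpu host: detection works.
print(want_erratum383_workaround("AuthenticAMD", 16))

# A guest that is shown some other CPU model (family 15 is an arbitrary
# example value): detection misses it even if the host is a 10h CPU.
print(want_erratum383_workaround("AuthenticAMD", 15))

# Only the manual override catches that case.
print(want_erratum383_workaround("AuthenticAMD", 15, override=True))
```

The three-state override (forced on, forced off, auto) keeps the default behaviour unchanged for everyone who does not set it, which matches the wish not to penalize VMs on other CPUs.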
> However, we haven't come up with that list yet. Assuming you do have 10h
> CPUs, it would be nice to see what cpuid flags are set for your
> processors.

I have at least two servers with affected CPUs. One is a small HP ProLiant
Microserver N36L, the other one a 48 core HP ProLiant DL585 G7. I'll paste
the cpu info below. Hope that's what you were asking.

Regards,
Simon

N36L:

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 6
model name      : AMD Athlon(tm) II Neo N36L Dual-Core Processor
stepping        : 3
cpu MHz         : 800.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a 3dnowprefetch osvw ibs skinit wdt nodeid_msr npt lbrv svm_lock nrip_save
bogomips        : 2595.78
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

DL585 G7:

processor       : 47
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 9
model name      : AMD Opteron(tm) Processor 6174
stepping        : 1
cpu MHz         : 2200.000
cache size      : 512 KB
physical id     : 3
siblings        : 12
core id         : 5
cpu cores       : 12
apicid          : 75
initial apicid  : 59
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid amd_dcm pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr npt lbrv svm_lock nrip_save pausefilter
bogomips        : 4389.15
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
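
For anyone who wants to compare the two flag lists above mechanically, here is a small sketch using plain set operations. The flag strings are copied from the two cpuinfo dumps in this mail; the variable names are only illustrative, and the comparison says nothing about which flags a KVM guest would actually be shown.

```python
# Flags of the N36L (family 16, model 6), copied from the dump above.
n36l_flags = set("""
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm
3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor
cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a
3dnowprefetch osvw ibs skinit wdt nodeid_msr npt lbrv svm_lock nrip_save
""".split())

# Flags of the Opteron 6174 in the DL585 G7 (family 16, model 9).
dl585_flags = set("""
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm
3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid amd_dcm pni
monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a
misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr npt lbrv svm_lock
nrip_save pausefilter
""".split())

common = n36l_flags & dl585_flags                # present on both machines
only_n36l = sorted(n36l_flags - dl585_flags)     # present on the N36L only
only_dl585 = sorted(dl585_flags - n36l_flags)    # present on the DL585 G7 only

print("common flags:", len(common))
print("only on N36L:", only_n36l)
print("only on DL585 G7:", only_dl585)
```

On these two dumps every N36L flag also appears on the Opteron 6174, and the Opteron additionally reports amd_dcm, misalignsse and pausefilter.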