From owner-freebsd-virtualization@FreeBSD.ORG Wed Jan 15 16:04:40 2014 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 40D9EDAA for ; Wed, 15 Jan 2014 16:04:40 +0000 (UTC) Received: from SMTP02.CITRIX.COM (smtp02.citrix.com [66.165.176.63]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 8C2C61311 for ; Wed, 15 Jan 2014 16:04:38 +0000 (UTC) X-IronPort-AV: E=Sophos;i="4.95,663,1384300800"; d="scan'208";a="91018095" Received: from accessns.citrite.net (HELO FTLPEX01CL02.citrite.net) ([10.9.154.239]) by FTLPIPO02.CITRIX.COM with ESMTP; 15 Jan 2014 16:04:31 +0000 Received: from [IPv6:::1] (10.80.16.47) by smtprelay.citrix.com (10.13.107.79) with Microsoft SMTP Server id 14.2.342.4; Wed, 15 Jan 2014 11:04:30 -0500 Message-ID: <52D6B18D.4080204@citrix.com> Date: Wed, 15 Jan 2014 17:04:29 +0100 From: =?UTF-8?B?Um9nZXIgUGF1IE1vbm7DqQ==?= User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: Julian Stecklina Subject: Re: KVM Clock References: <52D5623C.7060401@freebsd.org> <52D68831.2000503@citrix.com> <1389793530.2668.13.camel@janus.xn--pl-wia.net> In-Reply-To: <1389793530.2668.13.camel@janus.xn--pl-wia.net> X-Enigmail-Version: 1.6 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-DLP: MIA1 Cc: freebsd-virtualization@freebsd.org X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Jan 2014 16:04:40 -0000 On 15/01/14 14:45, Julian Stecklina wrote: > On Mi, 2014-01-15 at 14:08 +0100, Roger Pau Monné wrote: >> On 15/01/14 13:05, Julian Stecklina wrote: >>> On 01/14/2014 05:13 PM, Peter Grehan wrote: >>>> Hi Julian, >>>> >>>>> is anyone working on KVM clock support for FreeBSD? If not, I >>>>> might take a shot at it. >>>> >>>> None I know of: go for it :) >>> >>> Works for me so far: >>> https://github.com/blitz/freebsd/commit/cdc5f872b3e48cc0dda031fc7d6bdedc65c3148f >> >> Looking >>> >> at the code it seems some common routines could be shared >> between the KVM PV clock and the Xen PV clock >> (sys/dev/xen/timer/timer.c). The data passed from the hypervisor to >> the guest has exactly the same format (see struct vcpu_time_info in >> Xen public headers). > > Yes, I know. Didn't get around to making it pretty yesterday evening. ;) > I'll post an updated patch, when I have some time. Any suggestions where > to put the two common functions? > >> At a first sight the KVM clock can benefit from using scale_delta >> (which is going to be faster than the C version implemented in >> kvmclock_get_timecount), > > At least somewhat on amd64. 32-bit looks pretty identical. > >> and xen_fetch_vcpu_tinfo is exactly the same >> as kvmclock_fetch. > > I think xen_fetch_vcpu_tinfo has a subtle bug: > 217 do { > 218 dst->version = src->version; > 219 rmb(); > 220 dst->tsc_timestamp = src->tsc_timestamp; > 221 dst->system_time = src->system_time; > 222 dst->tsc_to_system_mul = src->tsc_to_system_mul; > 223 dst->tsc_shift = src->tsc_shift; > 224 rmb(); > 225 } while ((src->version & 1) | (dst->version ^ > src->version)); > > In line 225 src->version is potentially read twice. If you consider the > following schedule: > > host starts updating data, sets src->version to 1 > guest reads src->version (1) and stores it into dst->version. > guest copies inconsistent data > guest reads src->version (1) and computes xor with dst->version. > host finishes updating data and sets src->version to 2 > guest reads src->version (2) and checks whether lower bit is not set. > while loop exits with inconsistent data! > > I think the C standard (at least C11) permits this as it does not > specify in which order the two reads in line 225 need to happen. According to the operator precedence and associativity in C (http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Operator_precedence), if I'm reading it right, the condition in the while line will be evaluated as follows (because of the left-to-right associativity of the | operator): 1. (src->version & 1) 2. (dst->version ^ src->version) 3. result of 1 | result of 2 So AFAICT the flow that you describe could never happen, because (src->version & 1) is always done before (dst->version ^ src->version). Roger.