Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 15 Jan 2014 17:04:29 +0100
From:      =?UTF-8?B?Um9nZXIgUGF1IE1vbm7DqQ==?= <roger.pau@citrix.com>
To:        Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
Cc:        freebsd-virtualization@freebsd.org
Subject:   Re: KVM Clock
Message-ID:  <52D6B18D.4080204@citrix.com>
In-Reply-To: <1389793530.2668.13.camel@janus.xn--pl-wia.net>
References:  <lb3jnb$qo8$1@ger.gmane.org> <52D5623C.7060401@freebsd.org>	 <lb5thm$3r3$1@ger.gmane.org> <52D68831.2000503@citrix.com> <1389793530.2668.13.camel@janus.xn--pl-wia.net>

next in thread | previous in thread | raw e-mail | index | archive | help
On 15/01/14 14:45, Julian Stecklina wrote:
> On Mi, 2014-01-15 at 14:08 +0100, Roger Pau Monné wrote:
>> On 15/01/14 13:05, Julian Stecklina wrote:
>>> On 01/14/2014 05:13 PM, Peter Grehan wrote:
>>>> Hi Julian,
>>>>
>>>>> is anyone working on KVM clock support for FreeBSD? If not, I
>>>>> might take a shot at it.
>>>>
>>>> None I know of: go for it :)
>>>
>>> Works for me so far: 
>>> https://github.com/blitz/freebsd/commit/cdc5f872b3e48cc0dda031fc7d6bdedc65c3148f
>>
>> Looking
>>>
>> at the code it seems some common routines could be shared
>> between the KVM PV clock and the Xen PV clock
>> (sys/dev/xen/timer/timer.c). The data passed from the hypervisor to
>> the guest has exactly the same format (see struct vcpu_time_info in
>> Xen public headers).
> 
> Yes, I know. Didn't get around to making it pretty yesterday evening. ;)
> I'll post an updated patch, when I have some time. Any suggestions where
> to put the two common functions?
> 
>> At a first sight the KVM clock can benefit from using scale_delta
>> (which is going to be faster than the C version implemented in
>> kvmclock_get_timecount),
> 
> At least somewhat on amd64. 32-bit looks pretty identical.
> 
>>  and xen_fetch_vcpu_tinfo is exactly the same
>> as kvmclock_fetch.
> 
> I think xen_fetch_vcpu_tinfo has a subtle bug:
>   217         do {
>   218                 dst->version = src->version;
>   219                 rmb();
>   220                 dst->tsc_timestamp = src->tsc_timestamp;
>   221                 dst->system_time = src->system_time;
>   222                 dst->tsc_to_system_mul = src->tsc_to_system_mul;
>   223                 dst->tsc_shift = src->tsc_shift;
>   224                 rmb();
>   225         } while ((src->version & 1) | (dst->version ^
> src->version));
> 
> In line 225 src->version is potentially read twice. If you consider the
> following schedule:
> 
> host starts updating data, sets src->version to 1
> guest reads src->version (1) and stores it into dst->version.
> guest copies inconsistent data
> guest reads src->version (1) and computes xor with dst->version.
> host finishes updating data and sets src->version to 2
> guest reads src->version (2) and checks whether lower bit is not set.
> while loop exits with inconsistent data!
> 
> I think the C standard (at least C11) permits this as it does not
> specify in which order the two reads in line 225 need to happen.

According to the operator precedence and associativity in C
(http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Operator_precedence),
if I'm reading it right, the condition in the while line will be
evaluated as follows (because of the left-to-right associativity of the
| operator):

1. (src->version & 1)
2. (dst->version ^ src->version)
3. result of 1 | result of 2

So AFAICT the flow that you describe could never happen, because
(src->version & 1) is always done before (dst->version ^ src->version).

Roger.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?52D6B18D.4080204>