Date: Thu, 2 Jun 2016 21:16:47 -0700 From: Maxim Sobolev <sobomax@FreeBSD.org> To: FreeBSD Current <freebsd-current@freebsd.org>, freebsd-threads@freebsd.org, freebsd-questions@freebsd.org, freebsd-arch@freebsd.org Subject: CLOCK_MONOTONIC / CLOCK_UPTIME is not really monotonic between threads Message-ID: <CAH7qZfsBLM_wqdTXiiyVvevwU%2Bu257dX9JvZFftfntduUaZDMw@mail.gmail.com>
next in thread | raw e-mail | index | archive | help
Hi there, we have an application here which is trying to measure UDP command/response round-trip-time. It runs two posix threads (more actually, but that's probably irrelevant), one (let's call it A) that does high-level logic and the second one (B) that does network packet I/O. The sending side is done by first thread A forming the request, then calling the function clock_gettime(CLOCK_MONOTONIC) and passing the packet into the thread B. Obtained timestamp is stored with some logical transaction ID allowing us to pull that stored value later on when response arrives. Then we have a separate process that receives those requests, processing them and sending back some form of response. Upon receiving a response from the network, the network I/O thread (B) timestamps it by running clock_gettime(CLOCK_MONOTONIC) and passes the packet data along with that value via queue to the thread A for processing. So if we put things into timeline, what our app does would probably look something like the following: 1. Thread A generates request. 2. A calls clock_gettime(CLOCK_MONOTONIC), storing value as t1 internally 3. A passes packet to thread B 4. B sends out packet via sendto() to server process running on the same box (fully separate, not a thread) [some microseconds later] 5. B receives response from server with recvfrom() 6. B instantly calls clock_gettime(CLOCK_MONOTONIC), assigns returned value to t2 7. B passes packet data along with t2 to the A via queue 8. A picks up packet, parses it and retrieves corresponding t1 stored at step 2. 9. A calculates RTT by doing t2 - t1 assuming it's going to be positive... As you might have guessed if you are still reading, from time to time t2 - t1 comes out slightly negative! Provided it's not some obscure bug in our app, there is no way this could happen if clock_gettime(CLOCK_MONOTONIC) would work as advertised. Event (2) could not possibly happen earlier than (6), which is guaranteed by the fact that the request needs to be processed by the external entity first in order for the response to be seen by our app at all. I've added some logs and it seems to be confirming that the server only sees a single request, there is no chance for the client to receive some other packet and confuse it. I've also confirmed with tcpdump, which shows reasonable time delay between request and reply of few hundreds microseconds. I've checked all logic and I could not find any mistakes on my end here, so I added some logging for such events. The distribution appears to be centered around 0.00006s, but there are some events that go as far up as 0.012s. I've also tried using CLOCK_UPTIME_PRECISE instead, but it makes no difference whatsoever. My questions therefore are: 1. Is it intended/expected behavior of the said API? 2. Has anyone else bumped into this? 3. I know we are doing some clever optimizations using TSC to speed those APIs up, could be it related to that? 4. If the answer for (3) is yes, then what is the method to disable using TSC and use slower but possibly more reliable method? 5. Is there any way/plan to fix this long-term? 6. If the behavior is intended/expected, what is the maximum discrepancy we can expect from this effect? In general some time ago we've spend quite lot of time switching our app from using REALTIME into MONOTIME in the hope to eliminate any wizardry needed to deal with realtime possibly jumping back and forth due to NTP, leap seconds etc, it's shame that this is not working either. Apart from measuring command processing delay, that app does lot of high-volume voip call billing, so even such small discrepancy can easily build up into a bigger problem, not to mention the fact that every time we deal with duration we now need to have some check in place to make sure we don't have some negative values popping our of nowhere. Any hints, suggestions, pointers are appreciated. I can also give more debug information as needed. Thanks! -Max
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAH7qZfsBLM_wqdTXiiyVvevwU%2Bu257dX9JvZFftfntduUaZDMw>