Date: Mon, 21 Feb 2011 17:24:05 +0100 From: Jerome Flesch <jerome.flesch@netasq.com> To: freebsd-current@freebsd.org Subject: Process timing issue Message-ID: <4D6291A5.4050206@netasq.com>
next in thread | raw e-mail | index | archive | help
This is a cryptographically signed message in MIME format.
--------------ms090401030701030108000803
Content-Type: multipart/mixed; boundary="------------070706020906040708070509"
This is a multi-part message in MIME format.
--------------070706020906040708070509
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: quoted-printable
Hello,
While investigating a timing issue with one of our program, we found out =
something weird: We've written a small test program that just calls=20
clock_gettime() a lot of times and checks that the time difference=20
between calls makes sense. In the end, it seems it doesn't always do.
Calling twice in a row clock_gettime() takes usually less than 1ms. But=20
with an average load, about 1 time in 200000, more than 10ms are spent=20
between both calls for no apparent reason. According to our tests, when=20
it happens, the time between both calls can go from few milliseconds to=20
many seconds (our best score so far is 10 seconds :). Same goes for=20
gettimeofday().
To reproduce this issue, we use the small test program joined to this=20
mail and openssl speed test commands (usually 'openssl speed -elapsed=20
dsa2048'). It only appears if one of these openssl speed test run on the =
same core than our test progam, so we have to start as many openssl=20
instances as there are cores on the test computer.
For instance:
- FreeBSD 7.3 + Core 2 duo
- FreeBSD 7.4/STABLE + Core 2 duo
- FreeBSD 8.1 + Core 2 duo
- FreeBSD 8.2-RC3 + Core 2 duo (clean setup)
- FreeBSD 9.0/CURRENT + Core 2 duo (clean setup, checked out last Friday)=
On all these setups, with 2 instances of openssl speed tests, on=20
average, for 2000000*2 calls to clock_gettime(), we got about 10 pauses=20
of between 100 and 300ms
- FreeBSD 8.2-RC1 + Xeon W3680 (6 cores):
With 12 instances of openssl speed tests, for 10000000*2 calls to=20
clock_gettime(), we got pauses of between 300ms and 10s.
- FreeBSD 6.3 + VIA C3:
With 1 instance of openssl, we got pauses of between 70ms and 300ms=
- FreeBSD 7.3 + VIA C7:
With 1 instance of openssl, we got pauses of between 150ms and 1s=20
(Note: this computer may be slightly more loaded then the previous one)
For comparison purpose, we also tried on 2 Linux systems:
- Linux 2.6.32.15 (Via nano, with a bunch of services running on it):
- 1 openssl speed tests + 2000000 iterations: max 10ms
- 10 openssl speed tests + 2000000 iterations: max 44ms
- Linux 2.6 (Core i7):
- 10 openssl speed tests + 2000000: max 1ms
We tried setting the test program to the highest priority possible=20
(rtprio(REALTIME, RTP_PRIO_MAX)) and it doesn't seem to change a thing.
Does anyone know if there is a reason for this behavior ? Is there=20
something that can be done to improve things ?
--------------070706020906040708070509
Content-Type: text/plain;
name="timecheck.c"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
filename="timecheck.c"
#include <sys/limits.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <err.h>
#include <sysexits.h>
int main(int argc, char *argv[])
{
struct timespec pre, cur;
useconds_t d, dmin, dmax, dsum;
int dweird =3D 0;
int i, loop;
if (argc !=3D 2) {
fprintf(stderr, "timecheck <loop>\n");
return EXIT_FAILURE;
}
loop =3D atoi(argv[1]);
dmax =3D dsum =3D 0;
dmin =3D UINT_MAX;
fprintf(stdout, "looping %d times...\n", loop);
for (i =3D 0; i < loop; i++) {
if (clock_gettime(CLOCK_MONOTONIC, &pre) < 0)
err(EX_OSERR, NULL);
if (clock_gettime(CLOCK_MONOTONIC, &cur) < 0)
err(EX_OSERR, NULL);
d =3D (cur.tv_sec - pre.tv_sec) * 1000000;
d =3D d + (cur.tv_nsec / 1000) - (pre.tv_nsec / 1000);
dsum +=3D d;
if (d < dmin)
dmin =3D d;
if (d > dmax)
dmax =3D d;
if (d > 10*1000) /* 10 ms */
dweird++;
cur =3D pre;
}
fprintf(stdout, "dmin =3D %dus\ndmax =3D %dus\ndmean =3D %dus\ndweird =
=3D %d\n",
dmin, dmax, dsum / loop, dweird);
return 0;
}
--------------070706020906040708070509--
--------------ms090401030701030108000803--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4D6291A5.4050206>
