Date: Fri, 5 Aug 2016 14:27:22 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Jung-uk Kim <jkim@freebsd.org> Cc: Bruce Evans <brde@optusnet.com.au>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r303733 - head/contrib/libpcap Message-ID: <20160805121548.M1029@besplex.bde.org> In-Reply-To: <764bf543-5f62-cc7b-1242-8beccdaecd61@FreeBSD.org> References: <201608032008.u73K8dWe047330@repo.freebsd.org> <20160804124849.Y1045@besplex.bde.org> <764bf543-5f62-cc7b-1242-8beccdaecd61@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 4 Aug 2016, Jung-uk Kim wrote: > On 08/03/16 11:40 PM, Bruce Evans wrote: >> On Wed, 3 Aug 2016, Jung-uk Kim wrote: >> >>> Log: >>> Support nanosecond time stamps for pcap_dispatch(3) and pcap_loop(3). >>> >>> Modified: >>> head/contrib/libpcap/pcap-bpf.c >>> >>> Modified: head/contrib/libpcap/pcap-bpf.c >>> ============================================================================== >>> >>> --- head/contrib/libpcap/pcap-bpf.c Wed Aug 3 19:23:22 2016 >>> (r303732) >>> +++ head/contrib/libpcap/pcap-bpf.c Wed Aug 3 20:08:39 2016 >>> (r303733) >>> @@ -1008,7 +1028,25 @@ pcap_read_bpf(pcap_t *p, int cnt, pcap_h >>> if (pb->filtering_in_kernel || >>> bpf_filter(p->fcode.bf_insns, datap, bhp->bh_datalen, >>> caplen)) { >>> struct pcap_pkthdr pkthdr; >>> +#ifdef BIOCSTSTAMP >>> + struct bintime bt; >>> + >>> + bt.sec = bhp->bh_tstamp.bt_sec; >>> + bt.frac = bhp->bh_tstamp.bt_frac; >> >> The names are very confusing since bt_sec and bt_frac are only misnamed as >> sec and frac in struct bintime. > > bt_* means BPF timestamp, not bintime. Yes, it has even larger naming bugs than I noticed before. It should have been a union or something. > ... >> Old code is even more confusing, and at least partly wrong. > ... >> X sys/net/bpf.c: struct timeval32 bh_tstamp; /* time stamp */ >> >> Banal comment. The complexities are from what sort of timestamp this is. >> It is obviously a timestamp. >> >> This bh_tstamp is in struct bpf_hdr32 for the !BURN_BRIDGES case. There >> is also struct timeval bh_timestamp in struct bpf_hdr. This header is >> bogusly marked Obsolete. > > It is superceded by struct bpf_xhdr although we kept it for backward > compatibility. New applications should avoid it. > >> X sys/net/bpf.c: hdr32_old.bh_tstamp.tv_usec = ts.bt_frac; >> >> This is in the !BURN_BRIDGES && COMPAT_FREEBSD32 case. Since struct >> timeval32 >> always has a 32-bit tv_usec, this assignment discards the most significant >> bits in bt_frac but keeps the noise. >> >> X sys/net/bpf.c: hdr_old.bh_tstamp.tv_usec = ts.bt_frac; >> >> This is in the !BURN_BRIDGES && !COMPAT_FREEBSD32 case. Since tv_sec in a >> normal timetamp is bogusly long, this accidentally preserves all of the >> bits >> in bt_frac on 64-bit arches. On 32-bit arches, it loses the signal as for >> the COMPAT_FREEBSD32 case. > > Note struct bpf_ts is not struct bintime. I missed that it is only used for the old standard timeval format here. My main complaints about bintimes in bpf are actually: - their existence - broken conversion between monotonic times and real times. There are not just bintimes, but 12 new time formats/resolutions/types/ qualities from supporting the closure of 3 formats/resolutions, 1 boolean MONOTONIC flags and 1 boolean FAST flag. One of these was the old standard timeval (!MONOTONIC, !FAST) format. Before this commit, none of the other 11 was referenced anywhere in /usr/src except in the implementation (net/bpf.[ch]) and the man page (bpf.4). This commit adds support for bintimes (!MONOTONIC, !FAST) in libpcap. libpcap now references a whole 1 of the BPF_T_ symbols (BPF_T_BINTIME) to implement this. The list of these symbols is quite readable in bpf.h, but the paragraph in the man page is almost unreadable since horrible names like BPF_T_MICROTIME_MONOTONIC_FAST are too hard to format -- they tend to come out as 1 or 2 words per line. Here is the broken conversion: X static void X bpf_bintime2ts(struct bintime *bt, struct bpf_ts *ts, int tstype) It starts with style bugs similar to other bintime code. X { X struct bintime bt2, boottimebin; X struct timeval tsm; X struct timespec tsn; X X if ((tstype & BPF_T_MONOTONIC) == 0) { X bt2 = *bt; X getboottimebin(&boottimebin); X bintime_add(&bt2, &boottimebin); X bt = &bt2; X } bpf now uses only monotonic bintimes as its primary format, and calls this function to convert. This function does the conversions with less efficiency and larger races than timecounter code. This addition of boottimebin is still fundamentally broken. Non-atomic accesses to boottimebin were only fixed recently. The above still has a race between reading the monotonic bintime add converting it to a real time. This race is now fixed in timecounter code. Races for leap seconds adjustments are also fixed there but not here. There are still large races for setting boottimebin. X switch (BPF_T_FORMAT(tstype)) { X case BPF_T_MICROTIME: X bintime2timeval(bt, &tsm); X ts->bt_sec = tsm.tv_sec; X ts->bt_frac = tsm.tv_usec; X break; X case BPF_T_NANOTIME: X bintime2timespec(bt, &tsn); X ts->bt_sec = tsn.tv_sec; X ts->bt_frac = tsn.tv_nsec; X break; X case BPF_T_BINTIME: X ts->bt_sec = bt->sec; X ts->bt_frac = bt->frac; X break; X } X } The others are just less efficient than direct timecounter conversion. The conversions are even impossible to do correctly in userland. It would be reasonable for userland to take only monotonic timestamps and convert them to real times much later. But the only way to do this is to record the value of boottimebin at every reading of the monotonic clock (without the races in the above), so it can be added later. It is just as non-easy to record the monotonic and real times taken atomically on every reading. I checked most uses of [get]boottimebin() and didn't find any correct ones. All uses are subject to races, and at best the addition breaks your monotonic time to give a real time perfectly broken for POSIX leap seconds, and if you want the latter why did you read the monotonic time in the first place? As in bpf, timecounter code would have given a more perfectly broken real time if used directly. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20160805121548.M1029>