Date: Sun, 18 Aug 2019 22:53:51 +0200
From: Per Hedeland <per@hedeland.org>
To: Ian Lepore <ian@freebsd.org>
Cc: freebsd-arm@freebsd.org
Subject: Re: Is it a good idea to use a usb-serial adapter for PPS? Yes, it is.
Message-ID: <fe2c2d77-3030-6734-e1d8-c1375f231a24@hedeland.org>
In-Reply-To: <72a964c78cbfc36be2345919633ca2196f0783e3.camel@freebsd.org>
References: <alpine.BSF.2.21.99999.352.1908071046410.98975@autopsy.pc.athabascau.ca>
 <69a9bed3-4d0a-f8f6-91af-a8f7d84ee307@hedeland.org>
 <345bae77417c2495f55799b4c7ca2784f4ece9ed.camel@freebsd.org>
 <7312032d-2908-9414-0445-6b442c3a02e5@hedeland.org>
 <523b6f0a0fa5f2aeec298fa74df25d3c4af66acc.camel@freebsd.org>
 <0426fc8b-5398-d8ab-561e-7823c24403a5@hedeland.org>
 <24b0eaf25b64d6098b390df092866c69e352d859.camel@freebsd.org>
 <16c91be1-6f2a-b26d-22c7-be8e4ba8eec0@hedeland.org>
 <72a964c78cbfc36be2345919633ca2196f0783e3.camel@freebsd.org>
On 2019-08-18 21:27, Ian Lepore wrote:
> On Thu, 2019-08-15 at 23:05 +0200, Per Hedeland wrote:
>> On 2019-08-15 17:49, Ian Lepore wrote:
>>> On Thu, 2019-08-15 at 13:46 +0200, Per Hedeland wrote:
>>>> On 2019-08-09 22:17, Ian Lepore wrote:
>>>>> [...]
>>>>
>>>> I have a theory that your making the kernel clock be based on the 10
>>>> MHz clock also ended up locking the USB poll frequency to that clock,
>>>> and thus to the PPS signal - this would certainly explain the result.
>>>> Do you think this is a possibility? Would it be possible for you to
>>>> re-run the test without modifying the kernel clock? (I do understand
>>>> that the results will be harder to interpret with the drift, and
>>>> ntpd's correction of it, coming into play.)
>>>>
>>>> --Per
>>>>
>>>
>>> I'm not sure what you mean by "modifying the kernel clock". The kernel
>>> clock always runs on some frequency source. Typically it's derived
>>> from the cheap 24 MHz crystal that clocks the SoC, sometimes after
>>> being scaled up to 66 MHz by a phase-fractional PLL within the SoC. I
>>> arranged to use a very stable nearly-drift-free frequency source
>>> instead of a cheap crystal for counting time in the kernel.
>>>
>>> The kernel clock has nothing to do with usb, including polling
>>> intervals; the usb controller hardware handles that, and the root
>>> source clock for that is the cheap 24 MHz crystal.
>>
>> The thing that made me hypothesize that the kernel clock *could* have
>> *something* to do with the USB polling frequency was this observation
>> in https://blog.dan.drown.org/pps-over-usb (link provided by one of
>> the posters in the newsgroup, though he didn't refer specifically to
>> this):
>>
>>   Looking closer at the USB latency, you can see the PPS drifting
>>   relative to the host schedule of polling the USB device for its
>>   status. The system clock error was 2.215ppm during this time
>>   period, and this drift matches that error exactly.
>>   This probably means USB on this system shares the same clock as
>>   the system clock. This hardware is a Raspberry Pi 2, and I suspect
>>   it won't be true for other platforms.
>>
>> So at least on RPi 2, there appears to be a relation between the
>> "normal" system/kernel clock and the USB polling frequency. But I have
>> no idea if there is such a relation on the system you used, and even
>> in that case, *I* certainly can't see how using a different source for
>> the kernel clock could affect the USB polling frequency, which is why
>> I asked if you thought that it was a possibility...
>>
>
> I probably should have been clearer that I meant there was no
> correlation between the kernel clock and the usb polling on the system
> I was using as a testbed. On most SoCs, and probably even modern x86
> systems, the same frequency source (typically a 24MHz crystal) will be
> the root clock for both the usb controller hardware and the timer
> hardware from which the kernel clock is derived. However, the kernel
> clock is numerically steered to be more stable in frequency and
> accurate in phase, so once ntpd has been running for long enough to
> capture and discipline the kernel clock, the situation will change. The
> usb polling will still be happening at the drifting frequency of the
> underlying crystal, while the kernel timestamps used to mark the PPS
> pulse time will not be drifting at that rate.

Understood. What I don't understand is if, and if so how, your
"replacing" the kernel clock with an "exact" frequency from your 10
MHz clock might affect the USB polling.

> I have a hard time understanding how the measurements were made in that
> pps-over-usb page you cited. There is mention of a STM32F103
> microcontroller, but it's not clear to me what role it plays. There is
> also mention of a usb irq and something about a message in a buffer.

My understanding is that the "STM32F103 devboard" actually
*implements* the USB-to-serial adapter.
This means that the author has detailed insight into the workings of
the adapter, including the possibility to modify its firmware -
something that is obviously not the case for an off-the-shelf
adapter. The "PPS IRQ" is thus something that happens *inside* the
"adapter".

> For the measurements I made, I was using FTDI usb-serial devices
> directly connected to the usb bus on the Wandboard I was using to make
> measurements. When the DCD pin changes at the ftdi chip, the chip
> internally notes that it has a line-status change that must be
> communicated upstream at the next opportunity. When the time comes to
> send the data, it sends a 2-byte packet which contains the modem and
> line status register bits. (If there are any routine uart data bytes
> in the buffer, they also get transferred, but I'm not doing any data
> transfer on the adapters I'm using for this test, in fact the only pins
> connected are ground and DCD.) When the input packet arrives, the
> uftdi driver sees the change in the DCD bit and captures a pps event.
> For ftdi chips, all of the foregoing is done with usb BULK-IN
> transfers, not control or interrupt transfers.
>
> I need to run the same tests with some other brands of usb-serial
> adapters. I think I may have a cable laying around based on Prolific
> PL2303 chipset. If I can find it. I should just go buy a few
> other-brand breakout boards and test them.

FWIW, I did a bleak replica of your setup, using a "noname"
USB-to-serial adapter that I had laying around, which was actually
identified as

pi kernel: uplcom0: <Prolific Technology Inc. USB-Serial Controller, class 0/0, rev 1.10/3.00, addr 4> on usbus0

- and since the uplcom driver has "support for Prolific
PL-2303/2303X/2303HX", I assume it is one of those. Since I don't have
a "real" PPS source, I simulated one with a simple program running on
an RPi 3, that generated a pulse on a gpio pin at the turn of the
second.
This pin was then connected to a gpio pin on an RPi B, and to the DCD
pin on the above adapter, also connected to the RPi B - notably
without any ttl-to-rs232 converter (since I also don't have one of
those). I then set up ntpd on the RPi B to use gpiopps plus the PPS
from the uplcom driver:

server <lan host> iburst prefer
server 127.127.22.0 minpoll 4 maxpoll 4
fudge 127.127.22.0 refid gpio
server 127.127.22.1 minpoll 4 maxpoll 4 noselect
fudge 127.127.22.1 refid usb

The result was very nice for gpiopps, but the offset for the uplcom
PPS varied way more than the 1 ms that could be expected even from a
constant 1 ms polling interval (this is a 1.1 device), more like a 5-6
ms interval - some ntpq samples approximately 16 s apart:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*<lan host>      194.58.205.148   2 u   40   64  377    0.996   -0.049   0.034
oPPS(0)          .gpio.           0 l   14   16  377    0.000   -0.004   0.004
 PPS(1)          .usb.            0 l   13   16  377    0.000   -7.090   3.273

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*<lan host>      194.58.205.148   2 u   56   64  377    0.996   -0.049   0.034
oPPS(0)          .gpio.           0 l   14   16  377    0.000    0.000   0.004
 PPS(1)          .usb.            0 l   13   16  377    0.000   -2.957   2.567

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*<lan host>      194.58.205.148   2 u    6   64  377    0.996   -0.049   0.034
oPPS(0)          .gpio.           0 l   15   16  377    0.000   -0.001   0.004
 PPS(1)          .usb.            0 l   14   16  377    0.000   -8.627   4.871

This *could* be taken to imply that there was also some polling going
on *in* the adapter towards the DCD pin - especially since the above
was with a 10 ms pulse, while if I shortened the pulse to 1 ms, the
variation went down to ~ 1 ms, but almost half the pulses were
missed. Or it might be due to shaky detection due to the lack of a
ttl-to-rs232 converter.
In any case pretty inconclusive, other than the observation that it's
certainly possible to mess things up...:-)

>>> I think people are massively confused by usb. A usb 2.0 bus runs at
>>> 480MHz. That means the time to transmit a packet describing a usb
>>> serial pin-change event takes literally a dozen or so nanoseconds. The
>>> time it takes to transmit an entire sector of disk data is 2
>>> microseconds; even if continuous disk data is flowing, the usb serial
>>> adapter gets its round-robin opportunity to send a packet on the bus in
>>> between them.
>>
>> Yes, the transmission speed is obviously not a problem, the question
>> is about varying latency due to the polling.
>>
>>> A USB 2.0 bus spends most of its time idle. The
>>> devices on the bus are polled, but the polling happens in time slots
>>> that are 125 microseconds wide. There's just no reason for a lot of
>>> jitter or latency.
>>
>> In the newsgroup it was claimed that the polling frequency was 1 kHz
>> for USB 1.1 and 4 kHz for USB 2.0, but it seems it should indeed be 8
>> kHz for 2.0 "high" speed. And your test used one USB 1.1 device and
>> one 2.0 device.
>>
>> And "a lot" is a bit subjective, but for any polling at a frequency
>> that isn't an exact integral number of periods per second, there will
>> be a latency between the start of the PPS pulse and the detection in
>> the host that *varies* in an interval the size of the polling
>> interval. I believe that interval should thus be expected to be 1000
>> microseconds for 1.1 and 125 microseconds for 2.0.
>>
>
> I think there is some confusion around the concept of usb 1.x devices
> on a usb 2.0 bus. I think there may even be some confusion when a 1.x
> bus is involved. And then adding to the confusion is the likelihood
> that different usb-serial adapters use different usb transfer types
> (bulk vs interrupt) to communicate line-state changes.
>
> A usb 1.x bus is divided into 1ms frames. A 2.0 bus is divided into
> 125us (micro-)frames.
> For interrupt endpoints, a usb 1.x bus limits
> devices to 1 interrupt transfer per frame, and that may imply that
> there is up to 1ms of latency for reporting a DCD change on such a 1.x
> bus. A 2.0 bus allows up to 3 interrupt transfers per microframe,
> implying latency of up to 125us.
>
> However, there is no limit on either 1.x or 2.0 busses for how many
> bulk transfers can happen to a given endpoint during a frame. The
> controller needs to fill a frame with transactions in a way that first
> provides all the g'teed bandwidth that is promised to control,
> interrupt, and isochronous transfers. It is then free to fill all the
> remaining time with bulk transfers.
>
> To me, this implies that you may end up with nearly no latency (and
> negligible jitter) if you have a usb 2.0 bus that has just one or two
> devices on it which are communicating via bulk-transfer endpoints. The
> controller would be continuously sending BULK IN tokens to the one or
> two devices, so that as soon as one of them has data, it gets an
> opportunity to deliver it almost immediately (meaning within a few
> microseconds).
>
> The results I see with FTDI usb-serial adapters which use bulk
> transfers provide some evidence that my theory may be correct. I think
> the bus looks like this:
>
>   BULK IN token to usb 1.x device (do you have anything to say?)
>   1.x device NAKs
>   BULK IN token to usb 2.0 device
>   2.0 device NAKs
>   <no significant amount of time elapses here>
>   BULK IN token to usb 1.x device
>   1.x device NAKs
>   ... (repeat forever)
>
> In other words, the device(s) aren't getting 1 chance per frame to
> transfer data, they are getting many thousands of chances per second.
> I think the bus overhead of the BULK-IN token followed by a NAK from
> the device, along with the various framing bits and crc and all that
> probably adds up to less than 64 bytes per poll.
> But assuming it took as much as 64 bytes to do that, if there was one
> usb-serial device on a usb 2.0 bus, it would be getting asked about
> 1 million times per second whether it had anything to say.

This is extremely interesting - if it really is the case that the host
will poll "as fast as it can", as opposed to always doing it with a
fixed frequency, it would definitely change the picture. Unfortunately
I haven't seen any documentation to support that this is the case.

> I'd welcome input from low-level USB gurus about the bus and controller
> behavior in this regard.

I'm afraid we have lost freebsd-usb@ in this sub-thread (it was
actually the case before my first comment, but it would probably have
happened anyway, since I'm not subscribed to that list). Unless you
know that "low-level USB gurus" are also subscribed to freebsd-arm@,
it might make sense to forward your message to freebsd-usb@.

>> Your ntpq output showed an offset close to 200 microseconds for both
>> devices, and I *assumed* that it was more or less constant and thus
>> ntpd could trivially be told to correct for it - but maybe that
>> assumption was incorrect, there was only one instance of ntpq output?
>>
>> If it actually varied in an interval per above, I would expect the
>> jitter to be significantly higher though. And if it *is* more or less
>> constant, can you explain how this is possible? Even the 2.0 125
>> microsecond case should be clearly visible in the offset reported by
>> ntpd across a sequence of ntpq requests.
>>
>>> I'm not on a crusade to change the minds of people who make judgements
>>> based on gut feelings and reject objective measurements. I put the
>>> measurements out there, and I described the measurement methodology.
>>> (Precision timing is what I do for a living, btw.)
>>> I'm perfectly
>>> willing to explain the methodology in more detail or help interpret the
>>> results, but I'm not going to butt heads with people who just reject
>>> data they don't like for emotional reasons.
>>
>> Well, I guess a problem here is that it's my confused head that is
>> butted between yours and those of the supposedly-experts that
>> participate in the NTP newsgroup/maillist:-) - you already declined to
>> participate there, and I don't expect that any of them will take the
>> trouble to participate here. Maybe we'll just have to leave it at
>> that...
>
> I had a brief look at whether I could get posting access to the ntp
> newsgroup and didn't find anything easy to set up and use, and I'm
> reluctant to get an nntp provider and install a newsreader for one
> conversation.

Understood - there are free nntp providers, but you do need a
newsreader of some kind. The newsgroup is sort-of gatewayed to the
questions@lists.ntp.org mailing list, but the gatewaying is pretty
broken - posts to the mailing list do not appear at all in the
newsgroup, and posts to the newsgroup only appear on the mailing list
after manual approval by a moderator (at least unless you are
subscribed to the mailing list). Thus I believe most participants use
the newsgroup.

> (The 1980s me would be astounded to hear that future-me
> would have any reluctance to get involved in usenet.)

Well, there isn't much value there anymore, but there are some groups
that refuse to die.:-)

--Per
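
The latency argument in this thread (the delay between a PPS edge and
its detection varies over a window as wide as the polling interval)
can be sketched numerically. What follows is a minimal Python model,
not a measurement. It assumes, hypothetically, that the host polls at
a fixed nominal interval stretched by the crystal's frequency error,
and that an edge is noticed at the first poll at or after it. The
function names and the integer-picosecond representation are
illustrative choices and do not come from the thread. The second
function simply redoes the "64 bytes per poll" estimate for a
480 Mbit/s bus.

```python
PS_PER_SECOND = 10**12  # work in integer picoseconds to avoid float rounding


def detection_latencies_ps(nominal_interval_ps, drift_ppm, n_pulses):
    """Delay from each PPS edge (at 0 s, 1 s, 2 s, ...) to the next poll.

    Assumed model: polls happen at k * period, where the period is the
    nominal polling interval stretched by the oscillator error in ppm.
    """
    period_ps = nominal_interval_ps + round(nominal_interval_ps * drift_ppm / 1e6)
    latencies = []
    for second in range(n_pulses):
        edge_ps = second * PS_PER_SECOND
        polls_needed = -(-edge_ps // period_ps)  # ceiling division
        latencies.append(polls_needed * period_ps - edge_ps)
    return latencies


def polls_per_second(bus_bits_per_s, bytes_per_poll):
    """Upper bound on back-to-back polls if each one costs bytes_per_poll."""
    return bus_bits_per_s / (bytes_per_poll * 8)


if __name__ == "__main__":
    # 1 ms polling (USB 1.x frame) with the 2.215 ppm error quoted from the blog
    lat = detection_latencies_ps(10**9, 2.215, 1000)
    print("min latency (us):", min(lat) / 1e6)
    print("max latency (us):", max(lat) / 1e6)
    print("polls/s at 480 Mbit/s, 64 bytes each:", polls_per_second(480e6, 64))
```

Under these assumptions, 1 ms polling with a 2.215 ppm error makes the
latency step up by 2.215 us per pulse and wrap after roughly 450 s, so
over time it covers almost the whole 1 ms window. The 64-byte estimate
works out to 937500 polls per second, in line with the "about 1
million" figure, which in the same model would shrink the window to
about a microsecond.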