Date:      Sun, 18 Aug 2019 13:27:56 -0600
From:      Ian Lepore <ian@freebsd.org>
To:        Per Hedeland <per@hedeland.org>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: Is it a good idea to use a usb-serial adapter for PPS? Yes, it is.
Message-ID:  <72a964c78cbfc36be2345919633ca2196f0783e3.camel@freebsd.org>
In-Reply-To: <16c91be1-6f2a-b26d-22c7-be8e4ba8eec0@hedeland.org>
References:  <alpine.BSF.2.21.99999.352.1908071046410.98975@autopsy.pc.athabascau.ca> <69a9bed3-4d0a-f8f6-91af-a8f7d84ee307@hedeland.org> <345bae77417c2495f55799b4c7ca2784f4ece9ed.camel@freebsd.org> <7312032d-2908-9414-0445-6b442c3a02e5@hedeland.org> <523b6f0a0fa5f2aeec298fa74df25d3c4af66acc.camel@freebsd.org> <0426fc8b-5398-d8ab-561e-7823c24403a5@hedeland.org> <24b0eaf25b64d6098b390df092866c69e352d859.camel@freebsd.org> <16c91be1-6f2a-b26d-22c7-be8e4ba8eec0@hedeland.org>

On Thu, 2019-08-15 at 23:05 +0200, Per Hedeland wrote:
> On 2019-08-15 17:49, Ian Lepore wrote:
> > On Thu, 2019-08-15 at 13:46 +0200, Per Hedeland wrote:
> > > On 2019-08-09 22:17, Ian Lepore wrote:
> > > > [...]
> > > 
> > > I have a theory that your making the kernel clock be based on the 10
> > > MHz clock also ended up locking the USB poll frequency to that clock,
> > > and thus to the PPS signal - this would certainly explain the result.
> > > Do you think this is a possibility? Would it be possible for you to
> > > re-run the test without modifying the kernel clock? (I do understand
> > > that the results will be harder to interpret with the drift, and
> > > ntpd's correction of it, coming into play.)
> > > 
> > > --Per
> > > 
> > 
> > I'm not sure what you mean by "modifying the kernel clock".  The kernel
> > clock always runs on some frequency source.  Typically it's derived
> > from the cheap 24 MHz crystal that clocks the SoC, sometimes after
> > being scaled up to 66 MHz by a phase-fractional PLL within the SoC.  I
> > arranged to use a very stable nearly-drift-free frequency source
> > instead of a cheap crystal for counting time in the kernel.
> > 
> > The kernel clock has nothing to do with usb, including polling
> > intervals; the usb controller hardware handles that, and the root
> > source clock for that is the cheap 24 MHz crystal.
> 
> The thing that made me hypothesize that the kernel clock *could* have
> *something* to do with the USB polling frequency was this observation
> in https://blog.dan.drown.org/pps-over-usb (link provided by one of
> the posters in the newsgroup, though he didn't refer specifically to
> this):
> 
>     Looking closer at the USB latency, you can see the PPS drifting
>     relative to the host schedule of polling the USB device for its
>     status. The system clock error was 2.215ppm during this time
>     period, and this drift matches that error exactly. This probably
>     means USB on this system shares the same clock as the system
>     clock. This hardware is a Raspberry Pi 2, and I suspect it won't be
>     true for other platforms.
> 
> So at least on RPi 2, there appears to be a relation between the
> "normal" system/kernel clock and the USB polling frequency. But I have
> no idea if there is such a relation on the system you used, and even
> in that case, *I* certainly can't see how using a different source for
> the kernel clock could affect the USB polling frequency, which is why
> asked if you thought that it was a possibility...
> 

I probably should have been clearer that I meant there was no
correlation between the kernel clock and the usb polling on the system
I was using as a testbed.  On most SoCs, and probably even modern x86
systems, the same frequency source (typically a 24MHz crystal) will be
the root clock for both the usb controller hardware and the timer
hardware from which the kernel clock is derived.  However, the kernel
clock is numerically steered to be more stable in frequency and
accurate in phase, so once ntpd has been running for long enough to
capture and discipline the kernel clock, the situation will change.  The
usb polling will still be happening at the drifting frequency of the
underlying crystal, while the kernel timestamps used to mark the PPS
pulse time will not be drifting at that rate.
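
As a rough illustration of that drift, here is a minimal Python sketch
of how quickly the polling schedule would slide against a disciplined
clock.  It borrows the 2.215 ppm figure from the quoted blog post and
assumes a 125us polling interval; the function name is made up for
this example and none of this was measured on the testbed.

```python
def seconds_to_sweep(poll_interval_s, drift_ppm):
    """Time for the poll phase to slide through one full poll interval
    relative to a drift-free reference clock.

    drift_ppm is the frequency error of the crystal that paces the
    polling, in parts per million (an assumed input, not a measurement).
    """
    slip_per_second = drift_ppm * 1e-6  # seconds of phase slip per second
    return poll_interval_s / slip_per_second


# 125 us interval, 2.215 ppm error: the phase wraps roughly once a minute.
print(round(seconds_to_sweep(125e-6, 2.215), 1))
```

With those assumed numbers the poll phase walks through a whole
interval in a bit under a minute, so any polling-induced latency would
show up as a slow sawtooth rather than as random noise.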

I have a hard time understanding how the measurements were made in that
pps-over-usb page you cited.  There is mention of a STM32F103
microcontroller, but it's not clear to me what role it plays.  There is
also mention of a usb irq and something about a message in a buffer.

For the measurements I made, I was using FTDI usb-serial devices
directly connected to the usb bus on the Wandboard I was using to make
measurements.  When the DCD pin changes at the ftdi chip, the chip
internally notes that it has a line-status change that must be
communicated upstream at the next opportunity.  When the time comes to
send the data, it sends a 2-byte packet which contains the modem and
line status register bits.  (If there are any routine uart data bytes
in the buffer, they also get transferred, but I'm not doing any data
transfer on the adapters I'm using for this test, in fact the only pins
connected are ground and DCD.)  When the input packet arrives, the
uftdi driver sees the change in the DCD bit and captures a pps event. 
For ftdi chips, all of the foregoing is done with usb BULK-IN
transfers, not control or interrupt transfers.
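
The DCD detection described above can be sketched roughly as follows.
This is an illustrative Python model, not the uftdi driver code.  The
bit masks and the byte order (modem status first, line status second)
are recalled from the FTDI status header and should be treated as
assumptions to verify against the driver source or the chip
documentation; the function name is hypothetical.

```python
# Assumed layout of the first status byte (modem status).  Verify before
# relying on these values.
FTDI_MSR_CTS = 0x10
FTDI_MSR_DSR = 0x20
FTDI_MSR_RI = 0x40
FTDI_MSR_DCD = 0x80  # also called RLSD


def dcd_edges(packets, initial_dcd=False):
    """Yield (packet_index, new_dcd_state) whenever the DCD bit differs
    from its previous value.  A driver would capture a pps event here."""
    prev = initial_dcd
    for index, packet in enumerate(packets):
        if len(packet) < 2:
            continue  # too short to carry the 2-byte status header
        dcd = bool(packet[0] & FTDI_MSR_DCD)
        if dcd != prev:
            yield index, dcd
            prev = dcd


# Four status-only packets: DCD rises in the second and falls in the fourth.
sample = [b"\x01\x60", b"\x81\x60", b"\x81\x60", b"\x01\x60"]
print(list(dcd_edges(sample)))
```

In a real driver the timestamp would be taken at the moment the
packet arrives, so the latency of interest is the time between the pin
change and the delivery of this packet.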

I need to run the same tests with some other brands of usb-serial
adapters.  I think I may have a cable lying around based on the
Prolific PL2303 chipset, if I can find it.  I should just go buy a few
other-brand breakout boards and test them.

> > I think people are massively confused by usb.  A usb 2.0 bus runs at
> > 480MHz.  That means the time to transmit a packet describing a usb
> > serial pin-change event takes literally a dozen or so nanoseconds.  The
> > time it takes to transmit an entire sector of disk data is 2
> > microseconds; even if continuous disk data is flowing, the usb serial
> > adapter gets its round-robin opportunity to send a packet on the bus in
> > between them.
> 
> Yes, the transmission speed is obviously not a problem, the question
> is about varying latency due to the polling.
> 
> > A USB 2.0 bus spends most of its time idle.  The
> > devices on the bus are polled, but the polling happens in time slots
> > that are 125 microseconds wide.  There's just no reason for a lot of
> > jitter or latency.
> 
> In the newsgroup it was claimed that the polling frequency was 1 kHz
> for USB 1.1 and 4 kHz for USB 2.0, but it seems it should indeed be 8
> kHz for 2.0 "high" speed. And your test used one USB 1.1 device and
> one 2.0 device.
> 
> And "a lot" is a bit subjective, but for any polling at a frequency
> that isn't an exact integral number of periods per second, there will
> be a latency between the start of the PPS pulse and the detection in
> the host that *varies* in an interval the size of the polling
> interval. I believe that interval should thus be expected to be 1000
> microseconds for 1.1 and 125 microseconds for 2.0.
> 

I think there is some confusion around the concept of usb 1.x devices
on a usb 2.0 bus.  I think there may even be some confusion when a 1.x
bus is involved.  And then adding to the confusion is the likelihood
that different usb-serial adapters use different usb transfer types
(bulk vs interrupt) to communicate line-state changes.

A usb 1.x bus is divided into 1ms frames.  A 2.0 bus is divided into
125us (micro-)frames.  For interrupt endpoints, a usb 1.x bus limits
devices to 1 interrupt transfer per frame, and that may imply that
there is up to 1ms of latency for reporting a DCD change on such a 1.x
bus.  A 2.0 bus allows up to 3 interrupt transfers per microframe,
implying latency of up to 125us.
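
The latency bounds for interrupt endpoints can be sketched as below.
This is a simplified Python model that assumes one reporting
opportunity at the start of each frame or microframe, at exact
multiples of the slot length; the names are made up for illustration
and real controller scheduling is more involved.

```python
FRAME_NS = 1_000_000     # usb 1.x frame: 1 ms
MICROFRAME_NS = 125_000  # usb 2.0 microframe: 125 us


def wait_for_next_slot_ns(event_ns, slot_ns):
    """Nanoseconds from an event until the start of the next slot,
    assuming slots begin at exact multiples of slot_ns (idealized)."""
    return (-event_ns) % slot_ns


# A DCD edge 1 ns after a slot starts waits almost the whole slot.
print(wait_for_next_slot_ns(1, FRAME_NS))       # 999999
print(wait_for_next_slot_ns(1, MICROFRAME_NS))  # 124999
```

Under this model the wait falls anywhere between zero and one slot
length, which is where the "up to 1ms" and "up to 125us" figures come
from.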

However, there is no limit on either 1.x or 2.0 busses for how many
bulk transfers can happen to a given endpoint during a frame.  The
controller needs to fill a frame with transactions in a way that first
provides all the guaranteed bandwidth that is promised to control,
interrupt, and isochronous transfers.  It is then free to fill all the
remaining time with bulk transfers.

To me, this implies that you may end up with nearly no latency (and
negligible jitter) if you have a usb 2.0 bus that has just one or two
devices on it which are communicating via bulk-transfer endpoints.  The
controller would be continuously sending BULK IN tokens to the one or
two devices, so that as soon as one of them has data, it gets an
opportunity to deliver it almost immediately (meaning within a few
microseconds).

The results I see with FTDI usb-serial adapters which use bulk
transfers provide some evidence that my theory may be correct.  I think
the bus looks like this:

  BULK IN token to usb 1.x device (do you have anything to say?)
  1.x device NAKs
  BULK IN token to usb 2.0 device
  2.0 device NAKs
  <no significant amount of time elapses here>
  BULK IN token to usb 1.x device
  1.x device NAKs
  ... (repeat forever)

In other words, the device(s) aren't getting 1 chance per frame to
transfer data, they are getting many thousands of chances per second.
 I think the bus overhead of the BULK-IN token followed by a NAK from
the device, along with the various framing bits and crc and all that
probably adds up to less than 64 bytes per poll.  But assuming it took
as much as 64 bytes to do that, if there was one usb-serial device on a
usb 2.0 bus, it would be getting asked about 1 million times per second
whether it had anything to say.
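
The arithmetic behind that estimate, as a small Python sketch.  It
assumes the full 480 Mbit/s signalling rate is spent on nothing but
BULK-IN polls of 64 bytes each, which ignores start-of-frame packets
and any other bus traffic, so it is an upper bound rather than a
prediction.  The function name is hypothetical.

```python
HIGH_SPEED_BITS_PER_S = 480_000_000  # usb 2.0 high-speed signalling rate


def polls_per_second(bytes_per_poll, devices=1):
    """Upper bound on how often each device is polled, assuming the bus
    carries only these polls and they are shared round-robin."""
    bits_per_round = bytes_per_poll * 8 * devices
    return HIGH_SPEED_BITS_PER_S / bits_per_round


print(polls_per_second(64))             # 937500.0
print(polls_per_second(64, devices=2))  # 468750.0
```

That works out to one poll roughly every microsecond for a single
device, or about every two microseconds with two devices sharing the
bus.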

I'd welcome input from low-level USB gurus about the bus and controller
behavior in this regard.

> Your ntpq output showed an offset close to 200 microseconds for both
> devices, and I *assumed* that it was more or less constant and thus
> ntpd could trivially be told to correct for it - but maybe that
> assumption was incorrect, there was only one instance of ntpq output?
> 
> If it actually varied in an interval per above, I would expect the
> jitter to be significantly higher though. And if it *is* more or less
> constant, can you explain how this is possible? Even the 2.0 125
> microsecond case should be clearly visible in the offset reported by
> ntpd across a sequence of ntpq requests.
> 
> > I'm not on a crusade to change the minds of people who make judgements
> > based on gut feelings and reject objective measurements.  I put the
> > measurements out there, and I described the measurement methodology.
> > (Precision timing is what I do for a living, btw.)  I'm perfectly
> > willing to explain the methodology in more detail or help interpret the
> > results, but I'm not going to butt heads with people who just reject
> > data they don't like for emotional reasons.
> 
> Well, I guess a problem here is that it's my confused head that is
> butted between yours and those of the supposedly-experts that
> participate in the NTP newsgroup/maillist:-) - you already declined to
> participate there, and I don't expect that any of them will take the
> trouble to participate here. Maybe we'll just have to leave it at
> that...

I had a brief look at whether I could get posting access to the ntp
newsgroup and didn't find anything easy to set up and use, and I'm
reluctant to get an nntp provider and install a newsreader for one
conversation.  (The 1980s me would be astounded to hear that future-me
would have any reluctance to get involved in usenet.)

-- Ian




