By Dan Drown in gps — Sep 19, 2014

Stratum 2 NTP over a Cable Modem

Goals

I have a Stratum 1 NTP server at home, and ideally I'd like to join the NTP pool as a server. The problem is that my home connection isn't up to the task: my IP is dynamic, the latency is asymmetric, and my upstream bandwidth is tiny. So it'd be great if the NTP server on my VM in the cloud could use my home NTP server as a clock source.

Plan

Write a program to do the NTP measurements and submit the results to ntpd
Use dynamic DNS, and have the program automatically switch to the new dynamic IP if the DNS record changes
Measure the request latency to avoid the upstream cable modem latency noise
Estimate the one way latency and subtract that (assumption: the one way latency doesn't change often)
Throw away any measurement that has a round trip time over 2x the minimum (currently around 5% of the samples)
Measure once per second
Every 20 seconds, remove the extreme samples and send the median to ntpd
Configure ntpd to accept the best sample every 64 seconds
For backup and comparison, configure the other two Stratum 1 NTP servers I run with a minpoll 6 (64 seconds), and add two more Stratum 1 NTP servers with a minpoll 10 (~17 minutes)
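
The filtering steps in this plan can be sketched roughly as follows. This is a minimal illustration, not the actual program: the Sample type, the function names, and the fixed one_way_latency argument are assumptions made for the example, timestamps are plain floats in seconds, and "extreme samples" is taken to mean one sample from each end of the sorted batch. Only the request leg (client to home server) is used for the offset, since the reply has to cross the noisy cable modem upstream.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Sample:
    t1: float  # client transmit time (client clock)
    t2: float  # server receive time (server clock)
    t3: float  # server transmit time (server clock)
    t4: float  # client receive time (client clock)

    @property
    def rtt(self) -> float:
        # Round trip time, excluding the server's processing time
        return (self.t4 - self.t1) - (self.t3 - self.t2)

    def request_offset(self, one_way_latency: float) -> float:
        # Offset (server clock minus client clock) from the request leg only
        return (self.t2 - self.t1) - one_way_latency


def filter_batch(samples, min_rtt, one_way_latency):
    """Return the median offset of a batch, or None if nothing survives."""
    # Drop samples whose round trip is more than 2x the minimum seen
    kept = [s for s in samples if s.rtt <= 2 * min_rtt]
    offsets = sorted(s.request_offset(one_way_latency) for s in kept)
    # Remove the extreme samples (assumed: one from each end)
    if len(offsets) > 2:
        offsets = offsets[1:-1]
    return median(offsets) if offsets else None
```

Calling filter_batch once every 20 seconds with the samples collected in that window would produce the value handed to ntpd.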

Experiment

A process on "vps" was set up to take 4 hours of NTP request latency samples, one per second. The clock on vps was sync'd to public timeservers with ~17 minute poll times. The clock on "sandfish" was sync'd to GPS. This graph is limited to a maximum of 20ms; samples over that amount are not shown.

Filtered Requests

The purple and cyan lines are the 90th and 3rd percentiles of the last 200 samples. The uneven percentiles are used based on the assumption that the error in the offset measurement will be positive more often than negative. They are recalculated every 20 samples, and are used to calculate the filtered mode (incorrectly labeled "filtered mean" in this graph) and filtered average. Both filtered statistics are only over the last 20 samples, while the last 200 samples are used to generate the filter. The filter moves at a slower pace to limit the effects of noise.
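
A rough sketch of this windowed percentile filter is shown here. It is an illustration under assumptions rather than the code behind the graph: the PercentileGate class, its parameter names, and the nearest-rank percentile method are all choices made for the example, and only the filtered average is computed.

```python
import math
from collections import deque
from statistics import mean


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(len(sorted_values) * pct / 100))
    return sorted_values[rank - 1]


class PercentileGate:
    def __init__(self, window=200, batch=20, low_pct=3, high_pct=90):
        self.history = deque(maxlen=window)  # last 200 samples define the filter
        self.batch = []                      # last 20 samples get filtered
        self.batch_size = batch
        self.low_pct = low_pct
        self.high_pct = high_pct

    def add(self, offset):
        """Add one offset; return the filtered average once per full batch."""
        self.history.append(offset)
        self.batch.append(offset)
        if len(self.batch) < self.batch_size:
            return None
        # Recalculate the percentile bounds from the longer history
        ordered = sorted(self.history)
        low = percentile(ordered, self.low_pct)
        high = percentile(ordered, self.high_pct)
        # Keep only the recent samples that fall inside the bounds
        kept = [x for x in self.batch if low <= x <= high]
        self.batch = []
        return mean(kept) if kept else None
```

Because the bounds come from 200 samples while the average comes from 20, a short burst of noisy samples shifts the output much less than it would with a filter built from the same 20 samples.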

Comparison

Putting this data into practice, I created an asymmetric NTP client to send the filtered samples to ntpd.

I configured ntpd to log both the filtered samples and the direct (unfiltered) samples. The dark blue line is the filtered offset, and the lighter blue line is the direct offset. The filtered offset has an order of magnitude less jitter.

filtered vs direct

The next graph compares the filtered NTP samples ("sandfish") with other Stratum 1 NTP servers. I've adjusted the offsets of the other NTP servers with a static +/- correction so that they are all within 1ms.

Stratum 1 NTP servers

The filtered NTP samples are still noisier than the other sources, but they all follow each other very closely. You can see the "lon" clock lost sync for roughly 6 hours, and then came back. This is because its antenna placement isn't great and it sometimes loses signal. 3ms in 6 hours is an error of about 138 parts per billion, which is reasonable holdover performance.
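
The holdover figure is simply the accumulated time error divided by the elapsed time. As a quick check of that arithmetic (the function name is made up for this example):

```python
def drift_ppb(error_seconds, interval_seconds):
    """Average frequency error, in parts per billion, implied by a time error."""
    return error_seconds / interval_seconds * 1e9


# 3 ms of accumulated error over 6 hours of holdover
print(round(drift_ppb(0.003, 6 * 3600), 1))  # prints 138.9
```

That comes out a little under 139 parts per billion, in line with the figure quoted above.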

Results

Lastly, a look at the local clock's performance.

NTP loopstats

The offset is within +/- 144us 90% of the time, and +/- 482us 99.98% of the time. On average, the clock wandered 86 parts per billion in 30 minutes, and wandered less than 185 parts per billion 90% of the time.
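
Bounds like these can be pulled out of ntpd's loopstats file, where the third whitespace-separated field of each line is the clock offset in seconds. The sketch below is one way to do it, not necessarily how the numbers above were produced: the function name and the nearest-rank percentile method are assumptions.

```python
import math


def offset_bound_us(loopstats_lines, pct):
    """Smallest bound (microseconds) containing pct% of the absolute offsets."""
    offsets = []
    for line in loopstats_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        # Third field is the offset in seconds; convert to microseconds
        offsets.append(abs(float(fields[2])) * 1e6)
    if not offsets:
        return None
    offsets.sort()
    rank = max(1, math.ceil(len(offsets) * pct / 100))
    return offsets[rank - 1]
```

Passing the lines of a loopstats file with pct=90 gives the bound that the offset stayed within 90% of the time.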