Precision Time Protocol Improves Meta’s AI & Metaverse

Precision Time Protocol and Timestamps

Meta’s new Precision Time Protocol service improves their Artificial Intelligence (AI) and Metaverse plans with higher accuracy.

Meta is implementing Precision Time Protocol (PTP) so they can keep their technology running as efficiently as possible.

It has taken them years to re-engineer the way their timekeeping hardware and software operate within their servers and data centers.

Let’s first look at an example of a simple use case for highly accurate timekeeping before exploring the PTP protocol.

In a large distributed system, a client may write data and then immediately try to read it, and the write and the read may land on different back-end nodes. If the read hits a replica server that hasn’t received the most recent update, there is a chance that the user won’t see their own write.

It’s annoying at the very least; more importantly, it violates a linearizability guarantee, which is what lets you interact with a distributed system in the same way as you would with a single server.

The typical way to solve this problem is to issue multiple reads to different replicas and wait for a quorum decision. Unfortunately, this approach not only uses up additional resources but also significantly slows down the read because of the lengthy round trips between nodes.

By adding precise and reliable timestamps on the back end and the replicas, they can instead simply wait until the replica catches up with the read timestamp.

This both speeds up reads and reduces the computing power they require.

A critical condition for this design to work is that all clocks are synchronized, or that the offset between each clock and the source of time is known. Otherwise, timestamps taken on different machines cannot be compared reliably.

To handle this, Meta uses the concept of a window of uncertainty (WOU), which states with a high degree of certainty where a clock’s offset lies. With that information, a read can be blocked until the read timestamp plus the WOU has passed.
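
The blocking rule can be sketched in a few lines. This is only an illustration under assumed names and made-up numbers (the function names and the 100 ns window are hypothetical), not Meta’s implementation:

```python
def earliest_safe_read_ns(read_timestamp_ns: int, wou_ns: int) -> int:
    """Earliest local clock reading at which the read may be served."""
    return read_timestamp_ns + wou_ns


def wait_needed_ns(local_now_ns: int, read_timestamp_ns: int, wou_ns: int) -> int:
    """Nanoseconds the replica still has to block; 0 means it can serve now."""
    return max(0, earliest_safe_read_ns(read_timestamp_ns, wou_ns) - local_now_ns)


# Hypothetical numbers: a read stamped at 1_000_000 ns with a 100 ns WOU.
print(wait_needed_ns(1_000_040, 1_000_000, 100))  # 60
print(wait_needed_ns(1_000_250, 1_000_000, 100))  # 0
```

The smaller the WOU, the shorter the wait, which is why tighter clock synchronization translates directly into faster reads.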

Meta is not saying that PTP replaces NTP; rather, it is a good alternative where tighter synchronization is needed. In addition to offering much better precision, PTP also provides additional features such as support for IPv6, multi-homed hosts, and more.

Precise timekeeping has several other uses beyond this example. These include event tracing, cache invalidation, privacy violation detection, latency compensation in the Metaverse, and simultaneous execution in artificial intelligence.

These will help Meta continue to grow their business for years to come. They decided that their design could be broken down into three main components: (1) the PTP rack, (2) the networking infrastructure, and (3) the clients.

The PTP rack houses the hardware and software that serve time to clients. Each component has been carefully selected and thoroughly tested.

The GNSS (Global Navigation Satellite System) antenna is an easy component to overlook. But it’s actually where time originates, at least on Earth.

Meta strives for nanosecond accuracy, and if the GNSS receiver cannot accurately determine its position, it cannot calculate time accurately. Poor reception may cause a sizeable standard deviation in the 3D location fix, which leads to inaccurate time.

To achieve precise time measurements, your device must enter “time mode.” In this mode, the device takes multiple readings every second and averages them together to produce a much more accurate measurement.

The antenna needs an unobstructed view of the sky for good reception, and it pays to buy a high-quality antenna.

While trying out different antennas, one relatively new technology, GNSS over fiber, caught their eye. It avoids most of the usual drawbacks: it doesn’t conduct electricity, because it is powered by a laser over optical fiber, and the signal travels several kilometers without amplification.

Inside the facility, it can use existing structured cabling and LC (Lucent Connector) patch panels, which significantly reduces the complexity of the installation. Furthermore, the signal propagation delay through

Meta conducted tests and found that the total round-trip time between the Time Appliance and the device was typically a few hundred nanoseconds.

Then, they built their Time Appliance because existing solutions didn’t meet their needs.

But that was mostly done in the context of NTP. PTP brings even higher requirements and tighter constraints. Most importantly, they committed to reliably supporting up to one million clients per appliance without harming accuracy and precision. To do so, they took a critical look at many of the traditional components of the Time Appliance.

Meta has been working hard to protect its infrastructure against critical bugs and potential security issues. As part of this effort, they started experimenting with time cards from different vendors.

They’ve developed an application, oscillatord, which allows them to configure and monitor time cards from multiple vendors.

The tool configures GNSS receivers by applying a default configuration and then adjusting specific parameters. It can also simulate holdover by disabling specific constellations.

It also monitors the GNSS receivers, which track satellite signals for position, time, and navigation messages, and reports their status.

Different atomic clocks require different configurations and sequences of events. For example, the tool supports an SA53 TAU configuration that allows the clock to be disciplined quickly. Atomic clocks also require constant monitoring: parameters such as temperature and frequency lock need to be watched continuously, and quick decisions need to be taken if the values fall out of operational range.

Meta uses this oscillator monitoring data to determine whether a Time Appliance needs to be drained.

Meta’s ultimate goal is for protocols such as PTP to propagate over the packet network. And if the Time Card is the beating heart of the Time Appliance, then the network card is its face. Every time-sensitive PTP packet is hardware timestamped by the NIC, which means the NIC’s PTP hardware clock (PHC) needs to be accurately disciplined.

They could simply copy the time card’s clock values into the network interface controller (NIC). However, their tests showed they would lose at least 1–2 microseconds of accuracy during the transfer, because the values have to travel through PCI Express, the CPU, the memory hierarchy, and so forth. To avoid this, the time card and the NIC need a more direct way to stay aligned.

They used ts2phc, which first copies the time card’s clock values to the NIC and then aligns the clock edges using a pulse-per-second (PPS) signal. To accomplish this, they had to connect the time card’s PPS output to the NIC’s PPS input.

They keep an eye on the offset between the Time Card and the NIC and make sure it stays within a 50-nanosecond window.

They also monitor the PPS-out interface of the NIC to act as a fail-safe and to confirm that they know what is actually happening with the PHC on the NIC.
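
A check of this kind is simple to express. The sketch below is an illustration only: it assumes the 50-nanosecond figure means a symmetric window of plus or minus 50 ns around zero, and the sample offsets are made up.

```python
OFFSET_LIMIT_NS = 50  # figure from the article; the +/- interpretation is an assumption


def offset_in_window(offset_ns: float, limit_ns: float = OFFSET_LIMIT_NS) -> bool:
    """True if the Time Card to NIC offset is inside the allowed window."""
    return abs(offset_ns) <= limit_ns


def out_of_window(samples_ns, limit_ns: float = OFFSET_LIMIT_NS):
    """Return the samples that breach the window, in their original order."""
    return [s for s in samples_ns if not offset_in_window(s, limit_ns)]


samples = [3, -12, 48, -61, 7]  # hypothetical offsets in nanoseconds
print(out_of_window(samples))  # [-61]
```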

They found that none of the existing Precision Time Protocol servers could meet their needs at Meta’s scale. At best, those servers could support up to 50K clients per device, so at Meta’s scale they would have needed thousands of these devices.

Because PTP uses hardware timestamps, its server side doesn’t need to be a high-performance C program or even an ASIC-based appliance.

They then developed a scalable PTPv2 unicast PTP server in Go called ptp4u, which they released as open source. With some minor optimizations and modifications, they could scale to over one million concurrent clients per device. This was independently validated by an IEEE 1588v2 certified device.

They achieved this by using channels in Go that allow them to send subscription requests from one worker to another.
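
To give a feel for the pattern, here is a rough Python analogue in which `queue.Queue` stands in for a Go channel. It is not ptp4u’s code: the `Subscription` fields, the worker count, and the routing by client id are all assumptions made for illustration.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    client_id: int    # hypothetical client identifier
    interval_ms: int  # hypothetical message interval


def worker(inbox: queue.Queue, served: list) -> None:
    """Take ownership of every subscription received until a None sentinel arrives."""
    while True:
        sub = inbox.get()
        if sub is None:
            break
        served.append(sub.client_id)


def dispatch(subs, num_workers: int = 4):
    """Hand each subscription to exactly one worker's queue, keyed by client id."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    served = [[] for _ in range(num_workers)]
    threads = [threading.Thread(target=worker, args=(inboxes[i], served[i]))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for sub in subs:
        inboxes[sub.client_id % num_workers].put(sub)
    for inbox in inboxes:
        inbox.put(None)  # tell each worker to stop
    for t in threads:
        t.join()
    return served


result = dispatch([Subscription(i, 1000) for i in range(8)], num_workers=4)
print(result)  # [[0, 4], [1, 5], [2, 6], [3, 7]]
```

Because each subscription lives with exactly one worker, the workers never need to share or lock a common client table.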

ptp4u runs as a service on a Linux server, so they automatically get all the advantages of Linux, including IPv6 support, firewalls, etc., for free!

The ptp4u server has many configurable settings for passing dynamic parameters such as PTP clock accuracy, PTP clock class, and the UTC offset to clients.

The Future of Time

To regularly generate these parameters, Meta created a separate application called c4u (configuration for u), which continuously collects information from various data sources and generates the active configuration for ptp4u.

This allows them to be flexible and reactive if the environment ever changes. For instance, if one of the time appliances lost its GNSS signal, c4u would switch the ClockClass to HOLDOVER, and clients would promptly move away from that appliance. Meta also calculates ClockAccuracy from various sources, including the quality of the ts2phc sync, the atomic clock status, etc.
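
The holdover behavior can be sketched as follows. The clockClass values 6 (locked to a primary reference) and 7 (holdover) come from IEEE 1588; the function names, the server names, and the "lowest class wins" selection are simplifying assumptions, not c4u’s or the clients’ actual logic.

```python
CLOCK_CLASS_LOCKED = 6    # IEEE 1588: synchronized to a primary reference
CLOCK_CLASS_HOLDOVER = 7  # IEEE 1588: previously class 6, now in holdover


def clock_class(gnss_locked: bool) -> int:
    """Derive the advertised clockClass from the GNSS lock state."""
    return CLOCK_CLASS_LOCKED if gnss_locked else CLOCK_CLASS_HOLDOVER


def pick_server(advertised: dict) -> str:
    """Pick the server with the lowest (best) clockClass; ties go to the first name."""
    return min(advertised, key=lambda name: (advertised[name], name))


# Hypothetical appliances: ta1 has lost its GNSS signal, ta2 is still locked.
advertised = {"ta1": clock_class(False), "ta2": clock_class(True)}
print(pick_server(advertised))  # ta2
```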

Because they use TAI for time calculations, they get the correct UTC offset values from the package.

Meta wants to ensure that their time appliances are continuously and independently checked by an established, certified monitoring system.

Fortunately, they had already made significant progress in the NTP (Network Time Protocol) space with Calnex, so they could use a similar approach for PTP.

They worked with Calnex to adapt one of its devices for use in data centers. To accomplish this, they changed the physical form factor of the hardware along with the software.

Meta connects the Time Appliance NIC PPS-Out to the Calnex Sentinel. This enables them to monitor the PHC of the NIC with nanosecond accuracy.

Unicast mode is preferred over multicast mode for large data center deployments because it significantly simplifies the design of networks and reduces the number of required devices.

There is no doubt that Precision Time Protocol will greatly improve time accuracy for Meta’s future AI and Metaverse plans. We look forward to watching this technology continue to develop!