NTP server misuse and abuse

Misuse of a Network Time Protocol (NTP) server ranges from flooding it with traffic (effectively a DDoS attack) to violating the server's access policy or the NTP rules of engagement. One incident was branded NTP vandalism in an open letter from Poul-Henning Kamp to the router manufacturer D-Link in 2006.[1] Others later extended the term retroactively to cover earlier incidents. There is, however, no evidence that any of these problems were deliberate vandalism; they were more usually caused by shortsighted or poorly chosen default configurations.

A deliberate form of NTP server abuse came to light at the end of 2013, when NTP servers were used in amplification denial-of-service attacks. Some NTP servers would respond to a single "monlist" UDP request packet with packets describing up to 600 associations. By sending a request with a spoofed source IP address, attackers could direct an amplified stream of packets at a victim's network. This resulted in one of the largest distributed denial-of-service attacks known at the time.[2]

Common NTP client problems

The most troublesome problems have involved NTP server addresses hardcoded in the firmware of consumer networking devices. Major manufacturers and OEMs have produced hundreds of thousands of such devices, and customers almost never upgrade their firmware, so NTP query storms will persist for as long as the devices remain in service.

One particularly common NTP software error is to generate query packets at short (less than five-second) intervals until a response is received:

  • When placed behind aggressive firewalls that block the server responses, this implementation leads to a never-ending stream of client requests to the blocked NTP servers.
  • Such over-eager clients (particularly those polling once per second) commonly make up more than 50% of the traffic of public NTP servers, despite being a minuscule fraction of the total clients.

While it might be technically reasonable to send a few initial packets at short intervals, it is essential for the health of any network that client connection re-attempts are generated at logarithmically or exponentially decreasing rates to prevent denial of service.

This in-protocol exponential or logarithmic backoff applies to any connectionless protocol and, by extension, to many portions of connection-based protocols. Examples of this backing-off method can be found in the TCP specification for connection establishment, zero-window probing, and keepalive transmissions.
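The difference between fixed-interval retries and exponential backoff can be illustrated with a minimal Python sketch. The 2-second initial delay, the doubling factor, and the 1,024-second ceiling are illustrative assumptions rather than values taken from any specification, and the function names are hypothetical.

```python
import itertools


def retry_delays(initial=2.0, factor=2.0, ceiling=1024.0):
    """Yield successive waits in seconds: initial, initial*factor, ... capped at ceiling.

    All three defaults are assumptions chosen for illustration.
    """
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, ceiling)


def requests_sent(duration, delays):
    """Count the requests an unanswered client sends within `duration` seconds.

    The first request goes out at t=0; each later one follows the next delay.
    """
    elapsed, count = 0.0, 0
    for delay in delays:
        if elapsed >= duration:
            break
        count += 1
        elapsed += delay
    return count


one_hour = 3600
fixed = requests_sent(one_hour, itertools.repeat(1.0))  # polls once per second
backed_off = requests_sent(one_hour, retry_delays())    # doubles the wait each time
print(fixed, backed_off)
```

Under these assumptions, a client that never receives a reply sends 12 requests in its first hour with backoff, compared with 3,600 when polling once per second.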

Notable cases

Tardis and Trinity College, Dublin

In October 2002, one of the earliest known cases of time server misuse resulted in problems for a web server at Trinity College, Dublin. The traffic was traced to misbehaving copies of a program called Tardis;[3] thousands of copies around the world were contacting the web server and obtaining a timestamp via HTTP. Ultimately, the solution was to modify the web server configuration so as to deliver a customized version of the home page (greatly reduced in size) and to return a bogus time value, which caused most of the clients to choose a different time server.[4]

Netgear and the University of Wisconsin–Madison

The first widely known case of NTP server problems began in May 2003, when Netgear's hardware products flooded the University of Wisconsin–Madison's NTP server with requests.[5] University personnel initially assumed this was a malicious distributed denial-of-service attack and took action to block the flood at their network border. Rather than abating (as most DDoS attacks do), the flow increased, reaching 250,000 packets per second (150 megabits per second) by June. Subsequent investigation revealed that four models of Netgear routers were the source of the problem. The SNTP (Simple NTP) client in the routers had two serious flaws. First, it relied on a single NTP server (at the University of Wisconsin–Madison) whose IP address was hard-coded in the firmware. Second, it polled the server at one-second intervals until it received a response. A total of 707,147 products with the faulty client were produced.

Netgear released firmware updates for the affected products (DG814, HR314, MR814 and RP614) which query Netgear's own servers, poll only once every ten minutes, and give up after five failures. While this update fixes the flaws in the original SNTP client, it does not solve the larger problem: most consumers will never update their router's firmware, particularly if the device seems to be operating properly.
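The traffic figures reported for this incident can be cross-checked with some simple arithmetic, sketched here in Python. The reading of "150 megabits per second" as 150 × 10^6 bits and the assumption that every packet came from a stuck router sending exactly one query per second are simplifications made for illustration; the variable names are hypothetical.

```python
PACKETS_PER_SECOND = 250_000    # peak rate reported for June 2003
BITS_PER_SECOND = 150_000_000   # assumption: "150 megabits" read as 150 * 10**6 bits
AFFECTED_UNITS = 707_147        # products manufactured with the faulty client

# Average size of one packet implied by the two reported rates.
bytes_per_packet = BITS_PER_SECOND / PACKETS_PER_SECOND / 8

# Assumption: every packet came from a router polling once per second, so the
# packet rate equals the number of routers that were polling at that moment.
implied_polling_routers = PACKETS_PER_SECOND
share_of_units = implied_polling_routers / AFFECTED_UNITS

print(f"{bytes_per_packet:.0f} bytes per packet")
print(f"{share_of_units:.1%} of manufactured units polling at once")
```

The implied 75 bytes per packet is close to the 76 bytes of a 48-byte NTP message carried in UDP (8-byte header) over IPv4 (20-byte header without options), which suggests the two reported rates are consistent with each other.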

SMC and CSIRO

Also in 2003, another case forced the NTP servers of the Australian Commonwealth Scientific and Industrial Research Organisation's (CSIRO) National Measurement Laboratory to close to the public.[6] The traffic was shown to come from a bad NTP implementation in some SMC router models in which the IP address of the CSIRO server was embedded in the firmware. SMC released firmware updates for the products; the 7004VBR and 7004VWBR models are known to be affected.

D-Link and Poul-Henning Kamp

In 2005, Poul-Henning Kamp, the manager of the only Danish Stratum 1 NTP server available to the general public, observed a huge rise in traffic and discovered that between 75% and 90% of it originated from D-Link's router products. Stratum 1 NTP servers receive their time signal from an accurate external source, such as a GPS receiver, radio clock, or calibrated atomic clock. By convention, Stratum 1 time servers should only be used by applications requiring extremely precise time measurements, such as scientific applications or Stratum 2 servers with a large number of clients.[7] A home networking router does not meet either of these criteria. In addition, the access policy of Kamp's server explicitly limited it to servers directly connected to the Danish Internet Exchange (DIX). The direct use of this and other Stratum 1 servers by D-Link's routers resulted in a huge rise in traffic, increasing bandwidth costs and server load.

In many countries, official timekeeping services are provided by a government agency (such as NIST in the U.S.). Since there is no Danish equivalent, Kamp provides his time service "pro bono publico". In return, DIX agreed to provide a free connection for his time server under the assumption that the bandwidth involved would be relatively low, given the limited number of servers and potential clients. With the increased traffic caused by the D-Link routers, DIX requested he pay a yearly connection fee of 54,000 DKK[citation needed] (approximately US$9,920 or €7,230[8][9]).

Kamp contacted D-Link in November 2005, hoping to get the company to fix the problem and to compensate him for the time and money he had spent tracking it down and for the bandwidth charges caused by D-Link products. The company denied any problem, accused him of extortion, and offered an amount in compensation which Kamp asserted did not cover his expenses. On 7 April 2006, Kamp posted the story on his website.[10] The story was picked up by Slashdot, Reddit and other news sites. After going public, Kamp realized that D-Link routers were directly querying other Stratum 1 time servers, violating the access policies of at least 43 of them in the process. On 27 April 2006, D-Link and Kamp announced that they had "amicably resolved" their dispute.[11]

IT providers and swisstime.ethz.ch

For over 20 years, ETH Zurich provided open access to the time server swisstime.ethz.ch for operational time synchronization. Because of excessive bandwidth usage, averaging upwards of 20 GB per day, it became necessary to direct external usage to public time server pools such as ch.pool.ntp.org. Misuse, caused mostly by IT providers synchronizing their client infrastructures, placed unusually high demands on the network and led ETH to restrict the service. In fall 2012, swisstime.ethz.ch was changed to closed access, and since the beginning of July 2013 access to the server has been blocked entirely for NTP.

Snapchat on iOS

In December 2016, the operator community NTPPool.org noticed a significant increase in NTP traffic, starting December 13.[12]

Investigation showed that the Snapchat application running on iOS was prone to querying all NTP servers that were hardcoded into a third-party iOS NTP library, and that a request to a Snapchat-owned domain followed the NTP request flood.[13] After Snap Inc. was contacted,[14] its developers resolved the problem within 24 hours of being notified by releasing an update to the application.[15] As an apology and to assist in dealing with the load they had generated, Snap also contributed time servers to the Australia and South America NTP pools.[16]

The library's error-prone default settings were improved[17] after feedback from the NTP community.[18][19][full citation needed]

TP-Link Wi-Fi extenders

Firmware for TP-Link Wi-Fi extenders in 2016 and 2017 hardcoded five NTP servers, including Fukuoka University in Japan and the Australia and New Zealand NTP server pools, and would repeatedly issue one NTP request and five DNS requests every five seconds, consuming 0.72 GB per month per device.[20] These excessive requests powered an Internet connectivity check that displayed the device's connectivity status in its web administration interface.[20]
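The reported per-device figures can be put into perspective with a short calculation, sketched here in Python. The 30-day month and the reading of 1 GB as 10^9 bytes are assumptions made for illustration, and the variable names are hypothetical.

```python
SECONDS_PER_CYCLE = 5          # one burst of requests every five seconds
REQUESTS_PER_CYCLE = 1 + 5     # one NTP request plus five DNS requests
BYTES_PER_MONTH = 0.72e9       # assumption: 0.72 GB read as 0.72 * 10**9 bytes
DAYS_PER_MONTH = 30            # assumption: a 30-day month

cycles_per_month = DAYS_PER_MONTH * 86_400 // SECONDS_PER_CYCLE
requests_per_month = cycles_per_month * REQUESTS_PER_CYCLE

# Average traffic attributed to each five-second cycle, requests and replies combined.
bytes_per_cycle = BYTES_PER_MONTH / cycles_per_month

print(cycles_per_month, requests_per_month, round(bytes_per_cycle))
```

Under these assumptions, each device runs 518,400 check cycles and issues about 3.1 million requests per month, with roughly 1.4 kB of traffic per cycle.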

The issue was acknowledged by TP-Link's branch in Japan, which pushed the company to redesign the connectivity test and issue firmware updates addressing the issue for affected devices.[21] Affected devices are unlikely to receive the new firmware, as TP-Link's Wi-Fi extenders do not install firmware updates automatically, nor do they notify the owner that an update is available.[22] TP-Link firmware update availability also varies by country, even though the issue affects all Wi-Fi range extenders sold globally.[20][22]

Fukuoka University's servers were reported to have been shut down sometime between February and April 2018, and should be removed from the NTP Public Time Server Lists.[23]

Yandex speaker incident

In the fall of 2024, Yandex introduced a bug in the firmware of their speaker product, causing a massive overload[24] of Russian NTP servers in the NTP pool.[25]

Although Yandex was rolling out the new firmware gradually across its installed base, the full extent of the problem was not detected until the update had reached 100% of devices.

After the incident was resolved by pushing a hotfix, Yandex announced several measures to prevent similar problems in the future. Among other actions, Yandex donated NTP servers to the pool, improved their monitoring, and indicated they would apply for a vendor zone,[26] which they did not have at the time.

Technical solutions

After the first major incidents, it became clear that, apart from stating a server's access policy, a technical means of enforcing that policy was needed. One such mechanism was provided by extending the semantics of the Reference Identifier field in an NTP packet when the Stratum field is 0.

In January 2006, RFC 4330 was published, updating details of the SNTP protocol, but also provisionally clarifying and extending the related NTP protocol in some areas. Sections 8 to 11 of RFC 4330 are of particular relevance to this topic (The Kiss-o'-Death Packet, On Being a Good Network Citizen, Best Practices, Security Considerations). Section 8 introduces Kiss-o'-Death packets:

"In NTPv4 and SNTPv4, packets of this kind are called Kiss-o'-Death (KoD) packets, and the ASCII messages they convey are called kiss codes. The KoD packets got their name because an early use was to tell clients to stop sending packets that violate server access controls."
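How a client might recognize such a packet can be sketched in Python as follows. This is a minimal illustration rather than a complete client: the function name `kiss_code` and the sample packet are hypothetical, and the field layout assumed is the standard 48-byte NTP header, in which the Reference Identifier occupies bytes 12 to 15.

```python
import struct

NTP_HEADER_LEN = 48  # NTP packet without extension fields or authenticator


def kiss_code(packet: bytes):
    """Return the kiss code if `packet` has Stratum 0, otherwise None."""
    if len(packet) < NTP_HEADER_LEN:
        raise ValueError("packet shorter than an NTP header")
    # First 16 bytes: LI/VN/Mode (1), Stratum (1), Poll (1), Precision (1),
    # Root Delay (4), Root Dispersion (4), Reference Identifier (4).
    _flags, stratum, _poll, _precision, _delay, _dispersion, ref_id = struct.unpack(
        "!BBbbII4s", packet[:16]
    )
    if stratum != 0:
        return None
    # With Stratum 0 the Reference Identifier carries a short ASCII message.
    return ref_id.decode("ascii", errors="replace").rstrip("\x00")


# Hypothetical server reply: LI=3, VN=4, Mode=4, Stratum 0, kiss code "RATE".
sample = struct.pack("!BBbbII4s", (3 << 6) | (4 << 3) | 4, 0, 0, 0, 0, 0, b"RATE") + bytes(32)
print(kiss_code(sample))  # prints "RATE"
```

RFC 4330 lists kiss codes such as DENY, RSTR and RATE. A well-behaved client would stop querying or slow down on receiving one, rather than continuing to poll at the same rate.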

The new requirements of the NTP protocol do not work retroactively: old clients and implementations of earlier versions of the protocol do not recognize KoD packets or act on them. For the time being, there are no good technical means to counteract misuse of NTP servers.

In 2015, in response to possible attacks on the Network Time Protocol,[27] Network Time Security for NTP (Internet Draft draft-ietf-ntp-using-nts-for-ntp-19)[28] was proposed, using a Transport Layer Security implementation. On June 21, 2019, Cloudflare started a trial service around the world,[29] based on a previous Internet Draft.[30]

References

  1. ^ Kamp, Poul-Henning (2006-04-08). "Open Letter to D-Link about their NTP vandalism". FreeBSD. Archived from the original on 2006-04-08. Retrieved 2006-04-08.
  2. ^ Gallagher, Sean (2014-02-11). "Biggest DDoS ever aimed at Cloudflare's content delivery network". Ars Technica. Archived from the original on 2014-03-07. Retrieved 2014-03-08.
  3. ^ "Tardis 2000". Archived from the original on 2019-08-17. Retrieved 2019-06-13.
  4. ^ Malone, David (April 2006). "Unwanted HTTP: Who Has the Time?" (PDF). ;login:. USENIX Association. Archived (PDF) from the original on 2013-07-28. Retrieved 2012-07-24.
  5. ^ "Flawed Routers Flood University of Wisconsin Internet Time Server, Netgear Cooperating with University on a Resolution". University of Wisconsin–Madison. Archived from the original on 2006-04-10. Retrieved 2020-07-06.
  6. ^ "Network Devices Almost Take Down Atomic Clock". Taborcommunications.com. 2003-07-11. Archived from the original on 2013-02-04. Retrieved 2009-07-21.
  7. ^ Lester, Andy (2006-02-19). "Help save the endangered time servers". O'Reilly Net. Archived from the original on 2007-08-18. Retrieved 2007-08-07.
  8. ^ "Currency Converter - Google Finance". Archived from the original on 2017-03-31. Retrieved 2016-11-11.
  9. ^ "Currency Converter - Google Finance". Archived from the original on 2017-03-31. Retrieved 2016-11-11.
  10. ^ Kamp, Poul-Henning (2006-04-27). "Open Letter to D-Link about their NTP Vandalism: 2006-04-27 Update". FreeBSD. Archived from the original on 2006-04-27. Retrieved 2007-08-07.
  11. ^ Leyden, John (2006-05-11). "D-Link settles dispute with 'time geek'". The Register. Archived from the original on 2019-05-10. Retrieved 2020-05-26.
  12. ^ "Recent NTP pool traffic increase: 2016-12-20". NTP Pool. 2016-12-10. Archived from the original on 2016-12-21. Retrieved 2016-12-20.
  13. ^ "NANOG mailing list archive: Recent NTP pool traffic increase: 2016-12-19". NANOG/opendac from shaw.ca. 2016-12-19. Archived from the original on 2017-09-24. Retrieved 2016-12-20.
  14. ^ "NANOG mailing list archive: Recent NTP pool traffic increase: 2016-12-20 18:58:57". NANOG/Jad Boutros from Snap inc. 2016-12-20. Archived from the original on 2017-04-19. Retrieved 2017-04-19.
  15. ^ "NANOG mailing list archive: Recent NTP pool traffic increase: 2016-12-20 22:37:04". NANOG/Jad Boutros from Snap inc. 2016-12-20. Archived from the original on 2017-04-20. Retrieved 2017-04-20.
  16. ^ "NANOG mailing list archive: Recent NTP pool traffic increase: 2016-12-21 02:21:23". NANOG/Jad Boutros from Snap inc. 2016-12-21. Archived from the original on 2017-04-19. Retrieved 2017-04-19.
  17. ^ "iOS NTP library: advance to v1.1.4; git commit on github.com". GitHub. 2016-12-20. Archived from the original on 2020-07-06. Retrieved 2017-04-19.
  18. ^ "iOS NTP library: Issue #47: Hardcoded NTP Pool names; github.com". GitHub. 2016-12-19. Archived from the original on 2020-07-06. Retrieved 2017-04-19.
  19. ^ "NTP Pool Incident Log - Excessive load on NTP servers". NTP Pool. 2016-12-30. Archived from the original on 2017-04-19. Retrieved 2017-04-19.
  20. ^ a b c Aleksandersen, Daniel (2017-11-23). "TP-Link repeater firmware squanders 715 MB/month". Ctrl Blog. Archived from the original on 2017-12-20. Retrieved 2017-12-21.
  21. ^ "TP-Link製無線LAN中継器によるNTPサーバーへのアクセスに関して" (in Japanese). TP-Link. 2017-12-20. Archived from the original on 2017-12-20. Retrieved 2017-12-21.
  22. ^ a b Aleksandersen, Daniel (2017-11-20). "TP-Link serves outdated or no firmware at all on 30% of its European websites". Ctrl Blog. Archived from the original on 2017-12-22. Retrieved 2017-12-21.
  23. ^ "福岡大学における公開用NTPサービスの現状と課題" (PDF) (in Japanese). Fukuoka University. Archived (PDF) from the original on 2018-01-29. Retrieved 2018-01-29.
  24. ^ "Collapse of Russia country zone". 2024-11-15.
  25. ^ "Об инциденте с NTP-серверами" (in Russian). Yandex. 2024-11-27.
  26. ^ https://www.ntppool.org/en/vendors.html
  27. ^ Malhotra, Aanchal; Cohen, Isaac E.; Brakke, Erik; Goldberg, Sharon (2015-10-21). "Attacking the Network Time Protocol" (PDF). Boston University. Archived (PDF) from the original on 2019-05-02. Retrieved 2019-06-23. We explore the risk that network attackers can exploit unauthenticated Network Time Protocol (NTP) traffic to alter the time on client systems
  28. ^ Franke, D.; Sibold, D.; Teichel, K.; Dansarie, M.; Sundblad, R. (30 April 2019). Network Time Security for the Network Time Protocol (html). IETF. I-D draft-ietf-ntp-using-nts-for-ntp-19. Retrieved 23 June 2019.
  29. ^ Malhotra, Aanchal (2019-06-21). "Introducing time.cloudflare.com". Cloudflare Blog. Archived from the original on 2019-06-21. Retrieved 2019-06-23. We use our global network to provide an advantage in latency and accuracy. Our 180 locations around the world all use anycast to automatically route your packets to our closest server. All of our servers are synchronized with stratum 1 time service providers, and then offer NTP to the general public, similar to how other public NTP providers function.
  30. ^ Franke, D.; Sibold, D.; Teichel, K.; Dansarie, M.; Sundblad, R. (17 April 2019). Network Time Security for the Network Time Protocol (html). IETF. I-D draft-ietf-ntp-using-nts-for-ntp-18. Retrieved 23 June 2019.