CenturyLink outage: US internet service provider CenturyLink suffered a major technical outage on Sunday after a configuration error in one of its data centres caused chaos across the internet. Because the outage involved firewall rules and BGP routing, the problem spread beyond CenturyLink’s own network, affected other internet service providers, and caused connectivity problems for a wide range of other companies.
The list of tech giants whose services were disrupted by the CenturyLink outage includes notable names such as Amazon, Twitter, Microsoft (Xbox Live), EA, Blizzard, Steam, Discord, Reddit, Hulu, Duo Security, Imperva, NameCheap, OpenDNS, and numerous others.
Cloudflare, which was also significantly affected, reported that the problem propagating outward from CenturyLink caused a 3.5 per cent drop in global internet traffic, which would make this one of the most significant internet outages ever recorded.
Root cause: Incorrect configuration of Flowspec rule
According to the CenturyLink status page, the issue originated in CenturyLink’s data centre in Mississauga, a city close to Toronto, Canada. The telco says the root cause was an erroneous Flowspec announcement.
Flowspec is an extension of the BGP protocol that lets companies use BGP routes to distribute firewall rules across their networks. Flowspec announcements are typically used to respond to security incidents such as BGP hijacks or DDoS attacks, because they allow a company to push a change across its entire network and mitigate an attack within minutes.
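To make the idea of a firewall rule carried by a routing announcement more concrete, here is a minimal sketch in Python. It is not the Flowspec wire format and not any vendor's implementation: the `FlowRule` class, its fields, and the `decide` helper are hypothetical names chosen for illustration, and the addresses come from documentation ranges.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class FlowRule:
    """Hypothetical, simplified stand-in for a Flowspec-style rule."""
    dst_prefix: str                  # destination prefix the rule applies to
    protocol: Optional[str] = None   # None means "any protocol"
    dst_port: Optional[int] = None   # None means "any port"
    action: str = "discard"          # what to do with matching traffic

    def matches(self, dst_ip: str, protocol: str, dst_port: int) -> bool:
        if ip_address(dst_ip) not in ip_network(self.dst_prefix):
            return False
        if self.protocol is not None and self.protocol != protocol:
            return False
        if self.dst_port is not None and self.dst_port != dst_port:
            return False
        return True


def decide(rules: List[FlowRule], dst_ip: str, protocol: str, dst_port: int) -> str:
    """Return the action of the first matching rule, or 'forward' if none match."""
    for rule in rules:
        if rule.matches(dst_ip, protocol, dst_port):
            return rule.action
    return "forward"


# A narrowly scoped rule: drop UDP port 53 traffic aimed at one attacked prefix.
targeted = [FlowRule("203.0.113.0/24", protocol="udp", dst_port=53)]

# A rule with no protocol or port restriction drops everything to the prefix.
too_broad = [FlowRule("203.0.113.0/24")]
```

With these rules, `decide(targeted, "203.0.113.9", "tcp", 443)` forwards ordinary web traffic, while the same call against `too_broad` discards it. Because every router that receives the announcement applies the rule, a rule that matches more than intended takes effect network-wide just as quickly as a correct one.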
In this case, however, CenturyLink said that its Mississauga data centre issued an erroneous Flowspec announcement which effectively prevented CenturyLink’s BGP routes from being established. Cloudflare, which observed the event from the outside, believes that CenturyLink effectively put its entire network into a loop by announcing a new set of BGP routes and then killing all of those routes through the wrongly configured Flowspec rule.
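The loop Cloudflare describes can be illustrated with a toy state machine. This is a sketch under stated assumptions, not a reconstruction of CenturyLink's routers: it assumes the bad rule is learned over the BGP session, that applying it breaks that same session, and that losing the session withdraws the rule so the session can come back. The function name `simulate` and its parameters are hypothetical.

```python
def simulate(ticks: int, rule_blocks_bgp: bool) -> list:
    """Toy model of one router's BGP session over a number of time steps."""
    session_up = False
    rule_installed = False
    history = []
    for _ in range(ticks):
        if session_up and rule_installed and rule_blocks_bgp:
            # Assumption: the rule interferes with BGP itself, so the session
            # drops and everything learned over it (rule included) is withdrawn.
            session_up = False
            rule_installed = False
        elif not session_up:
            # With the rule gone the session re-establishes, and the router
            # learns the routes and the rule all over again.
            session_up = True
            rule_installed = True
        history.append("up" if session_up else "down")
    return history


flapping = simulate(6, rule_blocks_bgp=True)
stable = simulate(4, rule_blocks_bgp=False)
```

In this model a harmless rule leaves the session up on every step, while a rule that blocks BGP makes the session alternate between up and down indefinitely, which is the kind of cycle that only a clean reset can break.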
BGP routes are the glue that holds the internet together. They are the information that internet providers exchange with one another: each BGP route tells other providers which blocks of IP addresses are reachable through a given network.
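As a rough illustration of how a router uses this information, the sketch below picks a next hop by longest-prefix match, which is the standard rule for choosing among overlapping routes. The table contents, the provider names, and the `best_route` helper are assumptions made up for the example; real BGP route selection involves many more attributes.

```python
from ipaddress import ip_address, ip_network
from typing import Dict, Optional


def best_route(routes: Dict[str, str], dst_ip: str) -> Optional[str]:
    """Return the next hop for dst_ip using longest-prefix match, or None."""
    addr = ip_address(dst_ip)
    candidates = [
        (ip_network(prefix), next_hop)
        for prefix, next_hop in routes.items()
        if addr in ip_network(prefix)
    ]
    if not candidates:
        return None  # no route: the destination is unreachable
    # The most specific prefix (largest prefix length) wins.
    _, next_hop = max(candidates, key=lambda c: c[0].prefixlen)
    return next_hop


# Hypothetical routing table built from announcements by three providers.
routes = {
    "203.0.113.0/24": "provider-A",
    "203.0.113.128/25": "provider-B",
    "198.51.100.0/24": "provider-C",
}
```

Here `best_route(routes, "203.0.113.200")` selects `provider-B` because its /25 is more specific than the /24 from `provider-A`. If announcements are wrong or missing, the lookup either sends traffic to the wrong neighbour or finds no route at all.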
However, when CenturyLink’s erroneous Flowspec rule disrupted some of the routers in its network, a few of those routers started announcing incorrect BGP routes to neighbouring “Tier 1” internet providers. This brought down other networks in turn, creating a domino effect.
The outage took seven hours to resolve.
CenturyLink resolved the issue by taking the unusual step of asking the other Tier 1 providers to stop peering with it and to block any traffic coming from its network. Most companies avoid this kind of step because it causes a complete loss of connectivity for all of their customers. CenturyLink then had to reset all of its equipment and start again with clean BGP routing tables. The process took close to seven hours, from 12:13 UTC until 18:58 UTC, the company stated.
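The "close to seven hours" figure can be checked directly from the two timestamps quoted above. The short calculation below uses only those times; the variable names are illustrative.

```python
from datetime import datetime, timedelta

# Start and end of the outage as stated by the company (same day, UTC).
start = datetime.strptime("12:13", "%H:%M")
end = datetime.strptime("18:58", "%H:%M")

duration = end - start
hours, remainder = divmod(duration.seconds, 3600)
minutes = remainder // 60
```

This gives 6 hours and 45 minutes, which is consistent with the rounded figure.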
“This was a significant global Internet outage,” said Matthew Prince, co-founder and CEO of Cloudflare, in his analysis of the incident.