AWS Outage: Why The Cloud Service Crashed And Its Aftermath 

Amazon Web Services (AWS) has resumed normal operations after an outage on Monday disrupted many websites and apps, including Snapchat, Fortnite and Reddit. The disruption left users across the globe unable to perform everyday tasks such as processing online payments through platforms like Venmo, changing airline tickets and making Zoom video calls.

AWS provides computing infrastructure to customers worldwide, and this outage disrupted services around the world for a few hours. Amazon said that some AWS services still had a backlog of messages to process and would take slightly longer to recover fully.

This is the largest internet disruption since the 2024 CrowdStrike outage, which paralysed the healthcare, banking, service and aviation sectors. Such malfunctions expose the vulnerability of globally interconnected tech systems.

AWS’s northern Virginia cluster, known as US-EAST-1, has been identified as the source of the problem. It is the third time in five years that an impairment at this site has caused a major internet breakdown. The problem arose in the Domain Name System (DNS) and prevented applications from finding the correct address for the API of DynamoDB, AWS’s cloud database, which stores user data and other important information.
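
To make the DNS failure concrete, here is a minimal sketch of what an application experiences when a hostname stops resolving, and how it might try another region instead. The fallback order and the helper names are assumptions for illustration, not AWS's own tooling, and falling back to another region only helps if the data has already been replicated there.

```python
import socket
from typing import Callable, List, Optional

# Assumed fallback order, for illustration only.
REGIONS = ["us-east-1", "us-west-2"]


def endpoint_for(region: str) -> str:
    """Build the regional DynamoDB API hostname for a region."""
    return f"dynamodb.{region}.amazonaws.com"


def system_resolver(host: str) -> str:
    """Ask the system's DNS for an IPv4 address.

    Raises socket.gaierror (a subclass of OSError) if the lookup fails.
    """
    return socket.gethostbyname(host)


def first_resolvable(
    regions: List[str],
    resolve: Callable[[str], str] = system_resolver,
) -> Optional[str]:
    """Return the first regional hostname that DNS can resolve, or None.

    This is a hypothetical helper: it only checks name resolution and says
    nothing about whether the service behind the address is healthy.
    """
    for region in regions:
        host = endpoint_for(region)
        try:
            resolve(host)
            return host
        except OSError:
            # DNS lookup failed for this region; try the next one.
            continue
    return None
```

Calling `first_resolvable(REGIONS)` returns the first hostname that can currently be resolved. During an incident like this one, the lookup for the first region would fail and the function would move on to the next.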

AWS did not explain why this particular site keeps failing. However, the cloud service had said earlier that the root cause was a problem with an underlying subsystem that monitors the health of its network load balancers, which distribute traffic across several servers so that none is overloaded.
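
The role of health monitoring in load balancing can be illustrated with a small sketch. The class below is a simplified round-robin balancer written for this article, not AWS's implementation. The server names and the `report` method are assumptions, standing in for whatever a real monitoring subsystem would feed in.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class RoundRobinBalancer:
    """Hands out servers in turn, skipping any marked unhealthy."""

    def __init__(self, servers: List[str]) -> None:
        self.servers = list(servers)
        # Every server is assumed healthy until monitoring says otherwise.
        self.healthy: Dict[str, bool] = {s: True for s in self.servers}
        self._ring: Iterator[str] = cycle(self.servers)

    def report(self, server: str, is_healthy: bool) -> None:
        """Record the latest health-check result for a server."""
        self.healthy[server] = is_healthy

    def pick(self) -> str:
        """Return the next healthy server, or raise if none is left."""
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if self.healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy servers available")
```

In this sketch, if the monitoring side wrongly reports every server as unhealthy, `pick` raises an error even though the servers themselves may be fine. That is one way a fault in health monitoring alone could interrupt traffic.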

AWS said the issue originated within the internal network of its Elastic Compute Cloud (EC2) service, which provides cloud computing capacity on demand.

Amazon is the world’s largest cloud service provider, followed by Microsoft’s Azure and Alphabet’s Google Cloud. Individuals, companies and governments across the world rely on AWS for computing power, data storage and other digital services. The US-EAST-1 site is the default location for many of AWS’s services and was also the source of major outages in 2020 and 2021.

The outage raised concerns about inadequate fault tolerance. Software developers and tech experts said that developers need backup cloud services and should use the tools AWS provides to protect themselves against malfunctions at any of its many data centres. Such breakdowns highlight how tightly integrated global digital systems are and the risk of relying on a small number of global cloud providers.

Ookla, the parent company of Downdetector, said that over 4 million users reported issues. British banks such as Lloyds Bank and Bank of Scotland, and telecom services such as Vodafone, were also affected. Apps and sites such as Reddit, Duolingo and Snapchat went down, as did Amazon’s own shopping website, Prime Video and Alexa.

Cryptocurrency exchange Coinbase and trading app Robinhood were also hit by the malfunction, yet Wall Street remained largely unaffected. In fact, Amazon shares were up 1.6% to $216.48.

While such outages typically last only a few hours, the aftermath is far worse. Companies and service providers are left dealing with backlogs such as flight delays and cancellations, missed appointments and disrupted order deliveries, which can take days to resolve.

Companies must therefore diversify their reliance on cloud services. Such outages are a fairly frequent occurrence. The only way for companies and other services to prepare for such global internet disruptions is to keep cloud backups and to spread their workloads across multiple cloud providers.
