AWS S3 Outage Signals We MUST Decentralize Cloud

Yaron Haviv | March 3, 2017

Earlier this week, a significant portion of the internet went down because of an Amazon Web Services (AWS) S3 outage. While AWS blames the problem on the removal of a couple of servers, the real question is why we have created such a dependency on services like AWS. Won’t it get worse with the coming IoT, 5G networking, DDoS wars and the constant movement of essential services to cloud infrastructures?

Tweets from a gentleman whose home “didn’t work” because his home automation is connected to AWS may seem humorous at first. But they raise legitimate questions about what will happen in the future, when all our devices and driverless cars are connected.

Here’s Amazon’s official statement on the outage:

“The servers that were inadvertently removed supported two other S3 subsystems.  One of these subsystems, the index subsystem, manages the metadata and location information of all S3 objects in the region. This subsystem is necessary to serve all GET, LIST, PUT, and DELETE requests. The second subsystem, the placement subsystem, manages allocation of new storage and requires the index subsystem to be functioning properly to correctly operate.”

What they are saying is that big chunks of the internet depend on just one or two local services to function. Clearly, this is a system design flaw that AWS can work to avoid in the future. However, this week’s outage happened without malicious intent. What happens if failures are initiated by terror organizations or rogue states?
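To make that dependency concrete, here is a minimal sketch of the chain Amazon describes. The names and the `available` helper are hypothetical and are not AWS internals. The link from PUT to the placement subsystem is an assumption based on the statement that placement "manages allocation of new storage".

```python
# Hypothetical dependency map based on the quoted statement; not AWS internals.
DEPENDS_ON = {
    "GET": ["index"],
    "LIST": ["index"],
    "PUT": ["index", "placement"],  # assumption: new objects need placement
    "DELETE": ["index"],
    "placement": ["index"],         # stated: placement requires the index
}

def available(component, down, deps=DEPENDS_ON):
    """Return True if the component and everything it transitively needs is up."""
    if component in down:
        return False
    return all(available(d, down, deps) for d in deps.get(component, []))

# Losing the index subsystem takes every request type with it.
for op in ("GET", "LIST", "PUT", "DELETE"):
    print(op, available(op, down={"index"}))
```

In this toy model, taking down `placement` alone only breaks PUT, while taking down `index` breaks everything, which is the single point of failure the statement describes.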

When DARPA conceived and built what became the internet, its goal was to avoid centralized control so that defense systems could survive an attack. Have we forgotten those important roots?

Why Decentralize?
Rather than building dozens of centralized mega clouds that host thousands of essential services, we’ll be better served by thousands of distributed mini-clouds connected in a mesh that is resilient to failures.
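The difference between the two layouts can be illustrated with a toy connectivity check. This is only a sketch under simple assumptions: nine nodes, a "star" standing in for a centralized cloud, and a ring-style "mesh" where each node links to its two nearest neighbors on each side. The function names are made up for this example.

```python
from collections import deque

def star(n):
    """One hub (node 0) linked to n - 1 leaves: the centralized layout."""
    return {0: set(range(1, n)), **{i: {0} for i in range(1, n)}}

def mesh(n, k=2):
    """Each node linked to its k nearest neighbors on each side of a ring."""
    return {i: {(i + d) % n for d in range(-k, k + 1) if d != 0} for i in range(n)}

def largest_component(graph, failed):
    """Size of the biggest group of surviving nodes that can still reach each other."""
    alive = set(graph) - set(failed)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search over surviving nodes
            node = queue.popleft()
            size += 1
            for peer in graph[node]:
                if peer in alive and peer not in seen:
                    seen.add(peer)
                    queue.append(peer)
        best = max(best, size)
    return best

print(largest_component(star(9), failed={0}))  # hub lost: every leaf is isolated
print(largest_component(mesh(9), failed={0}))  # one node lost: the rest stay connected
```

Losing the hub of the star leaves no two nodes able to talk to each other, while the mesh loses only the failed node.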

As 5G networks capable of gigabit traffic emerge in the near future, we will be surrounded by smart devices. With the Internet of Things (IoT) generating huge amounts of data and requiring real-time analytics and automated response to alerts, most companies and public services will move their data and computation to the cloud. But in doing so, our dependency on connectivity and our sensitivity to latency will grow.

We built content delivery networks (CDNs) to afford good internet experiences. With a CDN, web pages and video content are cached locally to deliver a better internet experience. But the new cloud is different: we run transactions, we upload content, and we use it for essential services and much more, in addition to the things we’ve used CDNs for. That means CDNs aren’t going to help us in the future.

It’s no longer science fiction to think about the next wars as cyber-attacks on essential infrastructure, all in the cloud. We must think again like the folks at DARPA did when they created the Internet.

In his interesting talk on the “end of cloud,” Peter Levine discusses how the cloud needs to be decentralized. Here’s a snapshot from that presentation:

[Image: snapshot from Peter Levine’s presentation]

What’s become increasingly clear is that it’s time to complement the mega clouds with edge and distributed mini-clouds. These closer-to-the-edge clouds should provide the first line of response to all those sensors, devices, and essential services in the IoT. The edge must be distributed, with no single point of failure, so that it can withstand outages and malicious attacks. Make no mistake: the new edge is not CDNs. Rather, it’s a new type of edge that incorporates real-time computation, data, machine learning and analytics.
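As a rough illustration of that "first line of response," the sketch below sends a sensor reading to the nearest mini-cloud first and falls back to neighbors, and only then to a distant mega cloud. Everything here is hypothetical: the node names, the alert threshold, and the `respond` and `make_node` helpers are simulated stand-ins, not a real edge platform API.

```python
class Unreachable(Exception):
    """Raised by a simulated node that is down or cut off."""

def make_node(up=True, threshold=80.0):
    """Build a simulated node that flags readings above a threshold."""
    def handler(reading):
        if not up:
            raise Unreachable()
        return "alert" if reading > threshold else "ok"
    return handler

def respond(reading, nodes):
    """Try each node in order; return (node_name, result) from the first that answers."""
    for name, handler in nodes:
        try:
            return name, handler(reading)
        except Unreachable:
            continue  # this node is out, move on to the next closest one
    raise Unreachable("no edge node or cloud region could be reached")

nodes = [
    ("edge-a", make_node(up=False)),  # nearest mini-cloud, simulated as down
    ("edge-b", make_node()),          # neighboring mini-cloud
    ("mega-cloud", make_node()),      # distant central region, last resort
]
print(respond(91.5, nodes))
```

With `edge-a` down, the reading is still handled locally by `edge-b`, and the central cloud is never contacted.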