Beyond the Monoculture: Why Now Is the Time to Re-examine Internet Resilience

Recent outages across major providers have shown how tightly connected modern Internet services have become. This article examines what those events reveal about today’s infrastructure, and why it may be valuable to revisit some long-standing architectural assumptions.

 

By Dean Moheet

 
 

Earlier this week Cloudflare experienced a high-impact outage that generated headlines even in the mainstream press. Within the last month, so have both AWS and Azure. While none of these incidents lasted longer than a business day, each created a wave of disruption that spread well beyond the providers themselves. Businesses with no direct relationship with Cloudflare, AWS, or Azure suddenly found core services unavailable.

Events like these invite a closer look at the underlying assumptions that shape how we build for reliability in a world where a handful of trusted infrastructure providers handle such a large percentage of Internet traffic. And while these incidents don't necessarily mean organizations should abandon hyperscalers, they should cause us to grapple with the reality that the Internet increasingly behaves like a monoculture: vastly interconnected, but dependent on a narrow set of providers whose failures ripple globally.

 

Putting a Spotlight on Today’s Centralized Internet

The early Internet was intentionally decentralized, designed so that no single point of failure could bring down the infrastructure itself. I’m not telling you anything you don't already know, but the fact is that over time commercial growth, convenience, economics, and developer efficiency concentrated enormous portions of global traffic into just a few hands.

The resulting hyperscalers offer remarkable capability and have enabled the incredibly rapid scaling we benefit from today, but they also concentrate risk. When something goes wrong inside one of these global platforms, the effects can cascade quickly across the Internet. The recent outages are reminders that centralization, even when it brings benefits, can create fragility we may not fully appreciate.

This isn’t an indictment of Cloudflare, AWS, Azure, CrowdStrike, and others. Far from it. It’s an acknowledgment of how much the industry relies on a handful of logos.

 

When Control Planes Become Central Points of Failure


Even highly distributed networks have hidden central components: routing logic, identity systems, DNS authorities, and global decision-making layers that coordinate thousands of edge locations. These control planes are efficient, but they are also convergence points where a misconfiguration or internal failure can ripple across continents.

It’s easy to assume that global distribution equals inherent resilience. But the Cloudflare and AWS incidents of the last month show that logical centralization can outweigh physical decentralization. Understanding that distinction is an important part of rethinking how we design for continuity.

 

Assumptions Worth Revisiting 

Most companies believe they have redundancy because they use large, reputable providers. But in practice, many environments rely on a single provider for DNS, for their CDN or reverse-proxy path, for their single global ingress route, and for bundled DDoS protection.

These choices often grew out of normal operational pressures: simplify where you can, reduce vendor count, lean into deep integration. There’s nothing inherently wrong with that calculus, but the recent outages are a signal that these assumptions deserve re-examination. Organizations may find that what feels like redundancy is actually anchored to a common root dependency across multiple layers. Best practices appear to have been followed, yet a hidden risk stemming from centralization has crept in.
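As a rough illustration, a quick audit can surface this kind of hidden common dependency. The sketch below is a minimal example, not a definitive tool: it assumes the third-party dnspython package and uses hypothetical example.com hostnames, and it simply checks whether services assumed to be independent all delegate to the same authoritative DNS provider.

```python
# Minimal sketch (assumes dnspython: pip install dnspython).
# Hostnames are hypothetical placeholders for services believed to be redundant.
import dns.resolver

HOSTNAMES = ["www.example.com", "api.example.com", "cdn.example.com"]

def authoritative_ns(hostname: str) -> set[str]:
    """Walk up the name until an NS record is found; return the name servers."""
    labels = hostname.split(".")
    for i in range(len(labels) - 1):
        zone = ".".join(labels[i:])
        try:
            answer = dns.resolver.resolve(zone, "NS")
            return {rr.target.to_text().lower() for rr in answer}
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
            continue  # no NS at this label; try the parent zone
    return set()

def main() -> None:
    providers = {}
    for host in HOSTNAMES:
        ns_names = authoritative_ns(host)
        # Crude provider fingerprint: the registrable tail of each NS hostname.
        providers[host] = {".".join(n.rstrip(".").split(".")[-2:]) for n in ns_names}
        print(f"{host}: {sorted(providers[host])}")
    shared = set.intersection(*providers.values()) if providers else set()
    if shared:
        print(f"Warning: all hostnames share the same DNS provider(s): {sorted(shared)}")

if __name__ == "__main__":
    main()
```

The same idea extends to CNAME chains, certificate issuers, or upstream networks: if every "independent" path resolves back to one vendor, the redundancy is largely cosmetic.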

A barrier to this re-examination is the assumption that de-integration will be both costly and operationally complex. However, not all layers of the stack carry the same cost or complexity to diversify. It is true that internal services like databases, application logic, and analytics pipelines tend to be deeply coupled with tooling, workflows, and developer processes. Diversifying them across multiple vendors or platforms often requires significant redesign and long-term operational commitment.

But infrastructure layers closest to the edge operate differently, making diversity much more achievable.

 

Why the Edge Is the Most Practical Place to Diversify

DNS, CDN routing, DDoS mitigation, and traffic steering are naturally quite modular because they sit at the boundary of the network, where interoperability, open standards, and protocol-level compatibility make it far easier to use multiple providers in parallel. These layers don’t have the same tight integration requirements that deep application services do, which means adding a second provider is far more practical and far less disruptive than most teams assume.
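To make that concrete, here is a minimal sketch of what parallel edge providers can look like from the application's point of view. The endpoint hostnames are hypothetical, and real deployments would more often steer traffic in DNS or at a load balancer, but the principle is the same: no single edge provider is a hard dependency.

```python
# Minimal failover sketch across two independent edge/CDN providers,
# using only the Python standard library. Hostnames are hypothetical.
import urllib.error
import urllib.request

# The same origin content, published through two separate edge providers.
EDGE_ENDPOINTS = [
    "https://assets.cdn-provider-a.example.net",
    "https://assets.cdn-provider-b.example.org",
]

def fetch(path: str, timeout: float = 3.0) -> bytes:
    """Try each edge provider in order; return the first successful response."""
    last_error = None
    for base in EDGE_ENDPOINTS:
        url = f"{base}{path}"
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return resp.read()
        except (urllib.error.URLError, TimeoutError) as exc:
            last_error = exc  # provider unreachable or slow; try the next one
    raise RuntimeError(f"all edge providers failed for {path}") from last_error

if __name__ == "__main__":
    print(len(fetch("/index.html")))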

These edge functions are often where outages begin, and so diversifying them can offer disproportionate resilience benefits without overhauling the entire architecture. Beyond resilience, diversification at the edge also opens opportunities for best-of-breed architectures. Instead of accepting one vendor’s one-size-fits-all approach, organizations can choose the strongest provider for each layer: a DNS service optimized for performance, a CDN optimized for routing intelligence, a standalone DDoS solution optimized for high quality attack filtering and analytics, etc.

Diversification can also yield tangible improvements in vendor support during critical incidents. The very structure of a hyperscaler means you are one of tens of thousands of voices needing support when a widespread issue occurs. The support-to-client ratios are simply too lopsided to allow the kind of access or control that is needed, often leaving you with little option other than to turn to Reddit, X, or status pages for information.

 

A New Discussion That Creates an Opportunity for Infrastructure Providers

These high-profile incidents are prompting more widespread discussion of the risks of monoculture and centralization, and that discussion is shifting the narrative.


For non-hyperscaler IaaS operators, network providers, and hosting companies, this shift represents a real opportunity. Customers who once assumed “the hyperscaler is always the safest choice” may become more open to architectures that blend global scale with independent regional paths. Providers who can offer those alternative paths and explain their value may find that clients once again view vendor diversity as a strength.


It also means that investing in visibility and control surfaces for clients can help differentiate a solution and deliver tangible value to prospective customers. These capabilities are typically lacking across all layers at the largest hyperscalers, which put more effort into bundling services than into making each service best of breed in every respect possible.

 

The Moment for Reflection 

The recent outages do not suggest abandoning hyperscalers altogether. Rather, they are strong reminders that the logic of “just put everything on the biggest platform” may be due for an update. Resilience now depends less on choosing the biggest provider and more on distributing critical edge functions so that no single incident can take down the entire stack. 

This is a moment to revisit long-standing assumptions, consider where diversity, independence, and architectural separation can strengthen Internet resilience, and reaffirm the value of vendor diversity within the tech stack.