Archive.today Blacklisted by Wikipedia After DDoS Attack: What You Need to Know

In late 2024, something pretty significant happened in the world of web archiving. Wikipedia—the encyclopedia that millions of people rely on daily—made a bold decision to blacklist Archive.today. The reason? The site allegedly embedded malicious code to hijack users' browsers for DDoS attacks, as reported by Ars Technica.

This isn't just a technical squabble between two websites. It's a wake-up call about the reliability of archived content, the risks of third-party services, and what happens when trust breaks down online.

Let me walk you through what actually happened, why it matters, and what this means for anyone who uses archived links on Wikipedia.

TL;DR

The Blacklist: Wikipedia removed roughly 695,000 links pointing to Archive.today from around 400,000 English-language articles according to TechCrunch.
The Attack: Archive.today allegedly embedded malicious JavaScript code that used visitors' browsers to launch DDoS attacks against a blogger's website, as detailed by PCMag.
The Trigger: The site demanded a blogger remove a 2023 investigation into Archive.today's ownership, and when he refused, the attacks began.
The Fallout: Wikipedia editors now need to replace Archive.today links with alternatives like the Internet Archive's Wayback Machine.
The Pattern: This is the second time Archive.today has faced blacklisting—it was banned in 2013, reinstated in 2016, and now blacklisted again.

What Archive.today Actually Is

Before we get into the drama, let's establish what we're talking about. Archive.today (also reachable at mirror domains such as archive.is or archive.vn, depending on which ones are available) is a web archiving service. Think of it like a time capsule for the internet.

When you submit a URL to Archive.today, the service captures a snapshot of that webpage at that exact moment. The text, images, layout—everything gets preserved. This is useful for journalists, researchers, and anyone who needs proof that something existed online at a specific time.

Archive.today exists largely because the internet moves fast. Websites change. Posts get deleted. Companies alter their messaging. Web archives let you pull up what things actually said before they disappeared or got edited.

For years, Wikipedia editors have relied on Archive.today as a citation tool. If someone claims a website said something, they can link to an archived version as proof. It's been a trusted resource in the Wikipedia ecosystem.

Or at least, it was.

The DDoS Attack: How It Went Down

Let's talk about the actual technical incident. According to a detailed explanation that emerged during Wikipedia's blacklist discussion, here's what happened.

Archive.today's CAPTCHA page—you know, that annoying verification step you have to complete to prove you're human—contained embedded JavaScript code. This code wasn't there to verify users. It was there to conduct a distributed denial-of-service (DDoS) attack, as noted by Ars Technica.

Every 300 milliseconds, as long as a user kept the CAPTCHA page open in their browser, that JavaScript code would send a request to a specific target. In this case, the target was the blog of Jani Patokallio, a developer and researcher.

But here's the clever part: the code sent requests with randomized search strings. Why? Because a cache sitting in front of a web server can answer repeated identical requests without doing the work again. By randomizing the queries, the code ensured that each request would bypass the cache and actually consume server resources. It's like asking the same question with different wording each time: the server has to do the work on every request.

One browser doing this is a nuisance. Multiplied across potentially thousands of Archive.today users sitting on the CAPTCHA page at the same time, it becomes a distributed denial-of-service attack and a serious problem. Patokallio's blog was getting hammered.

The Motivation Behind the Attack

So why would Archive.today do this? What provoked such an extreme response?

It all traces back to a 2023 blog post. Patokallio, the blogger being targeted, had published an investigation into Archive.today's actual ownership and operations. The post asked questions about who really ran the service, how it was funded, and what the true motivations were.

Archive.today's operators apparently didn't appreciate this scrutiny. They demanded that Patokallio remove the post. When he refused to comply, the attacks started.

It gets worse. According to reports, a subpoena was also issued to Tucows, the domain registrar handling Archive.today's domains. The subpoena sought information about who actually operated the service—essentially trying to uncover the identity of the person or people behind Archive.today.

This is where we see the full context. Archive.today, which markets itself as a transparency tool, operates with remarkable opacity. Nobody really knows who runs it. The service bounces between different domain names because it gets blocked in various jurisdictions. And when someone digs into these questions, they allegedly get attacked.

It's darkly ironic. A service designed to preserve information and maintain records apparently doesn't want transparency turned on itself.

Wikipedia's Response: The Blacklist Decision

Wikipedia's response was swift and decisive. The platform hosts one of the most important reference works on the internet. Millions of people use Wikipedia daily to learn about everything from history to technology. The links on Wikipedia carry weight. They drive traffic. They establish credibility.

When Wikipedia links to a source, it's saying, "This is worth reading. This is reliable."

So when the community discovered that Archive.today was using its own visitors' browsers to conduct cyberattacks, the response was unified. There was, according to Wikipedia's administrative discussions, "strong consensus" that the site needed to be blacklisted, as reported by TechCrunch.

The blacklist decision meant that going forward, any new links to Archive.today would be caught by Wikipedia's automated systems and prevented. More significantly, Wikipedia's editors would need to actively remove existing Archive.today links from approximately 400,000 English-language articles.

We're talking about 695,000 individual links across the encyclopedia. That's not just a cleanup task. That's a massive undertaking.

Wikipedia cited two main reasons for the blacklist:

First, the obvious one: "A website that hijacks users' computers to run a DDoS attack is untrustworthy."

Second, something equally damaging: Archive.today operators have allegedly altered archived content after the fact. If you can't trust that an archived page is actually what it claims to be, then the whole value proposition of a web archive collapses. The whole point is that it's supposed to be an unchangeable record of what something looked like at a specific time.

Why This Matters More Than Just Wikipedia

You might be thinking, "Okay, so Wikipedia removed some links. Isn't that just a Wikipedia problem?"

Not really. This incident reveals something much broader about the internet's infrastructure and our dependence on third-party services.

Think about how the modern internet works. Most websites don't operate completely independently. They rely on hosting providers, content delivery networks, domain registrars, DNS services, and dozens of other third-party services. We've outsourced huge portions of our digital infrastructure.

Archive.today demonstrated exactly how fragile this system can be. A service that seemed reliable for years turned out to have serious trustworthiness issues. But the problem runs deeper than just Archive.today.

What if every service you rely on had hidden vulnerabilities or undisclosed purposes? What if the analytics tool you use to track website visitors was also using your traffic for something else? What if your email provider was logging everything for reasons other than what you agreed to?

We largely trust these services because we don't have much choice. We need them to function online. But trust, once broken, is incredibly difficult to rebuild.

More immediately, the blacklist creates a practical problem for Wikipedia editors. They need to find alternative sources for citations that pointed to Archive.today. The Wayback Machine (operated by the Internet Archive) is the obvious alternative, but not every page was archived there.

QUICK TIP: If you're citing archived content for research or journalism, always verify that you have access to the original archived page and the timestamp. Don't rely solely on Archive.today links—backup important archived sources using multiple services like Internet Archive's Wayback Machine.

The History: Archive.today's First Blacklist in 2013

Here's something that makes this situation even more interesting: this isn't Archive.today's first run-in with Wikipedia. The service was previously blacklisted back in 2013.

At that time, concerns centered on different issues. Archive.today was relatively new, and there were questions about its reliability and the appropriateness of linking to it from Wikipedia's articles.

But here's what's notable: Archive.today was reinstated in 2016. The community apparently believed that whatever concerns existed had been addressed. The service was given another chance.

Now, roughly eight years after reinstatement, we're back to a blacklist. But this time, the reasons are dramatically more serious. This isn't about reliability or appropriateness. This is about active malice.

The pattern here suggests a troubling trajectory. When a service gets blacklisted, reinstated, and then blacklisted again with far worse accusations, it raises questions about what was happening during that middle period. Did the problematic behavior start after reinstatement? Was there any indication it was coming?

Unfortunately, these are the kinds of questions we can't easily answer when services operate with minimal transparency.

Technical Deep Dive: How the DDoS Attack Actually Worked

Let's get more technical for a moment, because understanding how this attack functioned is important for grasping why it's such a serious violation of user trust.

A DDoS attack typically requires the attacker to control many computers and have them all send requests to a target simultaneously. That's the "distributed" part. The attacker overwhelms a server by flooding it with traffic from many sources at once.

Traditionally, launching a DDoS attack requires either owning a botnet (a network of compromised computers) or renting one. It's infrastructure. It costs money and involves obvious technical preparation.

But Archive.today found a cheaper solution: use your legitimate users' browsers without their knowledge or consent.

When someone visits Archive.today and encounters a CAPTCHA, they're presumably doing so because they want to access an archived page. They expect the CAPTCHA to verify that they're human. They don't expect that their browser is simultaneously participating in an attack on someone else's server.

The brilliance (and the villainy) of this approach is that it's invisible. A user completes the CAPTCHA, accesses their archived content, and has no idea their browser was weaponized.

Moreover, the attack was persistent and scalable. As long as Archive.today received traffic (which, given its popularity, was constant), the attack would continue. The attacker didn't need to own a botnet. They had the browsers of potentially thousands of ordinary users, none of whom had agreed to take part.

The randomized query strings were also significant technically. Most web servers implement caching at multiple levels. If the same request comes in repeatedly, the server might serve it from cache instead of re-executing the database query or computation. This reduces load.

By varying the query strings, the JavaScript ensured that caching became useless. Every single request had to be processed fully. Every 300 milliseconds. From thousands of browsers.
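To make the caching point concrete, here is a small simulation. It is a toy model, not the code from the incident: the TinyCache class, the blog.example URLs, and the 1,000-request count are all made up for illustration. It compares how a cache keyed on the full URL behaves with identical versus randomized query strings, and what changes if the cache key ignores the query string.

```python
import random
import string
from urllib.parse import urlsplit


class TinyCache:
    """A toy response cache that counts hits and misses (hypothetical)."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
        else:
            # A miss stands in for the backend doing the full work.
            self.misses += 1
            self.store[key] = "rendered page"
        return self.store[key]


def random_query(rng, length=12):
    """Return a random lowercase string, standing in for a random search term."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def full_url_key(url):
    """Cache key that includes the query string."""
    return url


def path_only_key(url):
    """Cache key that ignores the query string entirely."""
    parts = urlsplit(url)
    return parts.netloc + parts.path


def simulate(urls, key_func):
    """Feed URLs through a fresh cache and return (hits, misses)."""
    cache = TinyCache()
    for url in urls:
        cache.get(key_func(url))
    return cache.hits, cache.misses


rng = random.Random(0)
identical = ["https://blog.example/search?q=hello"] * 1000
randomized = [
    f"https://blog.example/search?q={random_query(rng)}" for _ in range(1000)
]

print("identical, full-URL key: ", simulate(identical, full_url_key))
print("randomized, full-URL key:", simulate(randomized, full_url_key))
print("randomized, path-only key:", simulate(randomized, path_only_key))
```

With identical URLs only the first request reaches the backend. With randomized query strings essentially every request is a miss. The path-only key restores the hit rate, but it is only safe for pages whose content does not depend on the query string, so treat it as an illustration of the trade-off rather than a drop-in defense.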

For Patokallio's small blog, this would have been absolutely devastating. Depending on the server infrastructure, it could have caused the site to go offline entirely.
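A quick back-of-the-envelope calculation shows why. The 300-millisecond interval comes from the reports above. The numbers of simultaneously open CAPTCHA pages are purely hypothetical, since no actual figures were published.

```python
INTERVAL_MS = 300  # reported: one request every 300 milliseconds per open page


def requests_per_second(open_pages, interval_ms=INTERVAL_MS):
    """Aggregate request rate if every open page fires once per interval."""
    return open_pages * 1000 / interval_ms


# Hypothetical counts of CAPTCHA pages open at the same moment.
for open_pages in (1, 100, 1000, 5000):
    rate = requests_per_second(open_pages)
    print(f"{open_pages:>5} open pages -> about {rate:,.0f} requests/second")
```

A single browser contributes only about three requests per second. At a thousand open pages the same script produces over three thousand uncached requests per second, which is a heavy load for a small personal site.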

DID YOU KNOW: Among the largest publicly reported DDoS attacks is one that peaked at **29.7 terabits per second** (Tbps). Archive.today's attack wouldn't have come close to that scale, but the figure shows how serious DDoS attacks have become at a global level.

The Credibility Question: "Altered Content"

The second major reason Wikipedia cited for the blacklist is particularly serious: Archive.today allegedly alters the content of archived pages after the fact.

This is almost worse than the DDoS attack because it undermines the fundamental purpose of a web archive. The entire value of services like this depends on their immutability. Once you archive something, it should remain exactly as it was captured.

If someone can go back and edit an archived page, then the archive becomes useless as evidence or as a historical record. You can't trust it. Any archived page could have been modified.

Think about the journalism and research use cases. A journalist might archive a web page showing a company's misleading claims. That archive serves as evidence. If the archive can be modified later, the evidence is worthless.

Wikipedia took this threat seriously. In their blacklist discussion, they explicitly noted that the ability to alter archived content made Archive.today an "untrustworthy" source.

The specific allegations of content alteration aren't detailed in the public discussion, so it's unclear whether this was happening at scale, or whether it was discovered in specific cases. But the possibility that it could happen was apparently enough to tip the scales.
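One way to guard against silent changes is to keep your own copy of a page and record a cryptographic fingerprint of it at capture time. The sketch below uses SHA-256 from Python's standard library. The function names and the sample HTML are invented for illustration, and this is not how Archive.today or Wikipedia verify anything.

```python
import hashlib


def fingerprint(snapshot_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of a saved snapshot."""
    return hashlib.sha256(snapshot_bytes).hexdigest()


def is_unchanged(snapshot_bytes: bytes, recorded_fingerprint: str) -> bool:
    """Check a copy against the fingerprint recorded at capture time."""
    return fingerprint(snapshot_bytes) == recorded_fingerprint


# Made-up page content captured on day one.
original = b"<html><body>Our product cures everything.</body></html>"
recorded = fingerprint(original)

# A later copy whose wording has been changed.
later_copy = b"<html><body>Our product may help.</body></html>"

print(is_unchanged(original, recorded))    # True
print(is_unchanged(later_copy, recorded))  # False
```

This only works on bytes you saved yourself. Pages served by an archive usually carry banners and rewritten links, so comparing a hash against the archive's live rendering will not match even when nothing substantive has changed.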

The Domain Hopping Problem

One of Archive.today's defining characteristics is that it's not always at the same domain. Over the years, the service has operated under various domain names: archive.today, archive.is, archive.ph, archive.vn, and others.

Why the constant domain changes?

Mostly because the service keeps getting blocked. Various governments and organizations have blocked or restricted its domains. Archive.today's response has been to keep switching to new domains under different country code top-level domains (ccTLDs).

This is problematic for several reasons.

First, it makes the service inherently unreliable. A link to archive.today might not work if that domain gets blocked where the user is located. They'd need to know about the alternative domains and try one of those instead.

Second, it suggests that the service is consistently operating in ways that make various authorities uncomfortable. Whether you think those authorities are justified or not, the pattern indicates that Archive.today operates in contested territory.

Third, it means that the people running Archive.today are deliberately obscuring their identity and infrastructure. If a service is legitimate and trustworthy, why does it need to constantly change domains and hide who's operating it?

For Wikipedia, this domain-hopping was probably already a concern. The DDoS attack turned that concern into an outright ban.
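Domain hopping also complicates any filter, because a rule that matches one domain misses the rest. A minimal sketch of a check that treats the mirrors as one family might look like this. The domain set contains only the names mentioned in this article, so it is certainly incomplete, and this is not Wikipedia's actual blacklist logic.

```python
from urllib.parse import urlsplit

# Only the domains named in this article; the service may use others.
KNOWN_MIRRORS = {"archive.today", "archive.is", "archive.ph", "archive.vn"}


def is_archive_today_link(url: str) -> bool:
    """Return True if the URL's host is a known mirror or a subdomain of one."""
    if "://" not in url:
        url = "//" + url  # let urlsplit treat a bare "archive.is/abc" as a host
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in KNOWN_MIRRORS)


for link in (
    "https://archive.ph/AbCdE",
    "archive.is/AbCdE",
    "https://web.archive.org/web/2020/https://example.com/",
    "https://notarchive.is/page",
):
    print(link, "->", is_archive_today_link(link))
```

Matching on the parsed hostname rather than searching the raw string avoids false positives such as notarchive.is, or a Wayback Machine URL that merely contains another address in its path.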

The Subpoena: Uncovering Anonymous Operators

The subpoena issued to Tucows (Archive.today's domain registrar) is another crucial piece of this puzzle.

Domain registrars hold information about domain owners. When you register a domain, you're supposed to provide accurate information about who you are. That information is stored in the registrar's systems and is (usually) searchable in the WHOIS database.

However, many registrars offer privacy protection services. For a small fee, you can register a domain while keeping your personal information private. WHOIS shows the registrar's information instead of yours.

Archive.today likely used this privacy service, which is why nobody knew who was actually operating the site. A subpoena to the registrar forces them to disclose the information they have on file, bypassing the privacy protection.

Whoever sought the subpoena evidently considered this a serious matter worthy of legal action. It also suggests that the question of who runs the service could not be resolved through normal channels.

This speaks to the escalation of the conflict. We're not talking about a simple disagreement. We're talking about lawyers and subpoenas and cyberattacks.

Alternative Archiving Solutions: What's Available

So if Wikipedia and other sites are moving away from Archive.today, what are they moving toward?

The primary alternative is the Internet Archive's Wayback Machine. This is the most well-known web archiving service, and it's been operating since 1996. The Internet Archive is a nonprofit organization focused on preserving digital materials for public use.

The Wayback Machine is more established, more transparent about its operations, and more trusted by the broader community. It doesn't have the same baggage that Archive.today does.

However, the Wayback Machine isn't perfect. It can't archive everything (some sites block it), and coverage can be spotty depending on when a page was crawled. But it's generally considered more reliable than Archive.today.

Beyond that, there's no perfect solution. Every archiving service has limitations. Some operate regionally (like national library archives). Some specialize in specific types of content (like academic papers). But none have quite the scope and accessibility of Archive.today.

This is part of what makes the Archive.today situation so frustrating. The service filled a genuine need. It was useful. But it turns out the people operating it can't be trusted.

QUICK TIP: When archiving important content for research or legal purposes, use multiple services (Wayback Machine, Archive.today alternatives, local PDFs) rather than relying on a single archive. This reduces your risk if one service becomes unavailable or unreliable.

The Broader Trust Problem

Beyond the specifics of Archive.today, this incident raises fundamental questions about trust on the internet.

We rely on services we don't completely understand, operated by people we don't know, following business models that aren't always transparent. We accept this because the internet requires it. You can't verify every service yourself. You have to trust.

But what happens when that trust is broken? Archive.today was trusted. Lots of people used it. Then it turned out to be malicious.

This raises an uncomfortable question: How many other services might be doing something similar? How many services might be using user traffic for purposes users didn't agree to? How many might be altering data?

We don't know because we can't see inside these systems. We only find out when something goes catastrophically wrong.

For users, this means being more cautious about which services you depend on. For platforms like Wikipedia, it means being more aggressive about vetting sources.

For the internet broadly, it means we need better transparency and accountability mechanisms. We need ways to verify that services are doing what they claim to do, without just taking them on faith.

Wikipedia's Editorial Burden: 695,000 Links to Replace

Let's talk about the practical impact of this blacklist on Wikipedia itself.

Wikipedia has approximately 400,000 English-language articles that contain links to Archive.today. Not all of those links can simply be deleted. Many represent the only available archived source for a claim.

Editors need to:

Identify all articles containing Archive.today links
Evaluate whether the link is essential to the article
If essential, find an alternative source (Wayback Machine, original source, different archive)
Replace or remove the link
Verify that the replacement source works and is reliable
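The steps above can be sketched in code. This is a rough illustration, not a tool Wikipedia editors actually use. It assumes a citation has already been parsed into a dictionary with url and archive-url keys, and it builds a query for the Wayback Machine's availability endpoint as I understand it (check the Internet Archive's documentation before relying on it). The sample response is hand-written to show the expected shape, and no network request is made.

```python
import json
from urllib.parse import urlencode, urlsplit

# Only the domains named in this article; the real list may be longer.
MIRRORS = {"archive.today", "archive.is", "archive.ph", "archive.vn"}


def needs_replacement(archive_url: str) -> bool:
    """True if the archive link points at a known Archive.today domain."""
    if "://" not in archive_url:
        archive_url = "//" + archive_url
    host = (urlsplit(archive_url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in MIRRORS)


def wayback_lookup_url(original_url: str, timestamp: str = "") -> str:
    """Build an availability query for the original (non-archived) URL."""
    params = {"url": original_url}
    if timestamp:
        params["timestamp"] = timestamp  # e.g. "20230101"
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text: str):
    """Extract the closest snapshot URL from a response body, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def plan_fix(citation: dict):
    """Return the lookup URL for a citation that needs a new archive link."""
    if not needs_replacement(citation.get("archive-url", "")):
        return None
    return wayback_lookup_url(citation["url"])


# Hypothetical parsed citation and a hand-written sample response.
citation = {"url": "https://example.com/post", "archive-url": "https://archive.ph/AbCdE"}
SAMPLE_RESPONSE = json.dumps({
    "archived_snapshots": {"closest": {
        "available": True,
        "url": "http://web.archive.org/web/20230101000000/https://example.com/post",
        "timestamp": "20230101000000",
        "status": "200",
    }}
})

print(plan_fix(citation))
print(closest_snapshot(SAMPLE_RESPONSE))
```

Even with tooling like this, the last step stays manual: someone still has to open the replacement snapshot and confirm it shows the content the citation relies on.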

This is basically a cleanup project involving hundreds of thousands of edits. It's manageable for Wikipedia's community of volunteers, but it's definitely a burden.

Moreover, some of those 695,000 links probably don't have good alternatives. If a page was only archived on Archive.today and nowhere else, the information is effectively lost to Wikipedia editors.

This is actually one of the key arguments for maintaining multiple, independent web archives. If you only have one archive service, you're vulnerable to exactly this kind of situation.

The Timeline: When Everything Went Wrong

Let's establish the rough timeline of events, though exact dates are somewhat murky because not all details have been made fully public.

2023: Patokallio publishes a blog post investigating Archive.today's ownership and operations.

Late 2024: Archive.today allegedly embeds DDoS code in its CAPTCHA page, directing it against Patokallio's blog. Patokallio documents the attack.

Late 2024: The incident comes to light. Details are shared with the Wikipedia community. A subpoena is apparently issued or prepared to uncover Archive.today's true operators.

Late 2024: Wikipedia's community consensus emerges that Archive.today should be blacklisted. The blacklist is implemented.

Going Forward: Wikipedia editors begin the process of replacing or removing Archive.today links.

The exact timeline matters less than understanding the progression: investigation → attack → exposure → consequences.

DID YOU KNOW: English Wikipedia has logged well over **one billion edits** since its founding. Removing or replacing 695,000 links is a small fraction of that total, but it is still a significant cleanup for volunteer editors.

Lessons for Internet Infrastructure

So what should we take away from the Archive.today situation?

First lesson: Anonymity enables bad behavior. Archive.today's anonymity made it easier for the operators to feel unaccountable. They could use the service for DDoS attacks without fear of immediate personal consequences. If operators had to be publicly identified and accountable for their actions, this behavior might never have happened.

Second lesson: Don't put all your eggs in one basket. Wikipedia's reliance on Archive.today meant that when the service proved untrustworthy, Wikipedia had a major cleanup problem. Distributed redundancy—using multiple archives, multiple sources—is more resilient.

Third lesson: Transparency matters. Archive.today's refusal to be transparent about its operations and ownership should have been a red flag earlier. Services that won't explain how they work or who operates them are inherently suspicious.

Fourth lesson: Bad actors exploit legitimate use cases. Archive.today provided a genuine service. Web archiving is important. But that service was corrupted by its operators. This is a common pattern online—legitimate tools get weaponized by bad actors.

Fifth lesson: Once trust is broken, recovery is nearly impossible. Archive.today might eventually try to make amends and rebuild its reputation. But the damage is done. The community has moved on to alternatives. Recovery would be an uphill battle.

What This Means for Users and Researchers

If you're someone who uses archived content for research, journalism, or reference purposes, the Archive.today blacklist affects you.

Links that worked yesterday might not resolve today. References you relied on might now be broken. You may need to find alternative sources or different archived versions of content you need.

The practical advice: Don't rely on single sources for archived content. When you find something important archived, try to capture it multiple ways. Screenshot it. Save a PDF. Check if the Wayback Machine has a copy too.

Long-term, this incident might lead to better web archiving infrastructure. It might push the Internet Archive and other legitimate services to expand their capabilities. Or it might lead to discussions about regulation of archiving services.

But in the immediate term, it's a disruption that's particularly felt by people working in fields that depend on historical records.

The Question of Intent: Desperation or Malice?

One question that's worth asking: Was Archive.today's DDoS attack an act of desperation or intentional malice?

We don't know the full context of why the operators felt so threatened by Patokallio's investigation. Were they protecting genuinely sensitive information? Or were they hiding something they knew was wrong?

The fact that they demanded the removal of the post and allegedly issued or threatened legal action suggests they wanted to suppress the investigation. When suppression didn't work, they allegedly resorted to attacks.

That looks more like desperation—people who felt backed into a corner and responded with force. Not necessarily malicious in intent, but absolutely unacceptable in execution.

Of course, that's speculation. The people running Archive.today haven't explained their side of the story publicly (or at least not in a way that's widely documented).

Future of Web Archiving

This incident will probably have ripple effects on web archiving services going forward.

We might see more scrutiny applied to archiving services. Organizations like Wikipedia might demand more transparency before linking to archives. There might be discussions about standards and verification for archiving services.

We might also see more resources directed toward established services like the Internet Archive, as people prefer to use platforms they trust more thoroughly.

Long-term, the ideal solution would be a decentralized approach to web archiving. Instead of relying on single services, we'd have a network of archives, potentially blockchain-based or distributed in some other way, where no single operator can unilaterally alter or weaponize the data.

But that's future-looking speculation. For now, Archive.today is blacklisted, 695,000 links are being removed, and the web archiving community is learning hard lessons about trust and verification.

What We Can Learn About Digital Trust

The Archive.today situation is ultimately a lesson about digital trust in the modern internet.

Trust is essential for the internet to function. We can't verify everything ourselves. We have to make judgments about which services to trust and rely on those judgments.

But trust can be broken. And when it is, the consequences cascade outward. One service's betrayal affects everyone who depended on it.

The challenge is that there's no perfect solution. You can try to vet services carefully, choose well-established organizations, look for transparency and accountability. But ultimately, there's always an element of risk.

What you can do is minimize that risk by:

Not relying on single sources for critical information
Preferring services that are transparent about their operations
Favoring established organizations with reputational incentives to be trustworthy
Being skeptical of services that demand anonymity or resist oversight
Maintaining your own copies of important content rather than depending entirely on services

These practices won't guarantee security, but they'll make you more resilient when services inevitably fail.

Conclusion: A Turning Point for Web Archives

The blacklisting of Archive.today by Wikipedia represents a significant moment in the history of web archiving and online trust.

It's not just about one service or one incident. It's about the broader question of how we preserve information online, who we trust to do that work, and what happens when trust breaks down.

Archive.today provided a useful service. It let people capture snapshots of the web. But it turns out that service was operated by people willing to weaponize user traffic to silence critics and investigations.

That's a fundamental violation of the implicit contract between a service and its users. And it has consequences.

For Wikipedia, those consequences mean a massive cleanup effort. For researchers and journalists, it means finding alternative sources. For the broader internet community, it means another data point in the ongoing tension between privacy, anonymity, and accountability.

Moving forward, the lesson is clear: when evaluating which services to depend on, transparency and accountability matter. Services that hide their operators and resist scrutiny probably deserve that scrutiny.

Web archiving is important work. We need services that do it well. But we need services we can trust. Archive.today proved it can't be that service.

The internet is better off with that lesson learned, even if the learning process was painful.