DL 422

Good for Humans and Machines

Published: February 22, 2026 β€’ πŸ“§ Newsletter

We build tools to serve us. Then we forget to ask who's actually in control.

This week three stories surfaced the same tension from different angles. An archive site that quietly turned its readers into weapons. A file format from 1993 that still stumps the most advanced AI systems because it was built for people, not machines. An AI executive who compared his product to a human being to sidestep accountability.

The throughline isn't technology failing us. It's what happens when we stop asking who a tool is really built for and what it costs us when we find out too late.

If you've found value in these issues, subscribe here or support me here on Ko-fi.


πŸ—ƒοΈ The Archive That Lied

For a while, I used Archive.today regularly. When I shared a link in this newsletter and a subscriber couldn't get past a paywall, an archive link was the workaround. When a webpage changed or disappeared, it was a lightweight way to preserve what was actually there. It felt like a useful, neutral tool.

A few months ago, when the FBI subpoenaed the site's records, I assumed it was a standard copyright crackdown. The government siding with big media companies to protect subscriptions.

This week may have changed that. Wikipedia announced it's scrubbing nearly 700,000 links to Archive.today after discovering the site wasn't neutral at all. It had been turned into a personal weapon.

The situation started with a blogger named Jani Patokallio, who wrote a 2023 post investigating the identity of Archive.today's anonymous operator. The operator apparently took it personally: Archive.today's CAPTCHA pages were loaded with malicious code that hijacked visitors' browsers to launch a denial-of-service attack against Patokallio's blog. Every person who clicked an archive link was unknowingly conscripted into attacking a critic's website.

That alone would have been enough. But Wikipedia editors also found that Archive.today had edited its own snapshots, inserting Patokallio's name into archived pages where it didn't belong, apparently to mock or implicate him. The "frozen in time" record had been quietly altered to serve a grudge.

Why This Matters: When an archive site like Archive.today is apparently caught weaponizing its code and falsifying its data, it breaks the Internet’s "permanent record" in two ways.

First, it turned readers into unwitting participants in a cyberattack. Second, and more troubling, it broke the concept of an objective record. If an archive owner can edit "frozen" snapshots to settle a personal score, those links aren't proof of anything. They're a manipulation tool.

The cleanup is now underway, with nearly 700,000 links being replaced across Wikipedia with alternatives like the Internet Archive, a US-based nonprofit that operates transparently. The difference between the two was never really technical. It was always about accountability.


πŸ“„ The PDF Paradox

PDFs were designed in the 1990s to look the same on every screen and printer. Essentially pictures of documents, not structured data. That's great for humans. It's a nightmare for machines.

When AI tries to read a PDF, it sees coordinates and character codes rather than logical text. Columns, tables, footnotes, and handwritten annotations all become potential failure points. Even the best current models hallucinate content, mix up rows, or simply miss things.
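To make the coordinates-and-character-codes problem concrete, here is a minimal sketch with hypothetical glyph data: a PDF stores text as placements on a page, so a naive extractor that reads top-to-bottom, left-to-right will interleave the columns of a two-column layout.

```python
# Hypothetical glyph placements: (x, y, text), as a PDF content stream
# effectively stores them. Two columns, two lines each.
glyphs = [
    (72, 700, "Column one,"), (300, 700, "Column two,"),
    (72, 685, "line two."),   (300, 685, "line two."),
]

# Naive extraction: sort top-to-bottom, then left-to-right.
# The two columns get interleaved into nonsense.
naive = " ".join(t for _, _, t in sorted(glyphs, key=lambda g: (-g[1], g[0])))
print(naive)  # Column one, Column two, line two. line two.

# Layout-aware extraction: cluster by x position first (assume a column
# boundary near x=200), then read each column top to bottom.
left = " ".join(t for x, _, t in sorted(glyphs, key=lambda g: -g[1]) if x < 200)
right = " ".join(t for x, _, t in sorted(glyphs, key=lambda g: -g[1]) if x >= 200)
print(left, "|", right)
```

The column boundary here is hard-coded for illustration; real layout analysis has to infer it, which is exactly where tables, footnotes, and scans go wrong.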

This matters more than it might seem. Researchers digging through millions of government documents (the Epstein files, for instance) found them essentially unsearchable because the tools couldn't handle messy scans. Meanwhile, AI companies that have exhausted easy web text for training are realizing that the highest-quality knowledge (textbooks, legal filings, scientific papers) is locked inside billions of PDFs they can't reliably read.

How are they trying to fix it?

Instead of using one "big" AI, perhaps we need a "team" of small, specialized AIs, each handling a different job.

  1. One AI acts as a "Scanner," identifying headers, tables, and images.
  2. If it sees a table, it sends it to a "Table Specialist" AI.
  3. If it sees a chart, a "Graph Specialist" turns it into data.
  4. Finally, a "Vision AI" looks at the whole thing to make sure it makes sense.

Why This Matters: We are currently in a "race to liberate" trillions of pieces of information trapped in PDFs. We're about 98% of the way there, but that last 2% (handling coffee stains, scribbled notes, and weird formatting) is what separates a "smart" AI from a truly useful tool.

For those of us in education, it's worth noting that the effort going into building accessible, well-structured PDFs may have an unintended side effect: those same practices can make documents harder for AI models to scrape and train on. Perhaps accessibility and friction can work together. :)


πŸ”‹ The Energy Debate: AI vs. Humans

At an AI summit in India this week, OpenAI CEO Sam Altman called concerns about AI's water usage "completely fake" and "totally insane." He acknowledged total energy consumption is a real concern, then offered a comparison. Humans take 20 years of food and resources just to "get smart," so once an AI model is trained, it may actually be more energy-efficient per query than a human thought.
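It helps to see the back-of-envelope arithmetic behind that comparison. All numbers below are rough, publicly cited estimates, not measurements: roughly 0.3 Wh per chatbot query (published estimates range from about 0.3 to 3 Wh) and a human brain drawing about 20 watts continuously.

```python
# Assumed figures -- ballpark estimates only, chosen for illustration.
query_wh = 0.3          # energy per AI query, watt-hours (estimates vary widely)
brain_watts = 20        # typical human brain power draw
thought_seconds = 10    # assume a "thought" takes ten seconds

# Energy of one human "thought" in watt-hours.
human_thought_wh = brain_watts * thought_seconds / 3600
print(f"human thought ~ {human_thought_wh:.3f} Wh, AI query ~ {query_wh} Wh")
```

Under these assumptions the per-query numbers land in the same ballpark, which is precisely why the framing is slippery: it quietly excludes training runs, data-center construction, water for cooling, and the multiplier of billions of queries a day.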

It's worth sitting with what that comparison actually claims. Altman isn't just arguing that AI is efficient. He's implicitly equating AI models with humans as equivalent sources of value and intelligence. That's a significant leap, and a convenient one. It positions AI development as a natural extension of human cognition rather than a corporate infrastructure buildout serving specific economic interests.

The reframe also shifts the conversation from systems to individuals. The real question isn't whether one ChatGPT query uses more energy than one human thought. It's whether building and running AI infrastructure at global scale is being developed responsibly. That's a power question, not a physics one.

It's also worth noting there are no legal requirements for tech companies to disclose their actual energy or water use. Scientists have to study it independently, working around deliberate opacity. When Altman calls specific concerns "fake," he's doing so from behind a wall of non-disclosure that makes independent verification essentially impossible.

The companies building global infrastructure get to define the terms of the conversation about their own impact. In doing so, they frame the rest of us as irrational for even asking.


πŸ”Ž Consider

Frictions that slow us down can also be the frictions that make us think.
β€”Evan Selinger & Brett Frischmann, Re-Engineering Humanity

The stories this week aren't really about archives, PDFs, or energy debates. They're about who gets to define what's true, what's efficient, and what's worth questioning. Those aren't technical questions. They're human ones.

Perhaps friction isn't always the enemy. Sometimes it's the thing that keeps the record honest.


