DL 422

Good for Humans and Machines

Published: February 22, 2026 β€’ πŸ“§ Newsletter

We build tools to serve us. Then we forget to ask who's actually in control.

This week three stories surfaced the same tension from different angles. An archive site that quietly turned its readers into weapons. A file format from 1993 that still stumps the most advanced AI systems because it was built for people, not machines. An AI executive who compared his product to a human being to sidestep accountability.

The throughline isn't technology failing us. It's what happens when we stop asking who a tool is really built for and what it costs us when we find out too late.

If you've found value in these issues, subscribe here or support me here on Ko-fi.


πŸ—ƒοΈ The Archive That Lied

For a while, I used Archive.today regularly. When I shared a link in this newsletter and a subscriber couldn't get past a paywall, an archive link was the workaround. When a webpage changed or disappeared, it was a lightweight way to preserve what was actually there. It felt like a useful, neutral tool.

A few months ago, when the FBI subpoenaed the site's records, I assumed it was a standard copyright crackdown. The government siding with big media companies to protect subscriptions.

This week may have changed that. Wikipedia announced it's scrubbing nearly 700,000 links to Archive.today after discovering the site wasn't neutral at all. It had been turned into a personal weapon.

The situation started with a blogger named Jani Patokallio, who wrote a 2023 post investigating the identity of Archive.today's anonymous operator. The operator apparently took it personally: Archive.today's CAPTCHA pages were loaded with malicious code that hijacked visitors' browsers to launch a denial-of-service attack against Patokallio's blog. Every person who clicked an archive link was unknowingly conscripted into attacking a critic's website.

That alone would have been enough. But Wikipedia editors also found that Archive.today had edited its own snapshots, inserting Patokallio's name into archived pages where it didn't belong, apparently to mock or implicate him. The "frozen in time" record had been quietly altered to serve a grudge.

Why This Matters: When an archive site like Archive.today is apparently caught weaponizing its code and falsifying its data, it breaks the Internet’s "permanent record" in two ways.

First, it turned readers into unwitting participants in a cyberattack. Second, and more troubling, it broke the concept of an objective record. If an archive owner can edit "frozen" snapshots to settle a personal score, those links aren't proof of anything. They're a manipulation tool.

The cleanup is now underway, with nearly 700,000 links being replaced across Wikipedia with alternatives like the Internet Archive, a US-based nonprofit that operates transparently. The difference between the two was never really technical. It was always about accountability.


πŸ“„ The PDF Paradox

PDFs were designed in the 1990s to look the same on every screen and printer. Essentially pictures of documents, not structured data. That's great for humans. It's a nightmare for machines.

When AI tries to read a PDF, it sees coordinates and character codes rather than logical text. Columns, tables, footnotes, and handwritten annotations all become potential failure points. Even the best current models hallucinate content, mix up rows, or simply miss things.
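To make the coordinates-and-character-codes problem concrete, here is a minimal sketch with hypothetical glyph data: a PDF stores text as placements on a page, so a naive extractor that reads top-to-bottom, left-to-right will interleave the columns of a two-column layout.

```python
# Hypothetical glyph placements: (x, y, text), as a PDF content stream
# effectively stores them. Two columns, two lines each.
glyphs = [
    (72, 700, "Column one,"), (300, 700, "Column two,"),
    (72, 685, "line two."),   (300, 685, "line two."),
]

# Naive extraction: sort top-to-bottom, then left-to-right.
# The two columns get interleaved into nonsense.
naive = " ".join(t for _, _, t in sorted(glyphs, key=lambda g: (-g[1], g[0])))
print(naive)  # Column one, Column two, line two. line two.

# Layout-aware extraction: cluster by x position first (assume a column
# boundary near x=200), then read each column top to bottom.
left = " ".join(t for x, _, t in sorted(glyphs, key=lambda g: -g[1]) if x < 200)
right = " ".join(t for x, _, t in sorted(glyphs, key=lambda g: -g[1]) if x >= 200)
print(left, "|", right)
```

The column boundary here is hard-coded for illustration; real layout analysis has to infer it, which is exactly where tables, footnotes, and scans go wrong.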

This matters more than it might seem. Researchers digging through millions of government documents (the Epstein files, for instance) found them essentially unsearchable because the tools couldn't handle messy scans. Meanwhile, AI companies that have exhausted easy web text for training are realizing that the highest-quality knowledge (textbooks, legal filings, scientific papers) is locked inside billions of PDFs they can't reliably read.

How are they trying to fix it?

Instead of using one "big" AI, perhaps we need a "team" of small, specialized AIs, each handling a different job.

  1. One AI acts as a "Scanner," identifying headers, tables, and images.
  2. If it sees a table, it sends it to a "Table Specialist" AI.
  3. If it sees a chart, a "Graph Specialist" turns it into data.
  4. Finally, a "Vision AI" looks at the whole thing to make sure it makes sense.

Why This Matters: We are currently in a "race to liberate" trillions of pieces of information trapped in PDFs. We're about 98% of the way there, but that last 2% (handling coffee stains, scribbled notes, and weird formatting) is what separates a "smart" AI from a truly useful tool.

For those of us in education, it's worth noting that the effort going into building accessible, well-structured PDFs may have an unintended side effect: those same practices can make documents harder for AI models to scrape and train on. Perhaps accessibility and friction can work together. :)


πŸ”‹ The Energy Debate: AI vs. Humans

At an AI summit in India this week, OpenAI CEO Sam Altman called concerns about AI's water usage "completely fake" and "totally insane." He acknowledged total energy consumption is a real concern, then offered a comparison. Humans take 20 years of food and resources just to "get smart," so once an AI model is trained, it may actually be more energy-efficient per query than a human thought.
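It helps to see the back-of-envelope arithmetic behind that comparison. All numbers below are rough, publicly cited estimates, not measurements: roughly 0.3 Wh per chatbot query (published estimates range from about 0.3 to 3 Wh) and a human brain drawing about 20 watts continuously.

```python
# Assumed figures -- ballpark estimates only, chosen for illustration.
query_wh = 0.3          # energy per AI query, watt-hours (estimates vary widely)
brain_watts = 20        # typical human brain power draw
thought_seconds = 10    # assume a "thought" takes ten seconds

# Energy of one human "thought" in watt-hours.
human_thought_wh = brain_watts * thought_seconds / 3600
print(f"human thought ~ {human_thought_wh:.3f} Wh, AI query ~ {query_wh} Wh")
```

Under these assumptions the per-query numbers land in the same ballpark, which is precisely why the framing is slippery: it quietly excludes training runs, data-center construction, water for cooling, and the multiplier of billions of queries a day.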

It's worth sitting with what that comparison actually claims. Altman isn't just arguing that AI is efficient. He's implicitly equating AI models with humans as equivalent sources of value and intelligence. That's a significant leap, and a convenient one. It positions AI development as a natural extension of human cognition rather than a corporate infrastructure buildout serving specific economic interests.

The reframe also shifts the conversation from systems to individuals. The real question isn't whether one ChatGPT query uses more energy than one human thought. It's whether building and running AI infrastructure at global scale is being developed responsibly. That's a power question, not a physics one.

It's also worth noting there are no legal requirements for tech companies to disclose their actual energy or water use. Scientists have to study it independently, working around deliberate opacity. When Altman calls specific concerns "fake," he's doing so from behind a wall of non-disclosure that makes independent verification essentially impossible.

The companies building global infrastructure get to define the terms of the conversation about their own impact. In doing so, they frame the rest of us as irrational for even asking.


πŸ”Ž Consider

Frictions that slow us down can also be the frictions that make us think.
β€”Evan Selinger & Brett Frischmann, Re-Engineering Humanity

The stories this week aren't really about archives, PDFs, or energy debates. They're about who gets to define what's true, what's efficient, and what's worth questioning. Those aren't technical questions. They're human ones.

Perhaps friction isn't always the enemy. Sometimes it's the thing that keeps the record honest.


