Why Now Is the Moment to Back Up the Web

and rethink how we share knowledge

We must become undisciplined. The work we do requires new modes and methods of research and teaching; new ways of entering and leaving the archives of slavery.

Christina Sharpe, In The Wake


Over the past year, two intersecting crises have laid bare a hard truth: the Internet doesn’t always remember. First, grassroots archivists and volunteers scrambled to rescue tens of terabytes of U.S. government climate, health, and societal data slated for removal from official websites. Then, information preservation advocates sounded the alarm that swaths of scientific research, built on taxpayer-funded studies, could vanish with a single administrative whim.

In an age when even government data can vanish overnight, there’s never been a more urgent moment to support information-preservation efforts and to start backing up the web ourselves. Open access to reliable knowledge is the lifeblood of a healthy society, yet political pressures today threaten to lock down everything from climate records to classroom resources.

You’ll often hear “back up the web” in prepper circles, among people who build emergency plans and stockpile supplies in case society suddenly collapses. But a recent video from EposVox made me realize this isn’t just a survivalist hobby. It’s a professional imperative for anyone who teaches, researches, or learns online.

(Prepper: someone who prepares for a wide range of crises, from natural disasters and economic breakdowns to geopolitical upheaval, by planning and stockpiling resources in advance.)

As researchers and educators, we’re more than passive consumers of digital content. We’re active contributors and stewards of knowledge. If we want tomorrow’s learners, policymakers, and historians to find, trust, and build upon our work, we must act today to ensure our content survives and to model best practices for our communities.

That urgency drives us to safeguard the web’s archives and rethink how we share knowledge.

The Fragility of Online Knowledge

Why Researchers & Educators Must Lead the Charge

Rethinking “How, Where, When, and Why” You Publish

Aspect: Questions to Ask Yourself

How
    • Am I providing machine-readable metadata alongside my work?
    • Have I version-controlled my drafts and final outputs?

Where
    • Is this hosted on a single proprietary platform (Academia.edu, ResearchGate)?
    • Do I trust professional organizations and journals to keep my materials freely available online?
    • Do I have mirrored copies (institutional repository, GitHub, personal server)?

When
    • Do I update archives when I revise or retract content?
    • Have I timestamped my releases with a clear version history?

Why
    • Am I prioritizing broad access over platform convenience?
    • How does this posting serve the public good and future scholarship?

Actionable Mindsets, Steps, Tools & Communities

Mindsets to Adopt

Practical Steps

  1. Keep Track of Every Draft
    Think of your work like a photo album of versions, so you can always flip back in time. If you treat every save as a way-back snapshot, you’ll never lose an important idea, and you can always trace how your work evolved.
    • Google Docs/Drive: Every time you edit, Google automatically saves a “Version History.” Go to File → Version history → See version history, then click the three dots beside any entry to Name this version. Later, you can restore or download any snapshot.
    • Microsoft OneDrive/Office 365: Similar story—open your document in Word or PowerPoint online, then File → Info → Version History to view and restore past edits.
    • Dropbox: Right-click a file in your Dropbox folder, choose Version history, and you’ll see every saved change.
    • Simple File Naming: If you prefer your desktop, create a folder called MyPaper/ and save files like MyPaper_v1.docx, MyPaper_v2.docx, etc. It’s manual but foolproof.
  2. Capture the “Behind the Scenes” Too
    The scholarly journey is more than the published PDF. Sharing the process behind it not only bolsters reproducibility but also lets future learners see the real work behind the finished article.
    • Drafts & notes. Upload annotated manuscripts, early outlines, and meeting notes to your repo.
    • Raw data & code. Link to datasets (e.g., CSV, JSON) and analysis scripts in the same versioned project. Unsure whether your data is “sensitive”? Check out Purdue University Library’s guide on sensitive data, which can help you identify sensitive data and the laws and regulations that apply to it.
    • Idea logs. Keep a simple markdown diary of “aha!” moments or serendipitous finds. You’ll thank yourself later when combing back through your research arc.
  3. Archive on Publication
    When you hit “publish,” immediately deposit a copy in:
  4. Embed Rich Metadata
    Include clear titles, authors, dates, and licenses in your file headers or HTML <meta> tags—so future archivists can catalog your work accurately.
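
As one way to picture step 4, here is a minimal Python sketch that turns a title, authors, a date, and a license into HTML <meta> tags. The function name build_meta_tags, the sample values, and the choice of Dublin Core-style field names (DC.title, DC.creator, DC.date, DC.rights) are assumptions made for illustration, not something this article prescribes; your repository or publisher may expect a different scheme.

```python
from html import escape


def build_meta_tags(title, authors, date, license_url):
    """Return HTML <meta> tags describing a work, one tag per line.

    Field names follow the Dublin Core "DC." naming convention
    (an assumption for this sketch; other metadata schemes exist).
    """
    fields = [("DC.title", title)]
    # One DC.creator tag per author keeps each name separately catalogable.
    fields += [("DC.creator", name) for name in authors]
    fields += [("DC.date", date), ("DC.rights", license_url)]
    # escape() protects against quotes or ampersands inside the values.
    return "\n".join(
        f'<meta name="{escape(name)}" content="{escape(value)}">'
        for name, value in fields
    )


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(
        build_meta_tags(
            title="Field Notes on Web Preservation",
            authors=["Ada Example", "Bo Sample"],
            date="2025-01-15",
            license_url="https://creativecommons.org/licenses/by/4.0/",
        )
    )
```

Paste the resulting lines inside the <head> of the page that hosts your work. Writing the date in ISO 8601 form (YYYY-MM-DD) keeps it unambiguous for whoever catalogs the page later.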

Essential Tools

Communities to Join

Addressing The Elephant in the Room: AI Scraping Your Content

Large language models (LLMs) and data-harvesting bots are already mining open repositories, preprint servers, and even academic slides for training data. That creates both risks and opportunities:

How to navigate this landscape:

“We’re not just writing for today’s clicks, we’re preserving tomorrow’s discoveries.”

By embracing a culture of redundancy, openness, and community engagement, we can ensure that the web remains a reliable, enduring home for research and teaching. Let’s start today, because the history we save now will be tomorrow’s foundation.