Breaking: Nonprofit Digital Library Compromised
The Internet Archive, home to the Wayback Machine and 31 million users, has suffered three separate cyber attacks in October 2024, exposing fundamental security failures in nonprofit digital infrastructure.
Executive Summary
October 2024 will be remembered as one of the darkest months in the history of digital preservation. The Internet Archive, a nonprofit organization that has served as humanity's digital memory for over 25 years, has been brought to its knees by a series of sophisticated and opportunistic cyber attacks.
The attacks exposed 31 million user accounts, compromised support tickets dating back to 2018, and revealed a pattern of security negligence that should serve as a wake-up call for nonprofit organizations worldwide. More critically, they demonstrate the vulnerability of our digital cultural heritage to motivated threat actors.
First Attack: The Data Breach (October 9, 2024)
October 9, 2024: Initial Compromise
The first attack was discovered when visitors to archive.org encountered an unexpected JavaScript alert message that read:
"Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach? It just happened. See 31 million of you on HIBP!"
This brazen message, injected directly into the website by the attackers, announced the compromise in the most public way possible.
Attack Vector: Exposed GitLab Token
The breach was enabled by a GitLab authentication token that had been exposed since 2022. This token, which should have provided temporary access and been rotated regularly, remained active and unprotected for nearly two years.
Using this token, attackers gained access to:
- Internet Archive's source code repositories
- Database credentials and connection strings
- User authentication systems
- Historical data stores
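A long-lived token sitting in a repository or config file is the kind of exposure a basic pattern scan can catch. The sketch below is illustrative only and does not describe the Archive's tooling. It assumes GitLab's default "glpat-" personal access token prefix followed by at least 20 characters, plus a hypothetical catch-all pattern for quoted secrets. The names TOKEN_PATTERNS and scan_text are invented for this example.

```python
import re

# Assumption: "glpat-" is GitLab's default personal access token prefix.
# The generic pattern is a hypothetical catch-all for quoted secrets.
TOKEN_PATTERNS = {
    "gitlab_pat": re.compile(r"glpat-[0-9A-Za-z_-]{20,}"),
    "generic_secret": re.compile(
        r"(?i)\b(?:api[_-]?key|token|secret|password)\b\s*[:=]\s*['\"]([^'\"\s]{12,})['\"]"
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in TOKEN_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Running a scan like this over every commit, rather than only the current checkout, matters because a token removed from the latest revision can still sit in repository history.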
Data Compromised
The stolen data included:
- Email addresses for all 31 million registered users
- Screen names and usernames
- Bcrypt-hashed passwords (hashed and salted rather than encrypted, but weak passwords remain crackable offline)
- Account creation dates
- Internal user identifiers
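Why do hashed passwords still matter to an attacker? A slow, salted hash makes each guess expensive, but it cannot protect a password that appears on a common wordlist. The sketch below illustrates this with a small dictionary attack. It uses PBKDF2 from the Python standard library as a stand-in for bcrypt, which is not in the standard library. The iteration count, function names, and wordlist are assumptions chosen for the example.

```python
import hashlib
from typing import Optional

ITERATIONS = 10_000  # kept low so the sketch runs quickly; real settings are higher


def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 stands in for bcrypt here: both are salted and deliberately slow.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def dictionary_attack(stored_hash: bytes, salt: bytes, wordlist: list[str]) -> Optional[str]:
    """Hash each candidate with the leaked salt; return the match, or None."""
    for candidate in wordlist:
        if hash_password(candidate, salt) == stored_hash:
            return candidate
    return None
```

A password that appears in the wordlist falls after a handful of guesses, while a long random one does not, which is why affected users were urged to change reused passwords.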
Simultaneous DDoS Attack
Compounding the data breach, a hacktivist group calling itself "SN_BLACKMETA" launched a distributed denial-of-service (DDoS) attack against archive.org. Although the two incidents appeared coordinated, the Internet Archive confirmed that they were separate attacks with different motivations.
The DDoS attack, while less sophisticated than the data breach, succeeded in taking the Wayback Machine offline for several days, disrupting access to 866 billion archived web pages.
Second Attack: Zendesk Token Exploitation (October 20, 2024)
Just as the Internet Archive was beginning to recover from the first attack, attackers struck again, this time exploiting exposed Zendesk API tokens to access years of support tickets.
The Vulnerability
Like the GitLab token, the Zendesk API credentials were not rotated after the initial breach. This is a fundamental incident response failure: after any security incident, every authentication credential should be invalidated immediately.
Support Ticket Exposure
The compromised support system contained:
- Six years of support tickets dating back to 2018
- Personal identification documents submitted by users requesting content removal
- Correspondence about sensitive content users wanted removed from the Archive
- Detailed account information from troubleshooting interactions
- Internal communication between Archive staff members
The DMCA Request Problem
Particularly concerning is the exposure of DMCA takedown requests and content removal tickets. Users who requested removal of sensitive, potentially identifying, or damaging content had their identities and concerns exposed to the very attackers who now have leverage for extortion.
This creates a secondary victimization scenario where individuals seeking to protect their privacy are instead more vulnerable than ever.
Third Attack: The Pattern Continues
The Internet Archive confirmed a third security breach in October 2024, though full details remain limited. What is clear is that the attackers retained access to Internet Archive systems throughout the month, probably through additional exposed credentials or backdoors established during the initial compromise.
This third attack underscores a critical point: the Internet Archive never fully regained control of its security posture after the first breach.
Root Cause Analysis: Why This Happened
1. Credential Management Failures
The most glaring security failure was the lack of credential rotation:
- GitLab token exposed since 2022 - Two years of exposure before exploitation
- No automated rotation policies - Tokens set to never expire
- Post-breach credentials not invalidated - Zendesk tokens remained active after initial compromise
- No monitoring for token usage - Unauthorized access went undetected
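The last point, monitoring token usage, can start very simply: compare where a token is used from against where it is expected to be used from. The sketch below is a minimal illustration, not a description of any real monitoring product. The TokenUse record, the unexpected_uses function, and the allowlisted network ranges are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class TokenUse:
    token_id: str   # hypothetical identifier for the credential
    source_ip: str  # address the authenticated request came from


def unexpected_uses(events: list[TokenUse], allowed_networks: list[str]) -> list[TokenUse]:
    """Return token uses whose source address is outside every allowed network."""
    nets = [ipaddress.ip_network(n) for n in allowed_networks]
    return [
        e for e in events
        if not any(ipaddress.ip_address(e.source_ip) in net for net in nets)
    ]
```

A check like this only helps if someone acts on the result, so flagged events would need to feed an alert or trigger automatic revocation rather than sit in a log.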
2. Nonprofit Resource Constraints
The Internet Archive operates on a nonprofit budget with limited cybersecurity resources:
- No dedicated security operations center (SOC)
- Limited budget for enterprise security tools
- Small IT team managing massive infrastructure
- Competing priorities between preservation and security
3. Legacy System Technical Debt
The Internet Archive's infrastructure has grown organically over 25+ years, resulting in:
- Mixture of old and new systems with inconsistent security standards
- Complex interdependencies making security updates risky
- Limited documentation of security controls
- Accumulated technical debt in authentication systems
4. Incident Response Gaps
The response to the first breach revealed critical gaps:
- Incomplete credential inventory and rotation
- No comprehensive system access audit
- Limited forensic capability to understand full breach scope
- Insufficient communication with affected users
The Broader Context: Nonprofit Organizations Under Attack
The Internet Archive is not alone. Knowledge institutions have faced a run of attacks over the past two years:
- British Library - Ransomware attack in 2023, still recovering
- Calgary Public Library - Systems compromised in 2024
- Seattle Public Library - Ransomware attack forced system shutdown
- Toronto Public Library - Major breach affecting patron data
Why Libraries and Archives?
These institutions are attractive targets because they:
- Hold valuable data - Millions of user accounts and personal information
- Have limited security budgets - Nonprofits prioritize mission over security
- Run legacy systems - Decades-old infrastructure with known vulnerabilities
- Lack incident response capabilities - No dedicated security teams
- Face public pressure to maintain access - Can't easily take systems offline
Impact on Australian Users and Organizations
For Australian individuals and organizations who rely on the Internet Archive:
Immediate Risks
- Credential stuffing attacks - If Archive passwords were reused elsewhere
- Targeted phishing campaigns - Using leaked email addresses
- Social engineering - Attackers know who uses archival services
- Research disruption - Academic and legal research depends on Archive access
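For the password reuse risk, one way to check whether a password already appears in known breach data is the Pwned Passwords range API run by HIBP, the service named in the attackers' message. Only the first five characters of the password's SHA-1 hash are sent to https://api.pwnedpasswords.com/range/<prefix>, and the response lists matching hash suffixes with counts. The sketch below covers the hashing and parsing steps only and makes no network request. The function names are assumptions for this example.

```python
import hashlib


def split_hash(password: str) -> tuple[str, str]:
    """Return the 5-character SHA-1 prefix sent to the API and the suffix kept locally."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix: str, range_response: str) -> int:
    """Parse a response body of 'SUFFIX:COUNT' lines; return 0 if the suffix is absent."""
    for line in range_response.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0
```

Because the full hash never leaves the user's machine, the check can be run without revealing the password to the service being queried.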
Recommendations for Australian Users
If You Have an Internet Archive Account
Lessons for Australian Organizations
1. Credential Management is Critical
The Internet Archive's failure came down to basic credential hygiene:
- Implement automated API token rotation (maximum 90-day lifespan)
- Maintain an inventory of all service accounts and API keys
- Monitor for exposed credentials in public repositories
- Immediately rotate all credentials after any security incident
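The first two points can be combined into a small audit: keep an inventory of tokens and flag any that break the 90-day rule. The sketch below assumes a hypothetical inventory format (the ApiToken record) and hypothetical token names. A real inventory would more likely come from each service's admin API or a secrets manager.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

MAX_LIFETIME = timedelta(days=90)  # the maximum lifespan suggested above


@dataclass
class ApiToken:
    name: str
    created: date
    expires: Optional[date]  # None means the token never expires


def policy_violations(tokens: list[ApiToken], today: date) -> list[tuple[str, str]]:
    """Return (token name, reason) for each token that breaks the rotation policy."""
    problems = []
    for t in tokens:
        if t.expires is None:
            problems.append((t.name, "no expiry set"))
        elif t.expires - t.created > MAX_LIFETIME:
            problems.append((t.name, "lifetime exceeds 90 days"))
        elif t.expires < today:
            problems.append((t.name, "expired but still in inventory"))
    return problems
```

A token created in 2022 with no expiry, as in the Internet Archive case, would be the first thing an audit like this reports.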
2. Nonprofit ≠ Not a Target
Australian nonprofits must recognize they are attractive targets:
- Cultural institutions hold valuable historical data
- Charities maintain donor and beneficiary information
- Educational nonprofits store student and research data
- Advocacy organizations are targeted for political reasons
3. Incident Response Cannot Wait
The second and third attacks were enabled by incomplete incident response:
- Have a documented incident response plan before you need it
- Rotate ALL credentials immediately upon detecting compromise
- Engage external forensic assistance for serious breaches
- Communicate transparently with affected users
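"Rotate ALL credentials" is only verifiable against a complete inventory. As a minimal sketch, assuming each credential record carries a last-rotated timestamp, the check below lists every system whose credential still predates the moment the incident was detected. The Credential record and the system names in the usage example are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Credential:
    system: str             # hypothetical label for where the credential is used
    last_rotated: datetime  # when the credential was last replaced


def still_exposed(inventory: list[Credential], incident_detected: datetime) -> list[str]:
    """Return systems whose credential was last rotated before the incident was detected."""
    return [c.system for c in inventory if c.last_rotated < incident_detected]
```

For example, with an incident detected on October 9, a source control credential rotated on October 10 passes the check, while a helpdesk API credential last rotated months earlier is reported as still exposed:

```python
detected = datetime(2024, 10, 9, 12, 0)
inventory = [
    Credential("source-control", datetime(2024, 10, 10, 8, 0)),
    Credential("helpdesk-api", datetime(2024, 3, 1, 9, 0)),
]
print(still_exposed(inventory, detected))
```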
4. Security is a Budget Priority
Cybersecurity cannot be treated as a discretionary expense:
- Allocate a minimum of 5-10% of the IT budget to security
- Invest in security tools before pursuing new features
- Factor security costs into grant applications
- Consider cyber insurance for nonprofits
The Digital Preservation Crisis
Beyond the immediate security incident, the Internet Archive attacks raise existential questions about digital preservation:
Who Protects Our Digital Heritage?
The Internet Archive preserves:
- 866 billion archived web pages
- 44 million books and texts
- 10.6 million videos
- 4.8 million audio recordings
- 4 million images
This represents an irreplaceable record of human culture and knowledge. Yet it's protected by an organization with fewer security resources than a mid-sized e-commerce company.
The Funding Model Problem
Digital preservation operates on a broken economic model:
- Users expect free access to preserved content
- Governments provide minimal funding for digital preservation
- Private donors prioritize new initiatives over maintenance
- Security improvements are invisible and hard to fundraise for
A Call for Government Action
Australia should consider:
- Establishing a National Digital Preservation Fund
- Requiring minimum security standards for cultural institutions
- Providing cybersecurity resources to underfunded archives
- Creating redundant preservation infrastructure
- Coordinating international digital preservation efforts
Current Status and Recovery
As of October 24, 2024:
- Wayback Machine: Restored with limited functionality
- Archive.org blog: Online
- Archive-It service: Available
- OpenLibrary: Partially restored
- Email services: Offline pending security review
- User authentication: Disabled until password reset process complete
Recovery Timeline
The Internet Archive has indicated that full service restoration will take several more weeks as they:
- Complete forensic investigation of all three breaches
- Rebuild authentication infrastructure from scratch
- Implement new security controls and monitoring
- Rotate all service credentials and API keys
- Conduct penetration testing before reopening services
Conclusion: A Preventable Tragedy
The Internet Archive attacks were entirely preventable. A GitLab token exposed for two years. Credentials not rotated after a major breach. Multiple warnings ignored or undetected.
But this is not just a story of Internet Archive's failures. It's a story about how we as a society undervalue and underfund the institutions that preserve our digital heritage.
For Australian organizations, particularly nonprofits in the cultural, educational, and charitable sectors, the message is clear: you are targets, and basic security hygiene is not optional.
The Internet Archive will recover from these attacks. The question is whether the broader nonprofit sector will learn the lessons before their own preventable tragedy occurs.
Protect Your Organization
Don't wait for a breach to prioritize cybersecurity. Australian nonprofits and cultural institutions can access security resources and expert guidance.