Campus-wide research data publishing and preservation platform built on Harvard Dataverse infrastructure.
Role: Platform lead and institutional PI
Focus: Research data publishing, preservation, and discovery
Overview
UCLA Dataverse is the campus research data repository, built on the open-source Dataverse platform maintained by Harvard. We extended an inherited social science archive into a multi-disciplinary publishing platform serving researchers across the university.
What we built
We stood up and now operate the campus instance, including:
- Self-service dataset publishing for all disciplines
- Custom metadata schemas for domain-specific collections
- Integration with researcher workflows and grant compliance requirements
- Governance policies for dataset review, embargo, and access control
Current scale: 1,603 datasets, 54,655 files, 142,300+ downloads, 312 registered users.
How it works
Researchers deposit datasets directly or with support from DSC staff. The platform handles versioning, persistent identifiers (DOIs), metadata, and long-term preservation. Collections are browsable publicly; access-restricted datasets use controlled release workflows.
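Because every published dataset carries a DOI, its metadata can also be looked up programmatically. The sketch below only builds request URLs for two endpoints of the Dataverse native API (dataset lookup by persistent identifier, and search). The hostname, the helper names, and the sample DOI are placeholders chosen for illustration, not details of the campus instance.

```python
from urllib.parse import urlencode

# Hypothetical hostname; substitute the address of the actual instance.
BASE_URL = "https://dataverse.example.edu"


def dataset_metadata_url(doi: str, base_url: str = BASE_URL) -> str:
    """Build the URL that returns a dataset's JSON metadata, given its DOI."""
    # The API expects the identifier with a "doi:" scheme prefix.
    pid = doi if doi.startswith("doi:") else f"doi:{doi}"
    query = urlencode({"persistentId": pid})
    return f"{base_url}/api/datasets/:persistentId/?{query}"


def search_url(term: str, per_page: int = 10, base_url: str = BASE_URL) -> str:
    """Build a Search API URL restricted to datasets."""
    query = urlencode({"q": term, "type": "dataset", "per_page": per_page})
    return f"{base_url}/api/search?{query}"


if __name__ == "__main__":
    # 10.5072 is a test DOI prefix; the suffix here is made up.
    print(dataset_metadata_url("10.5072/FK2/ABCDEF"))
    print(search_url("climate data"))
```

Fetching either URL with any HTTP client returns JSON for public datasets; access-restricted files would additionally require an API token.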
In early 2025, the repository was targeted by automated, AI-driven mass downloading. We coordinated with UCLA IT and deployed a reverse-proxy filtering layer that contained the traffic and kept the repository available.
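The general idea behind a filtering layer of this kind can be sketched as follows. This is not the configuration that was deployed; it is a minimal illustration that combines a user-agent deny list with a per-client sliding-window request limit. The class name, thresholds, and blocked token are all assumptions made for the example.

```python
from collections import defaultdict, deque

# Hypothetical deny list: substrings matched against the User-Agent header.
BLOCKED_AGENT_TOKENS = ("examplebot",)


class RequestFilter:
    """Decide whether a request should be passed through to the repository."""

    def __init__(self, max_requests: int = 30, window: float = 60.0):
        self.max_requests = max_requests  # allowed requests per window
        self.window = window              # window length in seconds
        self._hits = defaultdict(deque)   # client IP -> recent timestamps

    def allow(self, client_ip: str, user_agent: str, now: float) -> bool:
        # Reject clients whose user agent matches the deny list.
        agent = user_agent.lower()
        if any(token in agent for token in BLOCKED_AGENT_TOKENS):
            return False

        # Drop timestamps that have aged out of the sliding window.
        hits = self._hits[client_ip]
        while hits and now - hits[0] >= self.window:
            hits.popleft()

        # Reject if the client has used up its budget for this window.
        if len(hits) >= self.max_requests:
            return False

        hits.append(now)
        return True


if __name__ == "__main__":
    f = RequestFilter(max_requests=2, window=10.0)
    for t in (0.0, 1.0, 2.0, 10.5):
        print(t, f.allow("203.0.113.7", "Mozilla/5.0", now=t))
```

In practice this logic usually lives in the proxy's own configuration rather than in application code; the Python version is only meant to make the decision rule explicit.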
Outcomes
- Primary campus infrastructure for research data publication
- Supports grant compliance and open data mandates from funders such as NIH and NSF
- Contributed to two 2024 Public Impact Research Award–winning projects
Collaborators
- Harvard Dataverse development team
- UCLA Office of Research
- UCLA IT Services