UCLA Dataverse

2019–present

Campus-wide research data publishing and preservation platform built on the open-source Dataverse software developed at Harvard.

data publishing
repository
open data
preservation

Role: Platform lead and institutional PI
Focus: Research data publishing, preservation, and discovery

Overview

UCLA Dataverse is the campus research data repository, built on the open-source Dataverse platform maintained by Harvard. We extended an inherited social science archive into a multi-disciplinary publishing platform serving researchers across the university.

What we built

We stood up and now operate the campus instance, including:

Self-service dataset publishing for all disciplines
Custom metadata schemas for domain-specific collections
Integration with researcher workflows and grant compliance requirements
Governance policies for dataset review, embargo, and access control

Current scale: 1,603 datasets, 54,655 files, 142,300+ downloads, 312 registered users.

How it works

Researchers deposit datasets directly or with support from DSC staff. The platform handles versioning, persistent identifiers (DOIs), metadata, and long-term preservation. Collections are browsable publicly; access-restricted datasets use controlled release workflows.
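Because every published dataset carries a DOI, its metadata and version history can also be looked up programmatically through the Dataverse native API. The sketch below only builds the request URLs. The host name and the DOI are placeholders rather than real UCLA values, and the helper function names are illustrative.

```python
from urllib.parse import urlencode

# Assumption: placeholder host; substitute the real address of the instance.
BASE_URL = "https://dataverse.example.edu"


def dataset_metadata_url(doi: str, base_url: str = BASE_URL) -> str:
    """Build the native API URL that returns a dataset's metadata by DOI."""
    query = urlencode({"persistentId": f"doi:{doi}"})
    return f"{base_url}/api/datasets/:persistentId/?{query}"


def dataset_versions_url(doi: str, base_url: str = BASE_URL) -> str:
    """Build the native API URL that lists a dataset's versions by DOI."""
    query = urlencode({"persistentId": f"doi:{doi}"})
    return f"{base_url}/api/datasets/:persistentId/versions?{query}"


if __name__ == "__main__":
    # Hypothetical DOI, used only to show the URL shape.
    example_doi = "10.5072/FK2/EXAMPLE"
    print(dataset_metadata_url(example_doi))
    print(dataset_versions_url(example_doi))
```

An HTTP GET on either URL returns JSON for public datasets. Access-restricted or unpublished datasets additionally need an API token, which this sketch does not cover.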

In early 2025, the repository was targeted by AI-driven mass downloading. We coordinated with UCLA IT and deployed a reverse-proxy filtering layer to contain the attack and maintain availability.
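The rules used in the production filtering layer are not described here. As a rough illustration of one kind of check such a layer can apply, the sketch below implements a per-client sliding-window request limit. The class name, the thresholds, and the choice of client key (for example an IP address) are all assumptions, not the deployed configuration.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Timestamps of recently accepted requests, kept per client key.
        self._hits = defaultdict(deque)

    def allow(self, client: str, now: float) -> bool:
        """Return True if the request may pass, False if it should be rejected."""
        hits = self._hits[client]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    # Illustrative threshold: 2 requests per 10 seconds per client.
    limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.0):
        print(t, limiter.allow("203.0.113.7", now=t))
```

In practice this kind of limit is usually configured in the proxy itself rather than in application code, and it is typically combined with other signals such as user-agent patterns or request paths.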

Outcomes

Primary campus infrastructure for research data publication
Supports grant compliance and open data mandates from funders such as NIH and NSF
Contributed to two 2024 Public Impact Research Award–winning projects

Collaborators

Harvard Dataverse development team
UCLA Office of Research
UCLA IT Services

Selected links

UCLA Dataverse