Research data infrastructure, computational support, and student workforce development at UCLA Library.
Role: Director (2017–present)
Focus: Research data infrastructure, computational support, student workforce development
Overview
The UCLA Data Science Center exists to make data and computational research durable.
That means more than workshops. It means building infrastructure, governance, and workforce capacity so that research can move from idea to publication to preservation without collapsing under technical friction.
My role is to set vision, align priorities, and make sure the systems keep working.
What we built
UCLA Dataverse
We turned UCLA Dataverse from a legacy archive into a campus publishing platform.
- 1,603 datasets
- 54,655 files
- 142,300+ downloads
- 312 registered users
It supports self-service publishing across disciplines without requiring proportional staffing increases.
In early 2025, the repository was targeted by AI-driven mass downloading. I coordinated with UCLA IT and deployed a reverse-proxy filtering layer to contain the attack and protect long-term availability.
Infrastructure only matters if it survives stress.
Restricted and sensitive data: Redivis
Before 2023, UCLA had no compliant environment for multi-terabyte restricted datasets.
We licensed and implemented Redivis, established governance for P3/P4 (UC Protection Level 3 and 4) data, and designed workflows that scale without routing every request through senior staff.
- Grew from 5 workflows and 444 GB processed to 2.9 PB processed in 2024
The platform now supports nationally recognized research, including work published in APSR and PNAS.
DataSquad: workforce as infrastructure
Consulting demand consistently exceeded staff capacity.
We adopted and adapted the DataSquad model to train undergraduates as research data consultants.
During the review period:
- 1,175 documented service interactions
- 190 consultations handled by DataSquad students (24%)
- 100% placement rate for a full cohort
Students contributed to award-winning research and gained real professional experience. That dual impact is intentional.
How it works
The Center operates across:
- Data publishing and preservation
- Sensitive data governance
- Mid-tier computing resources
- Consulting and project collaboration
- Student workforce development
- Systemwide education partnerships
The emphasis is on continuity. Programs should not depend on a single person.
Outcomes
Two of UCLA’s six 2024 Public Impact Research Award winners were directly supported by DSC infrastructure and services.
For Carceral Ecologies, we cut processing time from 318 minutes to 2 minutes and supported the infrastructure behind a $5M follow-on grant.
Infrastructure work is invisible when it succeeds. It becomes visible only when it fails. My goal is to keep it from failing.
Collaborators
- UCLA Library colleagues
- UCLA Office of Research
- UCLA IT Services
- DataSquad students and alumni