The Odum Data Archive: 50 Years of Innovation
The Odum Institute Data Archive celebrates five decades of excellence in data preservation, curation and sharing.
By: Alana Edwards
History
The Odum Institute Data Archive was formally established in 1969 using funds from a National Science Foundation grant. Odum holds one of the largest repositories of machine-readable social science data in the nation.
The archive houses historic state-level polling and social science datasets focused on the South, including:
- The Louis Harris Data Center
- The National Network of State Polls
- The Carolina Poll
- The Southern Focus Poll
- The most complete collection of 1970 U.S. Census datasets
The Data Archive’s early collections speak to Howard W. Odum’s passion for progressive research on the American South. Mandy Gooch, the Odum Institute’s Research Data Archivist, recommends diving into those early datasets on UNC Dataverse to examine changes in public opinion on topics that are still pertinent in our state and nation today, including gender and racial inequality.
The Odum Institute’s role as a steward and curator of social science data has evolved significantly since 1969. In the Digital Age, the Odum Data Archive retains its commitment to data curation and preservation while adopting new functions in data sharing and verification.
UNC Dataverse
Fifty years after its founding, the Data Archive hosts and manages almost 25,000 datasets on the University of North Carolina Dataverse. The Odum Institute was an early adopter of the Dataverse Project, an open-source software application created by Harvard’s Institute for Quantitative Social Science. Dataverse facilitates the management, publication, discovery and analysis of research data.
UNC Dataverse serves as a preservation mechanism, according to archivist Mandy Gooch.
“Our mission and our sustainability plan is to make sure this data is available and accessible in perpetuity,” says Gooch.
Dataverse stores data and protects it from obsolescence or disappearance. Thanks to UNC Dataverse, researchers around the world will be able to view and analyze Odum’s historic datasets for generations to come.
Reproducible Research
The Odum Institute goes beyond open science and open access. The Data Archive supports high-quality, reproducible research. Publicly sharing the analysis code, data and supplementary materials behind a study benefits the scientific community. When researchers archive and share these pieces in a repository like Dataverse, other researchers can reproduce the results and expand upon the original study.
The Data Archive works with the American Journal of Political Science (AJPS) and State Politics & Policy Quarterly to implement data curation and verification standards based on statistical best practices. Both journals require authors to submit a data package alongside a manuscript before publication.
Authors submit their data through Odum’s workflow and through the Dataverse platform. Before verification, the Data Archive team tests the code to ensure that it is correct and consistent with the dataset.
If everything verifies successfully, the analysis and materials are published in Dataverse, which automatically creates citations with persistent identifiers (DOIs). Journal editors publish the DOI alongside each manuscript so that readers have a direct link to explore the data themselves.
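For readers curious about what such a check involves, here is a minimal Python sketch of the idea, not the Data Archive's actual tooling. It assumes a curator has already re-run the author's code and collected the key statistics. The function names, the example values and the placeholder DOI are all hypothetical.

```python
import math


def verify_results(reported, reproduced, rel_tol=1e-6, abs_tol=1e-9):
    """Compare statistics reported in a manuscript with re-run output.

    Both arguments are dicts mapping a statistic's name to a number.
    Returns a list of discrepancy messages; an empty list means every
    reported value was reproduced within the given tolerance.
    (Hypothetical helper for illustration only.)
    """
    problems = []
    for name, expected in reported.items():
        if name not in reproduced:
            problems.append(f"{name}: missing from reproduced output")
        elif not math.isclose(expected, reproduced[name],
                              rel_tol=rel_tol, abs_tol=abs_tol):
            problems.append(
                f"{name}: reported {expected}, reproduced {reproduced[name]}"
            )
    return problems


def doi_url(doi):
    """Build the resolver link that a journal can print next to an article."""
    return f"https://doi.org/{doi}"


if __name__ == "__main__":
    # Made-up numbers standing in for a manuscript table and a re-run.
    reported = {"coef_income": 0.42, "n_obs": 1500}
    reproduced = {"coef_income": 0.42, "n_obs": 1500}

    issues = verify_results(reported, reproduced)
    if issues:
        print("Verification failed:")
        for issue in issues:
            print(" -", issue)
    else:
        # Placeholder DOI, not a real dataset identifier.
        print("Verified. Data available at", doi_url("10.5555/EXAMPLE"))
```

The tolerance-based comparison is one possible design choice: small floating-point differences between machines are common, so an exact equality check would flag results that are, for practical purposes, reproduced.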
“[Reproducibility] adds value to the scientific community, to future collaborations and just builds upon itself,” says Mandy Gooch. “I would say that Odum is one of the only institutes currently working in the reproducibility field with other journals and researchers.”
As more scholarly journals and funding agencies move in the direction of data verification and reproducibility, demand increases for reproducibility support services. However, the process requires time and expertise. To break down these barriers, the Odum Institute aims to make verification tools more accessible and cost-effective.
The Alfred P. Sloan Foundation recently awarded Odum a grant for a three-year project, Confirmable Reproducible Research (CoRe2) Environment: Linking Tools to Promote Computational Reproducibility.
Jonathan Crabtree, Assistant Director of Cyberinfrastructure, and Thu-Mai Christian, Assistant Director for Archives, are developing automated scientific verification and research replication workflows.
The CoRe2 environment will give researchers automatic feedback on their submissions, replacing the manual step of running code against a dataset. CoRe2 will also connect to UNC Dataverse.
By automating tools and workflows to reduce cost, the Sloan-funded project intends to make the data verification process accessible to all journals and researchers.
CoRe2 will lead to more open and rigorous data policies, and in turn more transparency and reproducibility in research.
Beyond Social Science
As research trends towards the interdisciplinary, the Odum Data Archive remains on the cutting edge by expanding beyond the social sciences. Supporting data practices and sharing across disciplines allows Odum to better serve UNC’s larger research mission.
“In this day and age, it’s important for us to go beyond social science,” says Gooch. “The research is part of a lot of disciplines— not just the way they conduct the research, but the questions that those fields are asking.”
A collaboration with the Carolina Population Center (CPC) led Odum to expand its reach into public health and other disciplines. CPC uses UNC Dataverse as a repository for its publicly available data. UNC Dataverse holds more than 120 CPC datasets, including those from projects such as MEASURE Evaluation and Add Health.
Because most research sits at the intersection of multiple disciplines, expanding UNC Dataverse to house contributions beyond social science felt natural. Today, Odum’s data curators support researchers in managing, archiving and sharing their data, regardless of field. The Odum Institute believes that interdisciplinary collaboration and open data sharing strengthen the research community and benefit our world.
New Frontiers
In 2019, the Data Archive embodies the Odum Institute’s tradition of progress and excellence in research. While Odum’s archivists preserve distinguished archival collections for future generations, they also contribute to a more open and accessible data verification process for today’s researchers.
The Data Archive has modernized data management and sharing using UNC Dataverse, and will transform reproducibility through the CoRe2 project.
In the increasingly connected and fast-paced world of research, the Odum Institute’s Data Archive does more than keep up: it sets new standards.