
Libraries offers workshops on research reproducibility and data management in R

Credit: Chris Blaska / Penn State. All Rights Reserved.

UNIVERSITY PARK, PA. — Beginning Sept. 20, the Research Informatics and Publishing department at Penn State University Libraries will offer a series of four workshops on research reproducibility and data management in R and RStudio. These workshops will introduce participants to implementable practices that support reproducible and open research. Participants will learn how to use R and RStudio to apply data management practices and enable reproducible analysis and documentation workflows.

R is a programming language that allows users to wrangle data sets, conduct statistical analyses, create data visualizations, and develop reproducible and documented analysis workflows. This workshop series will offer hands-on training in fundamental coding skills, data management strategies in R to support research reproducibility, and data visualization. Participants can expect to learn how to wrangle data into an analysis-ready format, use R packages and connections to manage R projects, and create data visualizations using ggplot2, an open-source data visualization package for R.

The workshops are free and open to Penn State graduate students, postdoctoral scholars, faculty and staff. Beginner knowledge of R is recommended for this series, but no previous knowledge is required.

Because the workshops build upon one another, participants are expected to complete each one in sequence. The first workshop, “Introduction to R and RStudio,” is optional but recommended for those who have never used R or RStudio before.

Participants must have access to a computer with a Mac, Linux or Windows operating system and be able to download R, RStudio and Git applications. Registrants will receive instructions on how to access these applications before the workshops begin.

All workshops will be held virtually via Zoom. Access information will be distributed via email following registration. Advance registration is required; registration details and additional information are provided below.

If you need additional information, contact Research Informatics and Publishing at repub@psu.edu.

Register for this workshop series here. Registration is required and will close September 15, 2023, or when capacity is reached, whichever comes first.

Workshop Schedule

Introduction to R and RStudio — Sept. 20, 2–4 p.m.

This session will introduce R and RStudio, walk through the platform interface, and discuss the utility of using the software for reproducible research practices. Participants will learn how to set a working directory, load data and packages, and discover how to find resources to support general learning and answer specific questions.

Data Wrangling in R — Sept. 27, 2–4 p.m.

This session will introduce the use of the package data.table to manage, clean and transform data into “tidy” format and to create new variables in a reproducible manner. Additionally, participants will learn how to handle string and date/time data.

Data Management and Research Reproducibility in R and RStudio — Oct. 4, 2–4 p.m.

This workshop will focus on data management strategies that can be implemented in R and RStudio to develop a reproducible analysis and output workflow, facilitating transparent research and supporting open data sharing.

Data Visualization in R — Oct. 11, 2–4 p.m.

This workshop will provide an overview of how to use the R package ggplot2 to create meaningful data visualizations.

Last Updated August 24, 2023