Introduction
Key terms: quantitative research, data life cycle (acquire, clean, use/reuse, publish, preserve/destroy), FAIR principles, R (analytic software), introductory course, relevant for students in any PhD phase.
ECTS: 2.5
Number of sessions: 4
Hours per session: 4
This course introduces participants to R programming as a powerful tool for working with data across its full life cycle. Rather than teaching R in isolation, the course is built around the principles of data literacy, helping participants understand not only how to code, but also how to manage, analyse, and share data according to the FAIR principles (Findable, Accessible, Interoperable, and Reusable).
It is particularly suited for participants who are new to R or seeking a structured, conceptually grounded entry into data analysis.
Participants learn to navigate the five stages of the data life cycle: acquire, clean, use/reuse, publish, and preserve/destroy. Each life cycle step is linked to practical R skills and broader reflections on scientific integrity, ethics, and sustainability. The course highlights not only what to do with data, but also why it matters.
The course consists of four 4-hour on-site sessions, each requiring about 2 hours of preparation. We use a flipped classroom model: short pre-recorded videos (5-10 minutes), self-guided tutorials, and selected readings are completed before class, so face-to-face time can focus on hands-on problem-solving, Q&A, and interactive discussion.
Entry level and relevance
This course does not require knowledge of advanced statistics or prior coding experience with R or any other programming language. It is designed as an introductory course, making it accessible to students with little or no programming background. However, a basic familiarity with research data, such as having worked with spreadsheets or simple datasets, will be helpful.
This course is relevant for PhD students in all stages of their research, from early planning to later analysis and publication. It is particularly useful for those handling data themselves or aiming to improve the transparency and reproducibility of their research workflow.
This course is especially suited for students in the social sciences, humanities, and life sciences who engage in quantitative or mixed-methods research and want to develop data skills using R and the FAIR principles. Researchers working with structured data in any discipline will benefit.
Relations with other courses
This course complements other quantitative research courses offered by EGSH by providing a foundational introduction to R and ethical data practices.
While the course is rooted in open science principles and FAIR data practices, it does not replace the EGSH course Open science and research transparency, which focusses on the broad guiding principles and fundamentals of open science. The latter course is mandatory for all our PhD candidates.
Further, by preparing participants to work confidently with data in R, this course functions as an introduction to more advanced EGSH courses that focus on specific statistical techniques and analyses, such as Multilevel modelling or Structural equation modelling.
Students new to programming or data handling are encouraged to take this course before enrolling in the more specialised courses, as it will equip them with essential R skills and data literacy. Conversely, students who already have experience with R and are focused on modelling techniques may find the course partly overlapping but still useful for reinforcing ethical and reproducible research practices. Together, these courses form a progressive learning path from basic data work to advanced statistical modelling.
Key Facts & Figures
- Type
- Course
- Instruction language
- English
- Mode of instruction
- On-site
Start dates for: Data literacy through R: Managing and analysing data responsibly
Edition 1
Session 1: February 4 (Wednesday) 2026 | 13.00-17.00 hrs | Offline (Langeveld building, room 1.04)
Session 2: February 11 (Wednesday) 2026 | 13.00-17.00 hrs | Offline (Langeveld building, room 1.04)
Session 3: February 25 (Wednesday) 2026 | 13.00-17.00 hrs | Offline (Langeveld building, room 1.19)
Session 4: March 4 (Wednesday) 2026 | 13.00-17.00 hrs | Offline (Langeveld building, room 3.19)
What will you achieve?
- After this course, you will understand the key stages of the data life cycle and how they relate to your practical data work.
- After this course, you will have basic proficiency in using R for data acquisition, cleaning, analysis, and visualisation.
- After this course, you will know how to structure and document data workflows in a transparent and reproducible way.
- After this course, you will understand best practices for data management, including considerations for sharing, reuse, and preservation.
- After this course, you will have improved your data literacy skills, enabling you to critically assess data quality and ethical implications.
- After this course, you will be able to apply R programming skills to real-world tasks relevant to your academic discipline.
- After this course, you will be able to apply open science practices in your work.
Sessions and preparations
Session 1: Introduction to R and the data life cycle
We introduce the course, R, RStudio, and the data life cycle as a guiding framework. Participants will learn how to import and explore datasets in R and reflect on the kind of data they work with. Time is reserved for loading and inspecting either a provided dataset or their own. Participants are also introduced to collaborating with Git and GitHub.
Preparations: Watch short videos on RStudio and the data life cycle. Read the short handout on data types and formats. Read the data management plan.
Session 2: Cleaning and preparing data
This session focuses on using the tidyverse framework to clean and transform data. Participants practise reshaping data, handling missing values, and preparing data for analysis.
Preparations: Watch short videos on cleaning data. Skim the tidyverse cheatsheet provided in the syllabus.
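As a rough illustration of the kind of exercise this session covers, the sketch below reshapes a small table from wide to long format and drops missing values. It is not taken from the course materials: the data frame `scores_wide` and its column names are made up for this example, and it assumes the dplyr and tidyr packages are installed.

```r
# Illustrative sketch only: the data and column names are hypothetical.
library(dplyr)
library(tidyr)

# A small "wide" dataset: one row per participant, one column per wave
scores_wide <- data.frame(
  id       = 1:3,
  score_t1 = c(10, NA, 14),
  score_t2 = c(12, 9, NA)
)

# Reshape to "long" format (one row per participant and wave),
# then remove the rows where the score is missing
scores_long <- scores_wide |>
  pivot_longer(
    cols         = starts_with("score_"),
    names_to     = "wave",
    names_prefix = "score_",
    values_to    = "score"
  ) |>
  filter(!is.na(score))

scores_long
```

Whether dropping missing values is appropriate depends on the research question; in practice one might instead flag or impute them.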
Session 3: Analysing and visualising data
We introduce basic analysis techniques, such as descriptive statistics and simple regression models using lm(), and provide space to explore other techniques that align with participants’ needs and interests. Participants will also visualise key results with ggplot2. Emphasis is placed on interpretation, not on statistical theory.
Preparations: Watch videos on visualisation in R. Read the short guide on interpreting linear models.
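To give an impression of what descriptive statistics, a simple regression with lm(), and a ggplot2 figure can look like together, here is a minimal sketch. It is an assumption-laden example rather than course material: it uses the `mtcars` dataset that ships with R as a stand-in for participants' own data, and it assumes the ggplot2 package is installed.

```r
# Illustrative sketch only: mtcars stands in for a participant's own dataset.
library(ggplot2)

# Descriptive statistics for the outcome (mpg) and the predictor (wt)
summary(mtcars[, c("mpg", "wt")])

# Simple linear regression: fuel efficiency explained by car weight
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)

# Scatter plot with the fitted regression line and its confidence band
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
print(p)
```

The coefficient for `wt` in `summary(fit)` is read as the expected change in miles per gallon for each additional 1000 lbs of weight, which is the kind of interpretation the session emphasises.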
Session 4: Reproducibility, FAIR data, and reporting
We cover documentation and reporting with Quarto, including how to embed regression output and plots in a reproducible report. We close with a discussion of the FAIR principles, responsible data sharing, and data archiving.
Preparations: Watch a video. Read the FAIR principles one-pager and the sample annotated Quarto report.
Instructor
- Frederick Thielen is assistant professor at the Erasmus School of Health Policy & Management (ESHPM), Erasmus University Rotterdam, and senior scientific researcher at the Trimbos Institute in Utrecht. His research focuses on Health Technology Assessment (HTA), with an emphasis on cancer care, mental health, and environmental sustainability. He regularly works with R for health economic modelling, cost-effectiveness analysis, and reproducible research. Frederick is also international programme director of the European Master in Health Economics and Management (Eu-HEM), where he supports curriculum development and interdisciplinary collaboration.
- Stijn Peeters is a PhD candidate at the Erasmus School of Health Policy & Management (ESHPM). In his work, he relies on R for a wide range of tasks, including statistical modelling, developing presentations and reports with Quarto, and building interactive dashboards using Shiny. He has also co-developed several R packages together with Frederick, aimed at making data analysis more efficient and accessible.
Contact
- Enrolment-related questions: enrolment@egsh.eur.nl
- Course-related questions: thielen@eshpm.eur.nl or s.b.peeters@eshpm.eur.nl
- Telephone: +31 (0)10 4082607
Facts & Figures
- Fee
- free for PhD candidates of the Graduate School
- € 575,- for non-members
- consult our enrolment policy for more information
- Tax
- Not applicable
- Offered by
- Erasmus Graduate School of Social Sciences and the Humanities