R for Reproducible Scientific Analysis

An introduction to R for non-programmers using gapminder data

The goal of this lesson is to teach novice programmers to write modular code and to follow best practices when using R for data analysis. R is commonly used in many scientific disciplines for statistical analysis and is valued for its array of third-party packages.

Note that this workshop will focus on teaching the fundamentals of the programming language R, and will not teach statistical analysis.

A variety of third-party packages are used throughout this workshop. These are not necessarily the best, nor are they comprehensive, but they are packages we find useful, and they have been chosen primarily for their usability.

You can find the two-day version of this course at http://swcarpentry.github.io/r-novice-gapminder/.


  1. Understand that computers store data and instructions (programs, scripts etc.) in files.
  2. Understand that files are organised in directories (folders).
  3. Know how to access files not in the working directory by specifying the path.
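
As a rough illustration of the third point, here is a minimal sketch in R of reading a file that is not directly in the working directory. The directory name `example-project` and the file `example.csv` are hypothetical, created in a temporary location only for this illustration; they are not part of the lesson files.

```r
# Build a throwaway directory tree so the example does not touch real files.
# "example-project" and "example.csv" are hypothetical names for illustration.
project_dir <- file.path(tempdir(), "example-project")
data_dir <- file.path(project_dir, "data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

# Write a tiny CSV file into the data/ subdirectory.
csv_path <- file.path(data_dir, "example.csv")
write.csv(data.frame(x = 1:3), csv_path, row.names = FALSE)

# Make project_dir the working directory, remembering the previous one.
old_wd <- setwd(project_dir)

# Relative path: interpreted starting from the current working directory.
from_relative <- read.csv(file.path("data", "example.csv"))

# Absolute path: points at the same file regardless of the working directory.
from_absolute <- read.csv(csv_path)

# Restore the original working directory.
setwd(old_wd)
```

Both calls read the same file. The relative path only works while the working directory is `project_dir`, whereas the absolute path works from anywhere on the same machine.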


Setup   Download files required for the lesson

00:00   1. Introduction to R and RStudio
          How to find your way around RStudio?
          How to interact with R?
          How to manage your environment?
          How to install packages?

00:55   2. Seeking Help
          How can I get help in R?

01:15   3. Data Structures
          How can I read data in R?
          What are the basic data types in R?
          How do I represent categorical information in R?

02:10   4. Exploring Data Frames
          How can I manipulate a data frame?

02:40   5. Subsetting Data
          How can I work with subsets of data in R?

03:30   6. Creating Publication-Quality Graphics with ggplot2
          How can I create publication-quality graphics in R?

04:50   7. Vectorization
          How can I operate on all the elements of a vector at once?

05:15   8. Functions Explained
          How can I write a new function in R?

06:15   9. Writing Data
          How can I save plots and data created in R?

06:35   Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.