The raw R Markdown file that I used to generate this HTML file is at https://timotheenivalis.github.io/workshops/RforRSB/rmarkdown_notes.Rmd and the figures I used are at https://timotheenivalis.github.io/workshops/RforRSB/Figures.zip .

R-Markdown part of a reproducible workshop

Science is hard enough: do yourself a favor and make it reproducible.

R Markdown combines the R and Markdown languages using knitr, and is easiest to use in RStudio. R Markdown lets you create a broad range of documents: reports, journal articles, books, (interactive) webpages, blogs, slides.

The ability to create documents with bits of R-code and outputs makes your life easier and promotes reproducible science.

Today we will focus on writing reports. In RStudio, start with File > New File > R Markdown, then choose Document and HTML. RStudio generates an example. You can see that the file has three components:

  • An (optional) YAML header delimited by two lines of three dashes (---). This sets the configuration and layout.
  • R code chunks delimited by lines of three backticks
  • Text mixed with simple text formatting

Compile by clicking “Knit” or pressing Ctrl+Shift+K.
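The three components and the Knit step can also be reproduced from the R console. The sketch below is a minimal illustration, not the file RStudio generates: the file name, title, and chunk contents are made up, and the last step assumes the rmarkdown package and pandoc are available (it is skipped otherwise).

```r
# Build a tiny .Rmd file containing the three components described above.
# File name, title, and contents are hypothetical, chosen for illustration.
fence <- strrep("`", 3)  # three backticks delimit a code chunk

rmd_lines <- c(
  "---",                          # YAML header: configuration and layout
  "title: \"My first report\"",
  "output: html_document",
  "---",
  "",
  "# Introduction",               # text with simple Markdown formatting
  "",
  "Some *formatted* text.",
  "",
  paste0(fence, "{r}"),           # R code chunk
  "summary(cars)",
  fence
)

rmd_file <- file.path(tempdir(), "first_report.Rmd")
writeLines(rmd_lines, rmd_file)

# Console equivalent of clicking "Knit"; render() returns the output file path.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, quiet = TRUE)
}
```

Rendering from the console is convenient in scripts that rebuild several reports at once; for interactive work the Knit button does the same job.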

Text: Basics of Markdown syntax

Plain text is just text.

End a line with two spaces
to start a new line, or leave a blank line to start a new paragraph.

Use # to create headers. # for main title, ## for section header, ### for subsection… and so on.

Text between single * or _ is set in italics: *italics* or _italics_ renders as italics.

Text between double ** or __ is set in bold: **bold** or __bold__ renders as bold.

Mix * and _ for both: _**both bold and italics**_ renders as both bold and italics.

Unordered lists start with *, + or - followed by a space. Hierarchy is controlled by indenting with two tabs.

Special characters must be escaped with a backslash: \* \_ \\ render as * _ \

Write math with LaTeX syntax. An inline equation goes between single $ signs: $A = \pi r^2$ renders as \(A = \pi r^2\)

Equation block between $$: \[A = \pi r^2\]

> Block quote
> second line

Inline verbatim code within backticks (`): lm(x ~ y)

Lists

  • unordered list
    • sub-item 1
    • sub-item 2
    • sub-item 3
    • sub-item 4
      • sub-sub-item

Ordered lists start with 1., i) or A). Hierarchy is controlled by indenting with two tabs:

  1. ordered main 1
  2. ordered main 2
    1. sub 1
    2. sub 2
      1. sub-sub item 1
      2. sub-sub item 2
  3. ordered main 3

Or create numbering automatically with (@) to allow breaks:

  1. list that can be interrupted item 1
  2. list that can be interrupted item 2

Interruption

  1. list that can be interrupted item 3

R chunks

Insert a chunk by clicking “Insert R”, by pressing Ctrl+Alt+I, or by typing a line of 3 backticks followed by curly braces to open it and a line of 3 backticks to close it.

x <- rnorm(1000)
plot(x)

If you want your code to be run, make the language explicit inside the opening curly braces. By default it is r.

x <- rnorm(1000)
plot(x)

Turn the following code into an .Rmd document and compile it:

x1 <- rnorm(200)
x2 <- x1 + rnorm(200)
y <- 1 + x1 + rnorm(200)
summary(lm(y ~ x2))
plot(x2, y)

Control chunk behaviour

To show code that does not work, or that would take a long time to run, use the chunk option eval=FALSE: the code is displayed but not run.

To hide the code while keeping its output, use the chunk option echo=FALSE.

  • collapse= TRUE/FALSE ; combine code and output?
  • warning / message / error = TRUE/FALSE ; show what R wants to tell you?
  • include = TRUE/FALSE ; show anything from the chunk in the document?
  • fig.width / fig.height ; figure dimensions in inches
  • fig.cap ; figure caption
  • dev = 'pdf' / 'png' / 'svg' / 'jpeg' / 'tikz' / … ; how to create images?
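Chunk options go in the chunk header after the language name, separated by commas. To avoid repeating the same options in every chunk, defaults can be set once with knitr::opts_chunk$set(), typically in a first chunk that uses include=FALSE. A minimal sketch, assuming knitr is installed; the option values and chunk labels are arbitrary:

```r
# Per-chunk options are written in the chunk header, for example:
#   {r slow-model, eval=FALSE}                 -> show the code, do not run it
#   {r scatter, echo=FALSE, fig.width=6}       -> hide the code, keep the figure
# ("slow-model" and "scatter" are hypothetical chunk labels.)

# Document-wide defaults; individual chunk headers can still override them.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(
    echo = FALSE,      # hide code by default
    warning = FALSE,   # do not print warnings
    message = FALSE,   # do not print messages
    fig.width = 6,     # figure width in inches
    fig.height = 4     # figure height in inches
  )
  default_echo <- knitr::opts_chunk$get("echo")
}
```

Setting defaults this way keeps chunk headers short: only the chunks that differ from the defaults need their own options.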

Inline r-code or r-output

` 1 + pi `

Inline code 1 + pi.

`r 1 + pi `

Inline code 4.1415927.
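Inline results are printed with many digits, as in the 4.1415927 above. Formatting the number inside the inline expression keeps sentences readable. A small sketch of expressions that could follow the r in inline code (the variable names are only for illustration):

```r
value <- 1 + pi

# Round to two decimals; in the text this would be written as `r round(value, 2)`
rounded <- round(value, 2)

# sprintf() returns a string with a fixed number of decimals and keeps trailing zeros
two_decimals <- sprintf("%.2f", value)   # "4.14"
padded <- sprintf("%.2f", 4.1)           # "4.10"
```

round() is enough for most sentences; sprintf() helps when several numbers should line up with the same number of decimals.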

A little bit of YAML (pronounced “Ya-mel”, like “Camel”)

Warning: YAML is very sensitive to spaces/tabs!

The header starts and ends with a line of 3 dashes. In between goes basic information:

  • title: "XX"
  • author: "XX"
  • date: "XX"
  • output: html_document / word_document / pdf_document

Try word output, and pdf output if you have a LaTeX distribution.

Some useful options with html

Add a table of contents (floating or fixed). Pay attention to indentation: YAML nesting is expressed with spaces, and tab characters are not allowed.

output:
  html_document:
    toc: true
    toc_float: true

Section numbering.

output:
  html_document:
    number_sections: true

Layout: theme and highlight.

theme: default, cerulean, journal, flatly, darkly, readable, spacelab, united, cosmo, lumen, paper, sandstone, simplex, and yeti. Pass null for no theme (in this case you can use the css parameter to add your own styles)

highlight: default, tango, pygments, kate, monochrome, espresso, zenburn, haddock, textmate and null

For instance:

output:
  html_document:
    theme: united
    highlight: tango

More Markdown syntax

Insert pictures

![caption](Figures/rmarkdown.jpg)

or, if you want more control, use a chunk with options, for instance: r, fig.cap="R Markdown logo", fig.width=6

knitr::include_graphics("Figures/rmarkdown.jpg")
[Figure: R Markdown logo]

Insert tables

You can create them manually but probably do not want to:

| Right | Left | Default | Center |
|------:|:-----|---------|:------:|
|    12 | 12   | 12      |   12   |
|   123 | 123  | 123     |  123   |
|     1 | 1    | 1       |   1    |

However, knitr makes printing tables easy with kable().

data(cars)
head(cars)
##   speed dist
## 1     4    2
## 2     4   10
## 3     7    4
## 4     7   22
## 5     8   16
## 6     9   10
knitr::kable(x = head(cars),
     caption = "A knitr kable table", align = c("r","c"), 
     row.names = TRUE)
A knitr kable table

|   | speed | dist |
|:--|------:|:----:|
| 1 |     4 |  2   |
| 2 |     4 |  10  |
| 3 |     7 |  4   |
| 4 |     7 |  22  |
| 5 |     8 |  16  |
| 6 |     9 |  10  |

Insert HTML tabs with {.tabset}

Linear regression

Simulations

x1 <- rnorm(100)
x2 <- rnorm(100) + 2*x1
y <- x1 - 0.5*x2 + rnorm(100)

Simple

A simple regression measures total associations.

summary(lm(y ~ x2))
## 
## Call:
## lm(formula = y ~ x2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.77548 -0.92216 -0.04407  0.79261  2.33776 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.03833    0.11220   0.342    0.733
## x2          -0.05149    0.05696  -0.904    0.368
## 
## Residual standard error: 1.122 on 98 degrees of freedom
## Multiple R-squared:  0.00827,    Adjusted R-squared:  -0.00185 
## F-statistic: 0.8172 on 1 and 98 DF,  p-value: 0.3682

Multiple

A multiple regression measures direct associations, corrected for indirect associations.

summary(lm(y ~ x1+x2))
## 
## Call:
## lm(formula = y ~ x1 + x2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.08115 -0.80395 -0.00654  0.80898  1.95661 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.01036    0.10254   0.101     0.92    
## x1           0.97029    0.21296   4.556 1.52e-05 ***
## x2          -0.40897    0.09411  -4.346 3.42e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.023 on 97 degrees of freedom
## Multiple R-squared:  0.1831, Adjusted R-squared:  0.1662 
## F-statistic: 10.87 on 2 and 97 DF,  p-value: 5.501e-05
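The tabs in the linear regression example come from a header attribute: adding {.tabset} to a header turns the headers one level below it into tabs. The sketch below writes the skeleton of such a section to a temporary file. The header levels and the file name are assumptions, since the rendered page does not show them.

```r
fence <- strrep("`", 3)  # three backticks delimit a code chunk

tabset_lines <- c(
  "## Linear regression {.tabset}",   # parent header carrying the attribute
  "",
  "### Simulations",                  # each sub-header becomes one tab
  "",
  paste0(fence, "{r}"),
  "x1 <- rnorm(100)",
  "x2 <- rnorm(100) + 2 * x1",
  "y <- x1 - 0.5 * x2 + rnorm(100)",
  fence,
  "",
  "### Simple",
  "",
  paste0(fence, "{r}"),
  "summary(lm(y ~ x2))",
  fence,
  "",
  "### Multiple",
  "",
  paste0(fence, "{r}"),
  "summary(lm(y ~ x1 + x2))",
  fence
)

# Hypothetical file name; paste the lines into any .Rmd with html_document output.
tabset_file <- file.path(tempdir(), "tabset_section.Rmd")
writeLines(tabset_lines, tabset_file)
```

Tabs only apply to HTML output; in Word or PDF the same headers are rendered as ordinary subsections.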

Final exercise