14 Class 24. R Markdown

R-Markdown: How to share your code and results with others

Content modified from Software Carpentry, CC-BY.

Data analysts write a lot of reports that describe their analyses and results, either for their collaborators or to document their work for future reference. This is easier now that you can create a web page (as an html file) or a PDF to share your work. A web page can be one long stream, so you can use tall figures that wouldn’t ordinarily fit on one page.

Literate programming

Ideally, such analysis reports are reproducible documents: If an error is discovered, or if some additional subjects are added to the data, you can just re-compile the report and get the new or corrected results (versus having to reconstruct figures, paste them into a Word document, and further hand-edit various detailed results).

The key tool for R is knitr, which allows you to create a document that is a mixture of text and some chunks of code. When the document is processed by knitr, chunks of R code will be executed, and graphs or other results inserted. This sort of idea has been called “literate programming”.

knitr allows you to mix basically any sort of text with any sort of code, but we recommend that you use R Markdown, which mixes Markdown with R. Markdown is a light-weight mark-up language for creating web pages.

Creating an R Markdown file

Within RStudio, click File → New File → R Markdown and you’ll get a dialog box.

You can stick with the default output (HTML output), but give it a title.

Basic components of R Markdown

The initial chunk of text contains instructions for R: you can give the document a title, author, and date, and tell it that you’re going to want to produce html output (in other words, a web page).

---
title: "Initial R Markdown document"
author: "Karl Broman"
date: "April 23, 2015"
output: html_document
---

You can delete any of those fields if you don’t want them included. The double-quotes aren’t strictly necessary in this case. They’re mostly needed if you want to include a colon in the title.
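As a hedged illustration of the quoting rule, here is a header whose title contains a colon. The title text is made up for this example. Without the quotes, the YAML parser would trip on the second colon in that line.

```yaml
---
title: "Class 24: R Markdown"   # quotes needed because the title contains a colon
author: Karl Broman              # no special characters, so quotes are optional
output: html_document
---
```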

RStudio creates the document with some example text to get you started. Note below that there are chunks like

```{r}
summary(cars)
```

These are chunks of R code that will be executed by knitr and replaced by their results. Everything between the lines marked with ``` should be R commands or comments. Try running this summary command in your console to see what you get!

Also note the web address that’s put between angle brackets (< >) as well as the double-asterisks around **Knit**. These marks (symbols) provide information about how that information should be treated. This is the language of Markdown.

Markdown

Markdown is a system for writing web pages by marking up the text much as you would in an email rather than writing html code. The marked-up text gets converted to html, replacing the marks with the proper html code. For now, let’s write a bit of markdown.

You make things bold using two asterisks, like this: **bold**, and you make things italic by using underscores, like this: _italics_.

You can make a bulleted list by writing a list with hyphens or asterisks, like this:

* bold with double-asterisks
* italics with underscores

or like this:

- bold with double-asterisks
- italics with underscores

Each will appear as:

  • bold with double-asterisks
  • italics with underscores

You can make a numbered list by just using numbers. You can use the same number over and over if you want:

1. bold with double-asterisks
1. italics with underscores

This will appear as:

  1. bold with double-asterisks
  2. italics with underscores

You can make section headers of different sizes by initiating a line with some number of # symbols:

# Header 1 (Title)
## Header 2 (Main section)
### Header 3 (Sub-section)
#### Header 4 (Sub-sub section)

You compile the R Markdown document to an html webpage by clicking the “Knit HTML” button in the upper-left. Note the little question mark next to it; click it and you’ll get a “Markdown Quick Reference” (with the Markdown syntax) as well as a link to the RStudio documentation on R Markdown. More guidelines are here: http://rmarkdown.rstudio.com/authoring_basics.html

Making tables

You can create a table in R Markdown by using the | symbol to separate columns. A line of dashes separates the header row from the rows below it.
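The original illustration is not reproduced here, so the following is only a sketch of the pipe-table syntax just described. The headers and cell contents are placeholders.

```markdown
| Header 1 | Header 2 |
|----------|----------|
| cell 1   | cell 2   |
| cell 3   | cell 4   |
```

The first row holds the column headers, the second row of dashes marks the boundary, and each later row is one row of the table.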

A bit more Markdown to add links, images, or sub/super scripts

  • You can add a hyperlink like this: [webpage](http://the-web-page.com)

The rendered output shows only the word “webpage”, which links to the address you supplied.

  • You can include an image file like this: ![caption](http://url/for/file)
  • You can do subscripts (e.g., F~2~) with F~2~ and superscripts (e.g., F^2^) with F^2^.

R code chunks

Markdown is useful, but the real power comes from mixing Markdown with chunks of R code. This is R Markdown. When the document is processed, the R code is executed; if it produces figures, the figures are inserted in the final document. The main code chunks look like this:

```{r load_data}
gapminder <- read.csv("gapminder.csv")
```

That is, you place a chunk of R code between ```{r chunk_name} and ```. The r specifies that we are using the R programming language, and the text after it is a name that you give this code block (it could be anything). It’s a good idea to give each chunk a name: the names help you track down errors and, if any graphs are produced, the file names are based on the name of the code chunk that produced them.

Note: As shown in the example above, you have to load the dataset that you are working with within your markdown document. Within a code chunk, click the “import dataset” button to add the read.csv command and select your file.
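Here is a self-contained sketch of loading data inside a chunk. Because gapminder.csv may not be in your working directory, it first writes a tiny made-up CSV to a temporary file and then reads it back with read.csv. The chunk name, file, and column names are all placeholders.

```{r load_example_data}
# Write a small, made-up data set to a temporary CSV file
example_file <- tempfile(fileext = ".csv")
write.csv(data.frame(country = c("A", "B", "C"),
                     pop = c(100, 250, 75)),
          example_file, row.names = FALSE)

# Read it back, just as you would read your own file
example_data <- read.csv(example_file)
head(example_data)
```

The loading step belongs in the document because knitting runs the code in a fresh R session, so objects that exist only in your console are not available to the chunks.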

Inline code

Code results can also be inserted directly into the text of a .Rmd file by enclosing the code with ` r `. This can be helpful when referring to specific variables. For example, you can include numbers that are derived from the data as code not as numbers. Thus, rather than writing “There are 168 individuals”, insert a bit of code that, when evaluated, gives the number of individuals.

There are `r nrow(my_data)` individuals.

Using inline code means that if you update your dataset this value will be correct.
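To check what an inline expression will produce, you can evaluate the same code in a chunk or in the console. In this sketch, my_data is a stand-in built from R’s built-in cars data set, not the data behind the figure of 168 quoted above; the chunk and variable names are also just examples.

```{r inline_check}
# Stand-in for your own data: the built-in cars data set
my_data <- cars

# The value that the inline expression would insert into the sentence
n_individuals <- nrow(my_data)
n_individuals
```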

Chunk options

There are a variety of options to affect how the code chunks are treated.

  • Use echo=FALSE to avoid having the code itself shown.
  • Use results="hide" to avoid having any results printed.
  • Use eval=FALSE to have the code shown but not evaluated.
  • Use warning=FALSE and message=FALSE to hide any warnings or messages produced.
  • Use fig.height and fig.width to control the size of the figures produced (in inches).

You write these options in the chunk header, after the chunk name, separated by commas:
```{r load_libraries, echo=FALSE, message=FALSE}
library("dplyr")
library("ggplot2")
```
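As a further sketch, the chunk below combines several of these options: it hides the code and sets the figure size. The chunk name and the use of the built-in cars data set are choices made for this example.

```{r cars_plot, echo=FALSE, fig.height=4, fig.width=6}
# echo=FALSE hides this code; only the figure (6 inches wide, 4 inches tall)
# appears in the knitted document
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
```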

Other output options

You can also convert R Markdown to a PDF or a Word document. Click the little triangle next to the “Knit HTML” button to get a drop-down menu. Or you could put pdf_document or word_document in the output field of the header of the file.
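For example, a header requesting PDF output might look like the sketch below (the title is a placeholder). Producing a PDF typically also requires a LaTeX installation on your machine.

```yaml
---
title: "My report"
output: pdf_document   # or word_document for a Word file
---
```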

License


BIOL446/BIOL546 Bioinformatics Coding Guides Copyright © by emilymeredith is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
