Intro to R

Tutorial objectives

In this tutorial, you will learn:

  • what R is
  • what R can do
  • some introductory commands
  • what an R packages is
  • some useful R packages

You can run code in this tutorial directly, or you can choose to run it directly in R on your computer. Your choice!

Ready? Let's go!

If you're interested, source code can be found on GitHub

What is R?

R is a statistical computing language. See more information (like the history of R) here. In biology, R is used for data analysis, predictions, machine learning, data visualization, etc.

I find that I learn better from doing (like doing tutorials), but sometimes you don't know what to do! Here are some websites you might want to explore in your free time.

  1. https://education.rstudio.com/learn/beginner/
  2. https://www.rensvandeschoot.com/tutorials/r-for-beginners/
  3. https://www.codecademy.com/learn/learn-r Unfortunately not free :(
  4. https://info.rstudio.com/m000CO20T0IY7SelXxZ0NN0 A talk about how to pick cute side projects to learn R. Please note: I don't recommend you actually take part in the games at the beginning of the talk...just the R stuff :)

What can R do?

R is great with stats, data analysis, and visualization. This means that R is good at math. Let's try using R as a calculator!

R as a calculator

Try out R as a calculator by clicking on "run code".

## You can use + and - 
2+2
15-7
# you use * to multiply and / to divide
5*4
20/2
# R also has functions for other typical math functions
# squre root
sqrt(182736)
# log2
log2(2.9)

Using variables in R

When you're writing code, you'll want to assign certain numbers as variables for use. In R we assign variables using <- and not =. = is reserved for arguments/parameters inside of functions (more on that below).
a <- 2+2
b <- 2*3
a + b 

Test your knowledge

Try writing code where x equals 20 and y equals 50. Then subtract x from y.

x <- 20
y <- 50
y-x

Introduction to functions

In programming languages, a function is a part of a program that performs a task. R itself has a bunch of built-in functions. We can also install and load specific packages that have functions that other people have created (we'll get to packages later).

Some important lingo:

  • function: part of a program that performs a task
  • object: how data is stored in R
  • argument: objects that are given to a function
  • parameter: options in an argument
# Comments (lines prefaced with a #) are ignored by R but are useful for you
# When someone says you should "Comment your code", they mean this
# I always explain what my code is doing so that it's easy to come back to
b <- seq(from = 2, to = 10, by = 2)
# b is the object
# seq is the function
# from, to, by are the parameters
print(b)

Packages and libraries

"Base" R is pretty basic... that is, it doesn't have that many biology specific functions that we often need. The good news this is that researchers often write functions and package them up in an R "package". You can install these packages if you need them.

There are two main package repositories you'll use:

The first time you need to use a CRAN package, you install it using install.packages("PACKAGENAME"). After that, you only need to load it using library(PACKAGENAME). Note: see the quotation marks around the package name. These are important.

Bioconductor packages use a different function for installation. See the exercise below. You will most likely use a lot of Bioconductor packages, like DESeq2. DESeq2 is important for differential expression analysis!

We don't need to install any packages online, but this is how you would do it if you were installing them on your own computer.

## CRAN packages 
# Let's install ggplot2. We will be using this for data visualization later on. 
# You only need to install these packages if you're using them on your own computer
install.packages("ggplot2")
library(ggplot2)

## Bioconductor packages
# This is how you install Bioconductor itself
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.10")
# This is how you install Bioconductor packages
BiocManager::install("DESeq2")
# you load the packages the same way as the CRAN packages
library(DESeq2)

Getting help with functions

Sometimes (ok...often!) I forget how exactly to use a function. Luckily there is built in help within R that walks us through what a function does, how to use it, and includes examples. All you need to do is type ? before your function.

This only works on your own computer, but is here for reference.

?ggplot()
?plot()
?seq()
?c()

Some important functions

Here are some functions that you'll need to use often:

  • getwd()
  • setwd()
  • read.csv()
  • head()
  • tail()
  • summary()
  • write.csv()

Exercise: Let's take 7 mins to explore these functions using Google and we'll discuss them together :)

Writing your own functions

There will come a day in your R journey when you'll want to do something and the function doesnt exist. Or, you want to connect two or more functions together into a single function to make your life easier. When this happens, you'll want to make your own function.

We aren't going to go into too much detail about writing your own functions in this tutorial, but I wanted to introduce it to you.

nameofyourfunction <- function(arguments, and, parameters){
  x <- whateveryouwant(arguments)  
  y <- addanythinghere(and,paramters,x)
 return(y)
}

Test your knowledge

Let's test what we went over so far!

Quiz

After we're all comfortable with R lingo and how R works, we'll move on to our next tutorial.

Caitlin Simopoulos

Last updated on 2024-10-17