In this tutorial, you will learn:
You can run the code right here in this tutorial, or you can run it in R on your own computer. Your choice!
Ready? Let's go!
If you're interested, the source code can be found on GitHub.
R is a statistical computing language. See more information (like the history of R) here. In biology, R is used for data analysis, predictions, machine learning, data visualization, etc.
I find that I learn better from doing (like doing tutorials), but sometimes you don't know what to do! Here are some websites you might want to explore in your free time.
R is great with stats, data analysis, and visualization. This means that R is good at math. Let's try using R as a calculator!
Try out R as a calculator by clicking on "run code".
## You can use + and -
2+2
15-7
# you use * to multiply and / to divide
5*4
20/2
# R also has functions for other typical math functions
# square root
sqrt(182736)
# log2
log2(2.9)
To save a value in an object, use <- and not =. By convention, = is kept for arguments/parameters inside of functions (more on that below).
a <- 2+2
b <- 2*3
a + b
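To make the difference between <- and = a little more concrete, here is a small sketch. The object names total and rounded are made up for this illustration; round() and its digits argument are part of base R.

```r
# `<-` stores a value in an object you can reuse later
total <- 2 + 2

# `=` inside the parentheses names an argument of the function;
# here it tells round() how many decimal places to keep
rounded <- round(3.14159, digits = 2)

total    # 4
rounded  # 3.14
```

Notice that digits = 2 does not create an object called digits. It only passes the value 2 to round().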
Try writing code where x equals 20 and y equals 50. Then subtract x from y.
x <- 20
y <- 50
y-x
In programming languages, a function is a part of a program that performs a task. R itself has a bunch of built-in functions. We can also install and load specific packages that have functions that other people have created (we'll get to packages later).
Some important lingo:
# Comments (lines prefaced with a #) are ignored by R but are useful for you
# When someone says you should "Comment your code", they mean this
# I always explain what my code is doing so that it's easy to come back to
b <- seq(from = 2, to = 10, by = 2)
# b is the object
# seq is the function
# from, to, by are the parameters
print(b)
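As a quick sketch of how parameters work, the same seq() call can be written with named arguments in any order, or with unnamed arguments in the order the function expects. The object names by_name and by_position are made up for this example.

```r
# Named arguments: the order does not matter
by_name <- seq(by = 2, to = 10, from = 2)

# Unnamed arguments: R matches them by position (from, to, by)
by_position <- seq(2, 10, 2)

print(by_name)                   # 2 4 6 8 10
identical(by_name, by_position)  # TRUE
```

Naming your arguments takes a little more typing, but it makes your code easier to read when you come back to it.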
"Base" R is pretty basic... that is, it doesn't have that many biology specific functions that we often need. The good news this is that researchers often write functions and package them up in an R "package". You can install these packages if you need them.
There are two main package repositories you'll use: CRAN and Bioconductor.
The first time you need to use a CRAN package, you install it using install.packages("PACKAGENAME"). After that, you only need to load it using library(PACKAGENAME) at the start of each R session. Note the quotation marks around the package name in install.packages(): they are required there.
Bioconductor packages use a different function for installation. See the exercise below. You will most likely use a lot of Bioconductor packages, like DESeq2. DESeq2 is important for differential expression analysis!
We don't need to install any packages in this online tutorial, but this is how you would do it on your own computer.
## CRAN packages
# Let's install ggplot2. We will be using this for data visualization later on.
# You only need to install these packages if you're using them on your own computer
install.packages("ggplot2")
library(ggplot2)
## Bioconductor packages
# This is how you install Bioconductor itself
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
# Use the Bioconductor version that matches your R version (3.10 pairs with R 3.6)
BiocManager::install(version = "3.10")
# This is how you install Bioconductor packages
BiocManager::install("DESeq2")
# you load the packages the same way as the CRAN packages
library(DESeq2)
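If you rerun a script often, you may not want to reinstall a package every time. Here is a minimal sketch of one way to install a CRAN package only when it is missing. The helper name install_if_missing is made up for this example; requireNamespace() and install.packages() are the same functions used above.

```r
# Hypothetical helper: install a CRAN package only if it is not already installed
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package is available now
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so nothing is downloaded here
install_if_missing("stats")
```

On your own computer you could call install_if_missing("ggplot2") and then library(ggplot2). This sketch only covers CRAN packages; Bioconductor packages still go through BiocManager::install().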
Sometimes (ok...often!) I forget exactly how to use a function. Luckily, there is built-in help within R that walks us through what a function does and how to use it, and includes examples. All you need to do is type ? before your function.
This only works on your own computer, but is here for reference.
?ggplot()
?plot()
?seq()
?c()
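There are a few other ways to get help from base R. This is a small sketch. The lines that open a help page or run examples are commented out because they work best on your own computer, and the object name round_args is made up for this example.

```r
# help() is the long form of ?; both lines open the same help page
# help("seq")
# ?seq

# args() shows a function's arguments right in the console
round_args <- args(round)
print(round_args)

# example() runs the examples from a function's help page
# example(seq)
```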
Here are some functions that you'll need to use often:
getwd()
setwd()
read.csv()
head()
tail()
summary()
write.csv()
Exercise: Let's take 7 mins to explore these functions using Google and we'll discuss them together :)
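Once you've had a go at the exercise, here is one possible sketch of how most of these functions fit together. The data frame, the object names samples, csv_path and loaded, and the file name samples.csv are all made up for illustration. The file is written to a temporary folder so nothing on your computer gets overwritten.

```r
# Made-up example data: three samples and a count for each
samples <- data.frame(
  sample = c("A", "B", "C"),
  count  = c(10, 25, 40)
)

getwd()  # shows the folder R is currently working in

# Write the data to a CSV file in a temporary folder
csv_path <- file.path(tempdir(), "samples.csv")
write.csv(samples, csv_path, row.names = FALSE)

# Read the file back in and take a look
loaded <- read.csv(csv_path)
head(loaded, n = 2)  # first 2 rows
tail(loaded, n = 1)  # last row
summary(loaded)      # quick summary of each column
```

setwd() is left out on purpose: it changes your working folder, so try it on your own computer with a folder path that exists there.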
There will come a day in your R journey when you'll want to do something and the function doesn't exist. Or, you'll want to connect two or more functions together into a single function to make your life easier. When this happens, you'll want to make your own function.
We aren't going to go into too much detail about writing your own functions in this tutorial, but I wanted to introduce it to you.
# Template only: whateveryouwant() and addanythinghere() are placeholders, not real functions
nameofyourfunction <- function(arguments, and, parameters){
  x <- whateveryouwant(arguments)
  y <- addanythinghere(and, parameters, x)
  return(y)
}
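Here is a small working sketch that follows the same pattern. The function name log2_fold_change and its arguments treated and control are made up for this example. It simply chains a division and the log2() function we used earlier.

```r
# Hypothetical function: log2 fold change between two measurements
log2_fold_change <- function(treated, control) {
  ratio <- treated / control  # step 1: how many times larger is `treated`?
  result <- log2(ratio)       # step 2: put that ratio on a log2 scale
  return(result)
}

log2_fold_change(treated = 8, control = 2)  # 2, because 8 / 2 = 4 and log2(4) = 2
```

Once the function is defined, you can call it as many times as you like with different values, just like a built-in function.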
Let's test what we went over so far!
After we're all comfortable with R lingo and how R works, we'll move on to our next tutorial.