DO NOT FORK THIS REPO
Ewing sarcoma is a pediatric bone cancer which arises from the fusion of the EWSR1 and FLI1 genes ("EWSR1-FLI1" fusion oncogene).
Recently, therapies that suppress EWSR1-FLI1 have been proposed for Ewing sarcoma. However, it is not yet clear how suppressing EWSR1-FLI1 impacts the transcriptomic state of Ewing sarcoma tumor cells.
You have been provided with a Ewing sarcoma RNA-seq data set (EwS.rds). The data is in RangedSummarizedExperiment format. The metadata includes a column condition with two levels: (1) shEF1 (EWSR1-FLI1 knock-down) and (2) shCTR (control). There are 3 shCTR samples and 4 shEF1 samples.
> rse <- readRDS("EwS.rds")
> rse$condition
[1] "shCTR" "shCTR" "shCTR" "shEF1" "shEF1" "shEF1" "shEF1"
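Before running any analysis, it can help to look at what the object contains. The following is a minimal sketch, assuming the SummarizedExperiment package is installed and that EwS.rds is in the working directory; it only inspects the object and does not change it.

```r
library(SummarizedExperiment)

rse <- readRDS("EwS.rds")

dim(rse)              # number of genes (rows) x number of samples (columns)
assayNames(rse)       # names of the stored assays
colData(rse)          # sample-level metadata, including the condition column
table(rse$condition)  # sample counts per condition (3 shCTR and 4 shEF1 per the description above)
```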
Your task is to perform a differential gene expression (DGE) analysis to determine which genes and biological pathways / processes are altered in EWSR1-FLI1 knock-down (shEF1) vs control (shCTR).
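One possible way to set up this comparison in DESeq2 is sketched below. It is an illustration, not the expected solution. It assumes that the first assay of the object holds raw integer counts, and it sets shCTR as the reference level so that a positive log2 fold change means higher expression in the knock-down.

```r
library(DESeq2)

rse <- readRDS("EwS.rds")

# Make the control the reference level so that positive log2 fold changes
# mean higher expression in shEF1 than in shCTR.
rse$condition <- factor(rse$condition, levels = c("shCTR", "shEF1"))

# Assumption: the first assay of `rse` contains raw (un-normalized) counts.
dds <- DESeqDataSet(rse, design = ~ condition)
dds <- DESeq(dds)

# Explicit contrast: knock-down (numerator) vs control (denominator).
res <- results(dds, contrast = c("condition", "shEF1", "shCTR"))
summary(res)
```

Stating the contrast explicitly avoids relying on the default ordering of factor levels, which is alphabetical for character vectors.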
To complete this task, you should perform the DGE analysis using DESeq2 and generate these tables and figures:
- PCA summarizing the sample-level variance within the data set.
- MA Plot showing the relationship between mean count and log2 fold change.
- Table listing the differentially-expressed genes (DEGs) saved in CSV format.
- Volcano plot showing all DGE results.
- Heatmap showing the top 10 over- and under-expressed DEGs.
- Enrichment analysis showing the top over- and under-expressed KEGG pathways.
  - Write the results to a CSV file
  - Create a figure to summarize the results
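As a rough starting point, the first four outputs in the list above could be produced along the following lines. This is a sketch under several assumptions: the significance cutoffs (adjusted p-value below 0.05 and absolute log2 fold change above 1) and the output file name DEGs.csv are illustrative choices, not requirements of the task, and the volcano plot uses base R graphics rather than any particular plotting package. The heatmap and KEGG enrichment steps are not covered here.

```r
library(DESeq2)

rse <- readRDS("EwS.rds")
rse$condition <- factor(rse$condition, levels = c("shCTR", "shEF1"))

# Assumption: the first assay of `rse` contains raw counts.
dds <- DESeq(DESeqDataSet(rse, design = ~ condition))
res <- results(dds, contrast = c("condition", "shEF1", "shCTR"))

# PCA on variance-stabilized counts, colored by condition.
vsd <- vst(dds, blind = TRUE)
print(plotPCA(vsd, intgroup = "condition"))

# MA plot: mean of normalized counts vs log2 fold change.
plotMA(res, alpha = 0.05)

# Table of differentially expressed genes.
# Illustrative cutoffs: padj < 0.05 and |log2FoldChange| > 1.
res_df <- as.data.frame(res)
res_df$gene <- rownames(res_df)
is_deg <- !is.na(res_df$padj) &
  res_df$padj < 0.05 &
  abs(res_df$log2FoldChange) > 1
degs <- res_df[is_deg, ]
degs <- degs[order(degs$padj), ]
write.csv(degs, "DEGs.csv", row.names = FALSE)  # hypothetical file name

# Volcano plot of all results, with DEGs highlighted.
plot(
  res_df$log2FoldChange,
  -log10(res_df$pvalue),
  pch = 20,
  col = ifelse(is_deg, "firebrick", "grey60"),
  xlab = "log2 fold change (shEF1 vs shCTR)",
  ylab = "-log10(p-value)"
)
```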
You can also send your work-in-progress to Henry for feedback at any time. Without giving you the answers, he will indicate whether you are on the right track and offer suggestions for improvement. Asking for help is always encouraged and will not hurt your standing.
A minimal workflow for completing this task would involve the following steps:
- Clone this repo (DO NOT FORK THIS REPO)
git clone https://github.com/Bishop-Laboratory/RA-Eval.git
- Open the R project file (RA-Eval.Rproj) in RStudio
- Set up the development environment using renv (optional)
renv::restore()
- Complete the task in an R script in this repo.
- Create the figures / tables in the repo.
- Commit the results:
git add .
git commit -m "<some_informative_commit_message>"
- Create a repo on your personal github account called "RA-Eval".
- Add your remote repo to your local repo as upstream (replace <your_github_username> with your GitHub username)
git remote add upstream https://github.com/<your_github_username>/RA-Eval.git
- Push your code to the remote repo on GitHub
git push -u upstream main
- Send Henry the link to your GitHub repo once you are done.
Here are the R packages that are likely to be useful in completing this task:
For convenience, we provide these packages in an R environment accompanying this repo. To set up the R environment, first install renv:
install.packages("renv")
Then "restore" the R environment (this installs all packages for the project):
renv::restore()
This task is designed to test your ability to perform a basic DGE analysis, not your ability to use git, GitHub, renv, RStudio, etc. Therefore, if you are having difficulty setting things up or with any aspect of submitting your results, just let Henry know and he will assist you.
If you are struggling with the DGE analysis, you are strongly encouraged to read and follow the DESeq2 Vignette. Here are some additional free learning resources which you may find valuable:
- BIG Bioinformatics workshop on R and RNA-Seq: link
- Harvard training on RNA-Seq (with DESeq2): link
- Griffith Lab Training on DESeq2: link
If there are any instructions which are confusing or any questions / clarifications you want to raise, please just reach out to Henry and he will assist you. Asking for help is always encouraged and will not hurt your standing.