intro-to-rnaseq-with-galaxy

Read Alignment

SAM format

STAR produces an alignment file in Sequence Alignment/Map (SAM) format or in BAM, its compressed binary equivalent.
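To make the SAM layout concrete, here is a minimal Python sketch that splits one alignment line into the 11 mandatory tab-separated fields defined by the SAM specification and decodes two FLAG bits. The record itself (read name, position, sequence) is made up for illustration, and the `SamRecord` class and helper names are hypothetical, not part of any library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SamRecord:
    """The 11 mandatory SAM fields, in file order, plus optional tags."""
    qname: str   # read name
    flag: int    # bitwise flag
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description, e.g. 50M
    rnext: str   # reference name of the mate
    pnext: int   # position of the mate
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # base qualities
    tags: List[str] = field(default_factory=list)


def parse_sam_line(line: str) -> SamRecord:
    """Parse one alignment line (not a header line starting with '@')."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(
        qname=fields[0], flag=int(fields[1]), rname=fields[2],
        pos=int(fields[3]), mapq=int(fields[4]), cigar=fields[5],
        rnext=fields[6], pnext=int(fields[7]), tlen=int(fields[8]),
        seq=fields[9], qual=fields[10], tags=fields[11:],
    )


def is_unmapped(rec: SamRecord) -> bool:
    return bool(rec.flag & 0x4)    # bit 0x4: read is unmapped


def is_reverse_strand(rec: SamRecord) -> bool:
    return bool(rec.flag & 0x10)   # bit 0x10: read aligned to the reverse strand


# Invented example record; STAR gives uniquely mapped reads MAPQ 255 by default.
example_line = "\t".join([
    "read1", "16", "chr8", "127736069", "255", "50M",
    "*", "0", "0", "A" * 50, "I" * 50, "NH:i:1",
])
record = parse_sam_line(example_line)
print(record.rname, record.pos, record.cigar, is_reverse_strand(record))
```

For real files, a dedicated library such as pysam or the samtools command line is the usual choice; the sketch only shows what each column means.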

Genome Annotation Standards

GTF Gene Annotation

Import a gene annotation file from a Data Library to be used for feature counting
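As a rough illustration of what a GTF annotation file contains, the sketch below parses the nine tab-separated GTF columns and counts exon records per transcript for one gene. The gene `DEMO1`, its coordinates, and the helper names are invented for the example, and the attribute parser is a simplification that assumes no semicolons inside quoted values and keeps only the last value of a repeated key.

```python
from collections import Counter
from typing import Dict, Iterable


def parse_gtf_line(line: str) -> dict:
    """Split one GTF record into its nine columns and an attribute dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GTF record has exactly 9 tab-separated columns")
    seqname, source, feature, start, end, score, strand, frame, attr_text = fields
    attributes: Dict[str, str] = {}
    for item in attr_text.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")   # e.g. gene_id "G1"
        attributes[key] = value.strip('"')
    return {
        "seqname": seqname, "source": source, "feature": feature,
        "start": int(start), "end": int(end), "score": score,
        "strand": strand, "frame": frame, "attributes": attributes,
    }


def count_exons(lines: Iterable[str], gene_name: str) -> Dict[str, int]:
    """Count exon records per transcript_id for the given gene_name."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        rec = parse_gtf_line(line)
        attrs = rec["attributes"]
        if rec["feature"] == "exon" and attrs.get("gene_name") == gene_name:
            counts[attrs.get("transcript_id", "")] += 1
    return dict(counts)


# Hypothetical records, not taken from any real annotation release.
COMMON = 'gene_id "G1"; transcript_id "TX_1"; gene_name "DEMO1";'
EXAMPLE_GTF = [
    "#!genome-build hypothetical",
    'chrX\tdemo\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene_name "DEMO1";',
    "chrX\tdemo\ttranscript\t100\t900\t.\t+\t.\t" + COMMON,
    "chrX\tdemo\texon\t100\t300\t.\t+\t.\t" + COMMON + ' exon_number "1";',
    "chrX\tdemo\texon\t600\t900\t.\t+\t.\t" + COMMON + ' exon_number "2";',
]
exon_counts = count_exons(EXAMPLE_GTF, "DEMO1")
print(exon_counts)
```

Counting per transcript matters because a gene with several annotated transcripts lists shared exons once per transcript.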

Align the reads to the human genome using STAR aligner
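Galaxy builds and runs the STAR command for you, and the exact settings are chosen in the tool form. For readers curious about what happens behind the scenes, the sketch below assembles a comparable command line in Python without executing it. The paths, thread count, and function name are placeholders, and the option set is an assumption about a typical run, not the configuration used by this workshop.

```python
import shlex
from typing import List, Sequence


def build_star_command(
    genome_dir: str,
    read_files: Sequence[str],
    out_prefix: str,
    threads: int = 4,
    gzipped: bool = True,
) -> List[str]:
    """Assemble a STAR alignment command as an argument list (not executed)."""
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,               # directory holding the STAR index
        "--readFilesIn", *read_files,            # one file, or two for paired-end
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", out_prefix,
    ]
    if gzipped:
        # Tell STAR how to decompress gzipped FASTQ input on the fly.
        cmd += ["--readFilesCommand", "zcat"]
    return cmd


# Placeholder paths for illustration only.
star_cmd = build_star_command(
    genome_dir="/path/to/star_index",
    read_files=["sample_R1.fastq.gz"],
    out_prefix="sample_",
)
print(shlex.join(star_cmd))
```

Keeping the command as a list makes it safe to pass to `subprocess.run` later, since no shell quoting is involved.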

Run MultiQC on the STAR log files to check the result of the alignment

Question 5: In RNAseq, the percentage of uniquely aligned reads is typically lower than in DNAseq because of ribosomal RNA that was not removed. Ribosomal RNA genes are present in multiple copies throughout the genome, so reads derived from them cannot be mapped confidently. For an uncontaminated human RNAseq sample, the percentage is still expected to be above 75%. Is the "% Aligned" above 75% for these samples? You can optionally check what percentage of the reads align to the HIV genome by re-running STAR using the HIV genome and HIV annotation files located in the Shared Data library called **
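The alignment percentage can also be read directly from STAR's `Log.final.out` summary, which lists one `name | value` pair per line. The following is a minimal sketch that extracts the "Uniquely mapped reads %" entry and compares it with the 75% guideline from the question. The log excerpt and its numbers are invented, and the function names are hypothetical.

```python
from typing import Dict


def parse_star_final_log(text: str) -> Dict[str, str]:
    """Collect the 'name | value' pairs from a STAR Log.final.out file."""
    stats: Dict[str, str] = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headings and blank lines carry no value
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def uniquely_mapped_percent(stats: Dict[str, str]) -> float:
    """Return the uniquely mapped percentage as a float, e.g. 89.5."""
    return float(stats["Uniquely mapped reads %"].rstrip("%"))


def passes_threshold(percent: float, threshold: float = 75.0) -> bool:
    """True when the percentage is above the guideline threshold."""
    return percent > threshold


# Invented excerpt in the layout STAR uses; the values are not real results.
EXAMPLE_LOG = """\
                          Number of input reads |\t1000000
                      Average input read length |\t50
                                    UNIQUE READS:
                   Uniquely mapped reads number |\t895000
                        Uniquely mapped reads % |\t89.50%
"""
log_stats = parse_star_final_log(EXAMPLE_LOG)
unique_pct = uniquely_mapped_percent(log_stats)
print(unique_pct, passes_threshold(unique_pct))
```

MultiQC does this kind of aggregation across all samples at once, so the sketch is mainly useful for understanding where the reported number comes from.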

View bam file using JBrowse

Next we’ll add two Track groups, each with an annotation track

Finally, run the job:

Question 6: Which samples appear to show higher expression of MYC, the Mock or the HIV samples?
Question 7: How many exons does this gene have?

Next: Gene Quantification

Previous: Process Raw Reads