Course overview
Recent technological advances have brought large-scale DNA sequencing within the reach of small companies and research laboratories, opening the door to both research and applied uses. This course teaches how to analyze the resulting data sets and how to interpret the significance of the patterns found in them. It covers the specific considerations of different sequencing technologies, as well as the algorithms used to align, assemble, and annotate sequence data. Beyond genome sequencing, DNA sequencing has widespread applications in methods used to understand interactions, whether within a cell or organism (signalling, regulation, protein function) or between organisms (at the level of populations, symbioses, and communities). The course also provides an understanding of these biological systems.
Course learning outcomes
- Perform simple alignment, assembly, and annotation algorithms by hand for "toy" data sets.
- Formulate and justify appropriate choices in technology, strategy, and analysis for a range of projects involving DNA, RNA, or protein sequence data.
- Employ command-line sequence analysis tools to analyze real-world biological sequence data sets, and demonstrate familiarity with the syntax and options required to generate meaningful interpretations.
- Describe the roles of mutation, recombination, duplication, and selection in generating novel variants and determining their fate in organisms and populations.
- Discuss the contents of prokaryotic and eukaryotic genomes, the general function and origin of their main components, and the considerations and difficulties these components pose for genome sequencing.
- Demonstrate understanding of common methods and applications for analysis of gene expression.
- Recognize the importance of using protein (or translated nucleotide) sequence data in homology searches used to infer evolutionary and functional relationships among genes.
- Survey methods involving the analysis of interactions between proteins, nucleic acids, and other molecules, and their applications to biomedical and other real-world problems.
- Discuss the considerations involved in the analysis of multi-species sequence data sets, such as microbial metagenomic and host/symbiont genome analysis.