NCGR NISE Bioinformatics
License and Copyright
1 Instructors
2 Getting Started
2.1 Computer requirements
2.2 Connecting to the Linux server
3 Linux
3.1 A little shell… aka the $ prompt is the command line interface
3.2 Directory Structure
3.3 Find the shell in the system you’ll use to log into NCGR’s server
3.4 Log on to logrus server
3.5 Now that I logged on, where am I?
3.6 Part I: Basic Topics
3.6.1 Understanding Directories
3.6.2 Listing options
3.6.3 Navigation
3.6.4 Files: creating with touch command
3.6.5 History command
3.6.6 Files: creating by redirecting standard out
3.6.7 File name completion with tab
3.6.8 Files: moving files from one filename to another
3.6.9 Files: copying files from one filename to another
3.6.10 Files: securely copying files between your laptop and logrus
3.6.11 Files and directories: removing files is deleting files
3.6.12 Tool box: How to abort a command/process
3.7 PART II: Advanced Topics
3.7.1 Files: Symbolic links and the soft link (-s)
3.7.2 Understanding a fasta file format
3.7.3 Understanding fastq (fq) file format
3.7.4 Using grep (global regular expression print) to extract metrics
3.7.5 Working with compressed files
3.7.6 Start ^ and end $ symbols
3.7.7 Files: parsing and creating data-subsets
3.7.8 Files: parsing and creating data-subsets
3.7.9 Files: parsing and creating data-subsets
3.7.10 Files: parsing and creating data-subsets
3.7.11 Revisiting table1 and previous awk command
3.7.12 Files: Stream EDitor (sed)
3.7.13 The Bash “for” Loop
3.7.14 Help with command syntax
3.8 Exercises
3.9 PART III
3.10 More Exercises
3.11 Jeopardy
4 Our World in Data (parsing)
4.1 Practice parsing
4.2 Our World in Data
4.3 Practice with your 3 countries
4.4 Country files
4.5 Jeopardy
5 NCBI SARS-CoV-2 Genome Sequences
5.1 Getting Sequences from NCBI
5.2 Exploring Sequence Files
5.3 Practice with your 3 countries
5.4 Plotting
5.5 R
5.6 ggplot2
5.6.1 Histogram of Collection Dates
5.6.2 Plot Length x Variants
5.6.3 BONUS
5.6.4 Plot Collection Date x Variants
5.6.5 Plot Collection Date x Variants for your 3 countries
6 Our World in Data (plots)
6.1 Bar chart
6.2 Line chart
6.3 Plot all 3 countries + transparency
7 GISAID SARS-CoV-2 Genome Sequences
7.1 GISAID
7.2 Phylodynamics
7.3 Genome sequences
7.4 Upload sequences
8 Multiple Sequence Alignment
8.1 SARS-CoV-2 variant genomes
8.2 Spike protein
8.3 Arcturus Genomes
8.4 Omicron Recombinant Case study
9 Phylogenetic Trees
9.1 SARS-CoV-2 variant genomes
9.2 Visualizing the tree
9.3 Arcturus Genome Sequences
9.4 Spike Protein
9.5 Spike Protein
10 Dotplots
10.1 SARS-CoV-2
10.2 Yersinia
11 Ancient Pandemic
12 Wastewater
12.1 Wastewater data
12.2 Map of wastewater sites
12.3 Read alignments
12.4 Identify SARS-CoV-2 variants
12.5 SARS-CoV-2 variant piecharts
13 Wrap Up
13.1 Acknowledgements
13.2 Survey
13.3 Questions
13.4 Server access and acknowledgements
13.5 Bookdown document
13.6 Zoom recordings
© National Center for Genome Resources
Published with bookdown
Chapter 1
Instructors
This document is available at https://inbre.ncgr.org/nise-bioinformatics