Next generation sequencing bioinformatics

Virtual course

A guide to the technology, analysis workflows, tools, and resources for next generation sequencing data analysis.

This virtual course will provide insights into how biological knowledge can be derived from genomics experiments and explain different approaches to analysing such data. The main focus will be on assembly, re-sequencing, and variant calling in the analysis of higher eukaryotes, with a particular emphasis on human genetic research. Throughout the week, more advanced topics will introduce the creation of pipelines, automation, and the scaling up of analyses.

Practical sessions will enable participants to process training datasets and apply appropriate statistical methods in their analyses. There will be no opportunity to work with personal research data during the course.

Virtual course

Participants will learn via a mix of pre-recorded lectures, live presentations, and trainer Q&A sessions. Practical experience will be developed through group activities and trainer-led computational exercises. Live sessions will be delivered using Zoom with additional support and communication via Slack.

Pre-recorded material will be made available to registered participants before the start of the course, and there will be a brief induction session in the week before the course. Computational practicals will run on EMBL-EBI's virtual training infrastructure, so participants will not need access to a powerful computer or to install complex software on their own machines.

Participants will need to be available between 09:30 and 17:30 GMT on each day of the course. Trainers will be available during these times to assist, answer questions, and further explain the analysis.

Who is this course for?

The course is aimed at PhD students and post-doctoral researchers who are starting to use high-throughput sequencing technologies and bioinformatics methods in their research. The content is most applicable to those working with eukaryotic genomes, especially in the area of human genetics and rare-disease research.

Participants will require a basic knowledge of the Unix command line and the Ubuntu 18 operating system. We recommend these free tutorials:

Basic introduction to the Unix environment:
- www.ee.surrey.ac.uk/Teaching/Unix
Introduction and exercises for Linux:
- https://training.linuxfoundation.org/free-linux-training

Please note: participants without the basic knowledge covered by these resources will have difficulty completing the practical sessions.

What will I learn?

Learning outcomes

After this course you should be able to:

State the advantages and limitations of high-throughput assays
Apply appropriate short read aligners to unassembled reads
Perform variant calling analysis and annotation
Scale up and automate simple genomics pipelines
Access genomic datasets from online public resources

Course content

During this course you will learn about:

Quality control methods for cleaning raw read data
Alignment of reads to a reference genome
File format conversion and processing
Tools for variant calling
Methodologies for variant annotation
Approaches for scaling up and reproducing analyses
Public data resources for genomics

Trainers

Tom Hancocks
EMBL-EBI

Chiara Batini
University of Leicester

Charles Solomon
University of Leicester

Kayesha Coley
University of Leicester

Noemi Piga
University of Leicester

Sean Laidlaw
Wellcome Sanger Institute

Raheleh Rahbari
Wellcome Sanger Institute

Malvika Sharan
The Alan Turing Institute

Emily Perry
EMBL-EBI

Marcela Uliano-Silva
Wellcome Sanger Institute

Sam Holt
EMBL-EBI

Baron Koylass
EMBL-EBI

Alan Tracey
Wellcome Sanger Institute

Programme

Day 1 – Monday 15 February 2021
09:30-09:45	Arrival, registration, and hangout	Tom Hancocks & Marina Pujol
09:45-10:00	Introduction to virtual training	Tom Hancocks
10:00-10:45	Welcome and introductions	Tom Hancocks & Chiara Batini
10:45-11:00	Break
11:00-12:00	Overview of NGS technology	Chiara Batini
12:00-13:00	Introduction to Unix	Chiara Batini, Charles Solomon, Kayesha Coley & Noemi Piga
13:00-14:00	Break
14:00-14:30	Quality control - lecture	Chiara Batini
14:30-15:30	Quality control - practical	Chiara Batini
15:30-15:45	Break
15:45-16:15	Read mapping - lecture	Chiara Batini
16:15-17:00	Read mapping - practical	Chiara Batini
17:00-17:30	Flash presentations	All
17:30	End of day

Day 2 – Tuesday 16 February 2021
09:30-09:45	Arrival, registration, and hangout	Tom Hancocks & Marina Pujol
09:45-10:00	Recap of Day 1	All
10:00-10:45	SAM/BAM file formats - lecture	Chiara Batini
10:45-11:45	SAM/BAM file formats - practical	Chiara Batini
11:45-12:00	Break
12:00-13:00	Introduction to BASH, loops, and variables	Chiara Batini, Charles Solomon, Kayesha Coley & Noemi Piga
13:00-14:00	Break
14:00-15:00	BAM refinement, QC & visualisation - lecture	Chiara Batini
15:00-16:00	BAM refinement, QC & visualisation - practical	Chiara Batini
16:00-16:15	Break
16:15-17:00	BAM refinement, QC & visualisation - practical	Chiara Batini
17:00-17:30	Flash presentations	All
17:30	End of day

Day 3 – Wednesday 17 February 2021
09:30-09:45	Arrival, registration, and hangout	Tom Hancocks & Marina Pujol
09:45-10:00	Recap of Day 2	All
10:00-10:45	Variant calling - lecture	Chiara Batini
10:45-11:45	Variant calling - practical	Chiara Batini
11:45-12:00	Break
12:00-13:00	Introduction to GitHub	Sean Laidlaw & Raheleh Rahbari
13:00-14:00	Break
14:00-15:00	Variant filtering - lecture	Chiara Batini
15:00-16:00	Variant filtering - practical	Chiara Batini
16:00-16:15	Break
16:15-17:00	Variant filtering - practical	Chiara Batini
17:00-17:30	Flash presentations	All
17:30	End of day

Day 4 – Thursday 18 February 2021
09:30-09:45	Arrival, registration, and hangout	Tom Hancocks & Marina Pujol
09:45-10:00	Recap of Day 3	All
10:00-10:45	Scaling things up - lecture	Sean Laidlaw & Raheleh Rahbari
10:45-11:45	Scaling things up - practical	Sean Laidlaw & Raheleh Rahbari
11:45-12:00	Break
12:00-13:00	Scaling things up - practical	Sean Laidlaw & Raheleh Rahbari
13:00-14:00	Break
14:00-15:30	Introduction to Docker	Sean Laidlaw & Raheleh Rahbari
15:30-15:45	Break
15:45-17:00	The Turing Way and reproducible research aspects of data science	Malvika Sharan
17:00-17:30	Flash presentations	All
17:30	End of day

Day 5 – Friday 19 February 2021
09:30-09:45	Arrival, registration, and hangout	Tom Hancocks & Marina Pujol
09:45-10:00	Recap of Day 4	All
10:00-11:45	Ensembl genome browser & VEP	Emily Perry
11:45-12:00	Break
12:00-12:40	Genomic pipelines in the Darwin Tree of Life project	Marcela Uliano-Silva
12:40-13:00	Manual genome annotation	Alan Tracey
13:00-14:00	Break
14:00-15:30	European Nucleotide Archive	Sam Holt
15:30-15:45	Break
15:45-17:00	European Variation Archive	Baron Koylass
17:00-17:30	Course wrap-up & feedback	Tom Hancocks & Marina Pujol
17:30	End of course

This course has ended

15 – 19 February 2021

£200

Contact
Marina Pujol

Organisers

Tom Hancocks
EMBL-EBI
Chiara Batini
University of Leicester
