Course at EMBL-EBI
Summer school in bioinformatics
This course, organised in association with Wellcome Connecting Science, provides an introduction to the use of bioinformatics in biological research, giving participants guidance for using bioinformatics in their work whilst also providing hands-on training in tools and resources appropriate to their research.
Participants will initially be introduced to bioinformatics theory and practice, including best practices for undertaking bioinformatics analysis, data management, and reproducibility. To enable specific exploration of resources in their particular field of interest, participants will be divided into focused groups to work on a project set by resource and data experts from EMBL-EBI and external collaborating institutes. These projects will end with a presentation from each group on the final day of the course, bringing together what all participants have learned.
Participants will be required to review some pre-recorded material prior to the start of the course and will have an opportunity to meet other trainees in an induction session to be held virtually in the week before the course takes place.
Group projects
A major element of this course is a group project, where participants will be placed in small groups to work together on a challenge set by trainers from EMBL-EBI and external institutes. This allows people to explore the bioinformatics tools and resources available in their area of interest and apply them to a set problem, providing participants with hands-on experience relevant to their own research. The group work will culminate in a presentation session involving all participants on the final day of the course, giving an opportunity for wider discussion on the benefits and challenges of working with biological data.
Groups are mentored and supported by the trainers who set the initial challenge, but the groups will be responsible for driving their projects forward, with all members expected to take an active role. Groups are pre-organised before the course, and all group members will be sent some short “homework” in preparation for their project work prior to the start of the course.
Basic outlines of the projects on offer this year are given below. In your application you must indicate your first and second choice of project, based on which you think would benefit your research most. Not all projects may be offered, and final decisions on which projects will be run during the course will be made based on the number of applicants per project.
Most of the projects cover mammalian data sets; however, in many cases the methods and approaches taught are transferable to data from other species.
Networks and pathways
This project will cover the typical bioinformatics analysis steps needed to put differentially expressed genes into a wider biological context. You will start with gene expression data (RNA-seq) to build an initial interaction network. Next, you will learn to combine public network datasets, identify key regulators of biological pathways, and explore biological function through network analysis. You will get first-hand experience of integrating and co-visualising additional data and of functional enrichment analysis. All this helps to place the initial results in the context of existing knowledge and to generate hypotheses for potential follow-up experiments. We will use Cytoscape, Expression Atlas, g:Profiler, and StringDb, among other tools, and may also try a few R packages.
Project mentors: Priit Adler (University of Tartu), Hedi Peterson (University of Tartu)
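To give a flavour of the functional enrichment step mentioned in the Networks and pathways project, here is a minimal sketch of an over-representation test based on the hypergeometric distribution. It is not the method used by g:Profiler or any other course tool; the gene names, set sizes, and the function name `hypergeom_enrichment_p` are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """Probability of seeing at least `overlap` pathway genes when drawing
    `selected` genes without replacement from `population` genes, of which
    `annotated` belong to the pathway (one-sided hypergeometric test)."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy data, not taken from the course material.
background_size = 20                               # all genes measured
pathway_genes = {"G1", "G2", "G3", "G4", "G5"}     # genes annotated to one pathway
de_genes = {"G1", "G2", "G3", "G9"}                # differentially expressed genes

overlap = len(pathway_genes & de_genes)
p_value = hypergeom_enrichment_p(
    background_size, len(pathway_genes), len(de_genes), overlap
)
print(f"overlap={overlap}, p={p_value:.4f}")
```

A small p-value suggests the pathway is over-represented among the differentially expressed genes. Real enrichment tools also correct for testing many pathways at once, which this sketch leaves out.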
Genome variation across human populations
Natural variation between individuals or between different human populations is a result of genome mutations throughout evolutionary history. Some mutations may become fixed because of their beneficial effect, while most drift among individuals. During this project, you will investigate genomic variation between two separate human populations of European and Asian descent. Working with sequence data from a number of individuals from each population, you will use a range of bioinformatics tools to discover variants that differ between them. In the second part of the project, you will attempt to analyse the functional consequences of the variants you have identified, linking them to phenotypes.
Project mentors: Baron Koylass (EMBL-EBI)
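One simple way to think about variants that differ between two populations is to compare alternate-allele frequencies. The sketch below assumes genotypes have already been called and encoded as the number of alternate alleles per individual (0, 1, or 2). The variant names, population labels, genotype values, and the 0.25 cut-off are all hypothetical, and the project itself may use quite different tools and statistics.

```python
# Hypothetical genotype calls: alternate-allele count (0, 1 or 2) per individual.
GENOTYPES = {
    "variant_A": {"population_1": [0, 1, 2, 1], "population_2": [0, 0, 0, 1]},
    "variant_B": {"population_1": [1, 1, 0, 0], "population_2": [1, 0, 1, 0]},
}


def alt_allele_frequency(genotypes):
    """Fraction of alternate alleles among diploid individuals."""
    return sum(genotypes) / (2 * len(genotypes))


def frequency_differences(genotype_table, pop_a, pop_b):
    """Alternate-allele frequency in pop_a minus that in pop_b, per variant."""
    return {
        variant: alt_allele_frequency(calls[pop_a]) - alt_allele_frequency(calls[pop_b])
        for variant, calls in genotype_table.items()
    }


differences = frequency_differences(GENOTYPES, "population_1", "population_2")

# Arbitrary illustrative cut-off for calling a variant "differentiated".
differentiated = [v for v, d in differences.items() if abs(d) >= 0.25]
print(differences, differentiated)
```

With real data, sample sizes are far larger and a statistical test would normally replace the fixed cut-off.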
Modelling cell signalling pathways
Curating models of biological processes is effective training in computational systems biology: curators gain integrative knowledge of biological systems, modelling, and bioinformatics. You will learn to encode and simulate ordinary differential equation models of signalling pathways from a recent publication using user-friendly software such as COPASI, even without an extensive mathematical background. You will learn to perform in silico experiments, make new predictions, and develop hypotheses. Furthermore, you will learn how to annotate models and re-use pre-existing models from open repositories such as BioModels.
Project mentors: Rahuman Sheriff (EMBL-EBI)
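As a rough illustration of what simulating an ordinary differential equation model involves, the sketch below integrates a hypothetical two-state phosphorylation cycle in plain Python and then repeats the run with a changed parameter as a small in silico experiment. It does not use COPASI and is not one of the published models from the project. The model, the rate constants `k_phos` and `k_dephos`, and the function names are assumptions made for illustration.

```python
def derivatives(state, k_phos, k_dephos):
    """Rates of change for unphosphorylated (s) and phosphorylated (sp) protein."""
    s, sp = state
    net_flux = k_phos * s - k_dephos * sp
    return (-net_flux, net_flux)


def rk4_step(state, dt, k_phos, k_dephos):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def nudge(slope, factor):
        return tuple(x + factor * dx for x, dx in zip(state, slope))

    k1 = derivatives(state, k_phos, k_dephos)
    k2 = derivatives(nudge(k1, dt / 2), k_phos, k_dephos)
    k3 = derivatives(nudge(k2, dt / 2), k_phos, k_dephos)
    k4 = derivatives(nudge(k3, dt), k_phos, k_dephos)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(initial_state, k_phos, k_dephos, t_end, dt=0.01):
    """Return the list of states from time 0 to t_end in steps of dt."""
    state = tuple(initial_state)
    trajectory = [state]
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt, k_phos, k_dephos)
        trajectory.append(state)
    return trajectory


# Baseline run, then an in silico experiment with doubled phosphatase activity.
baseline = simulate((1.0, 0.0), k_phos=2.0, k_dephos=1.0, t_end=10.0)
more_phosphatase = simulate((1.0, 0.0), k_phos=2.0, k_dephos=2.0, t_end=10.0)

print("phosphorylated fraction, baseline:", round(baseline[-1][1], 3))
print("phosphorylated fraction, doubled k_dephos:", round(more_phosphatase[-1][1], 3))
```

For this toy model the steady-state phosphorylated fraction is k_phos / (k_phos + k_dephos), so the two runs can be checked against 2/3 and 1/2.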
Interpreting functional information from large scale protein structure data
This project will introduce you to the wealth of publicly available data in the Protein Data Bank (PDB) and give you the opportunity to investigate how large subsets of structure data can be used to analyse protein features and determine function. In the project you will learn how to: identify relevant protein structures, collate and interpret functional information, and implement this process programmatically.
Project mentors: David Armstrong (EMBL-EBI), Preeti Choudhary (EMBL-EBI)
Analysis of intercellular interactions in healthy and diseased states
Ulcerative colitis is an inflammatory bowel disease. The exact pathomechanism of the disease is unknown. However, the interactions between the intestinal immune cells and the intestinal epithelial cells play a crucial role during the development of the disease. Single-cell RNA-seq measurements can help us understand these complex interactions. The expression data combined with protein-protein interaction databases can shed light on the connections between cells in diseased and healthy states.
During this project, you will use a single-cell RNA-seq dataset to build interaction networks between the various cell types. The dataset contains pre-processed, cell-type-classified data from healthy, inflamed, and non-inflamed ulcerative colitis (UC) colonic biopsies. The interactions between cells will be downloaded from the OmniPath database. You will use Python notebooks to build the intercellular networks, map the single-cell RNA-seq expression data onto them, and visualise them. The intercellular networks between various cell types can then be compared in Cytoscape.
Project mentors: Dezso Modos (Quadram Institute), Marton Olbei (Earlham Institute)
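The idea of combining expression data with known interactions to connect cell types can be sketched in a few lines of Python. The example assumes a short hard-coded list of ligand-receptor pairs standing in for interactions that would be downloaded from OmniPath, and mean expression values per cell type that are entirely made up. The threshold and function name are likewise hypothetical, and the project notebooks may work differently.

```python
# Stand-in for ligand-receptor interactions that would come from OmniPath.
LIGAND_RECEPTOR_PAIRS = [("TNF", "TNFRSF1A"), ("IL22", "IL22RA1"), ("EGF", "EGFR")]

# Hypothetical mean expression per cell type in two conditions.
HEALTHY = {
    "T cell": {"TNF": 0.3, "IL22": 1.5},
    "Epithelial cell": {"TNFRSF1A": 4.0, "IL22RA1": 2.5, "EGFR": 3.3, "EGF": 0.2},
}
INFLAMED = {
    "T cell": {"TNF": 5.2, "IL22": 3.1},
    "Epithelial cell": {"TNFRSF1A": 4.0, "IL22RA1": 2.5, "EGFR": 3.3, "EGF": 0.2},
}


def build_intercellular_network(expression, pairs, threshold=1.0):
    """Return (source cell, ligand, receptor, target cell) edges where the
    ligand and receptor both reach the expression threshold."""
    edges = []
    for source, source_expr in expression.items():
        for target, target_expr in expression.items():
            if source == target:
                continue  # only interactions between different cell types
            for ligand, receptor in pairs:
                if (source_expr.get(ligand, 0.0) >= threshold
                        and target_expr.get(receptor, 0.0) >= threshold):
                    edges.append((source, ligand, receptor, target))
    return edges


healthy_edges = build_intercellular_network(HEALTHY, LIGAND_RECEPTOR_PAIRS)
inflamed_edges = build_intercellular_network(INFLAMED, LIGAND_RECEPTOR_PAIRS)

# Interactions present only in the inflamed condition.
gained = sorted(set(inflamed_edges) - set(healthy_edges))
print(gained)
```

Edge lists like these can be written to a table and loaded into a network tool for side-by-side comparison of conditions.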
Who is this course for?
Applicants are expected to be at an early stage of using bioinformatics in their research, with a need to develop their knowledge and skills further. No previous knowledge of programming is required for this course; group projects may give you the opportunity to learn basic programming, and participants will be supported in this by their mentors. Depending on your chosen project, an introductory programming tutorial may be given as homework prior to attending the course.
Though programming skills are not a prerequisite for attending the course, we will ask participants to specify their current level of programming skills in their application. This will allow the mentors to target the group projects better to the skills and needs of the final course participants.
What will I learn?
Learning outcomes
After this course you should be able to:
- Discuss applications of bioinformatics in biological research
- Browse, search, and retrieve biological data from public repositories
- Use appropriate bioinformatics tools to explore biological data
- Describe ways that biological data can be stored, organised and integrated
Course content
During this course you will learn about:
- Bioinformatics as a science
- Designing bioinformatics studies
- Data management and reproducibility
- Basic tools and resources for bioinformatics
The exact range of resources and tools covered will vary depending on the group project undertaken; there will be no opportunity for you to analyse your own data during this course.
Trainers
Alex Bateman (EMBL-EBI)
Bérénice Batut (University of Freiburg)
Patricia Carvajal Lopez (EMBL-EBI)
Alexandra Holinski (EMBL-EBI)
Nikiforos Karamanis (EMBL-EBI)
Lee Larcombe (Amphimatic)
Peter McQuilton
Sarah Morgan (EMBL-EBI)
Hedi Peterson (University of Tartu, Estonia)
Summer Rosonovski (EMBL-EBI)
Anna Swan (EMBL-EBI)
Jenny Cook (LifeArc)
Priit Adler (University of Tartu, Estonia)
Baron Koylass (EMBL-EBI)
Rahuman Sheriff Malik Sheriff (EMBL-EBI)
David Armstrong (EMBL-EBI)
Preeti Choudhary (EMBL-EBI)
Dezso Modos (Quadram Institute)
Marton Olbei (Earlham Institute)
Benjamin Moore (EMBL-EBI)
Programme
Day / Time | Topic | Trainer |
Day one - Monday 13 June 2022 | | |
10:00 – 10:30 | Registration and coffee | |
10:30 – 11:30 | Welcome and introduction | Alex Holinski and Anna Swan |
11:30 – 12:30 | The science of bioinformatics | Alex Bateman |
12:30 – 14:00 | Lunch and poster session | |
14:00 – 15:30 | Data visualisation 101: a practical introduction to designing scientific figures | Niki Karamanis |
15:30 – 16:00 | Coffee break | |
16:00 – 17:30 | Good data management: making your data FAIR | Peter McQuilton |
17:30 – 18:00 | Bedroom check-in | |
18:00 – 19:00 | Networking and drinks | |
19:00 | Evening meal | |
Day two - Tuesday 14 June 2022 | | |
09:00 – 09:30 | Introduction and mini-challenge | Alex Holinski and Anna Swan |
09:30 – 11:00 | An introduction to EMBL-EBI data resources | Sarah Morgan |
11:15 – 11:30 | Coffee break | |
11:30 – 12:00 | Keynote Q&A | Jenny Cook |
12:00 – 13:30 | Lunch featuring EMBL-EBI biocurators and poster session | |
13:30 – 15:00 | Introductory computational skills | Pati Carvajal-López |
15:00 – 15:30 | Coffee break | |
15:30 – 17:30 | Introduction to group projects and meet your mentors | Sarah Morgan; all mentors |
19:00 | Evening meal | |
Day three - Wednesday 15 June 2022 | | |
09:00 – 09:15 | Introduction and project update | Alex Holinski and Anna Swan |
09:15 – 10:30 | Group work | |
10:30 – 11:00 | Coffee break | |
11:00 – 12:30 | Group work | |
12:30 – 13:30 | Lunch | |
13:30 – 14:00 | Keynote Q&A | Bérénice Batut |
14:00 – 15:30 | Group work | |
15:30 – 16:00 | Coffee break | |
16:00 – 17:30 | Group work | |
19:00 | Evening meal | |
Day four - Thursday 16 June 2022 | | |
09:00 – 09:30 | Group projects two-minute interim report | Alex Holinski and Anna Swan |
09:30 – 10:30 | Group work | |
10:30 – 11:00 | Coffee break | |
11:00 – 12:00 | Group work | |
12:00 – 12:30 | Mini tutorial: Literature search with EuropePMC | Summer Rosonovski |
12:30 – 13:30 | Lunch | |
13:30 – 14:30 | Group work | |
14:30 – 15:00 | Coffee break | |
15:00 – 15:45 | Bioinformatics chat | Lee Larcombe and Hedi Peterson |
15:45 – 18:00 | Group work | |
18:30 – 19:30 | Pre-dinner drinks | |
19:30 prompt | Silver service dinner | |
19:30 | Cash bar | |
Day five - Friday 17 June 2022 | | |
09:00 – 10:30 | Preparation of group presentations | |
10:30 – 11:00 | Coffee break | |
11:00 – 12:00 | Group presentation | |
12:00 – 13:00 | Lunch | |
13:00 – 14:00 | Group presentation | |
14:00 – 15:00 | Award ceremony, course feedback, and wrap up | All |
15:30 | Bus to train station | |
This course is organised in association with Wellcome Connecting Science. Applications are handled through their website.