Bioinformatics for the Analysis and Exploitation of Resequenced Genomes
An MRC LINK grant sponsored project.
The overall aim of this LINK project is to accelerate the development of software infrastructure and analysis methods to exploit resequenced genomes.
The next phase of the human genome project is to discover and characterise all the natural variation in the genome that occurs between individuals, in particular the variants associated with disease. Many diseases show family association, with genetic factors potentially providing predisposition to particular diseases, and cataloguing these differences will provide many benefits.
The three participants in this project are specialists in, and will work on:
- Generation of dense genotypes by resequencing
- Storage and visualisation of genotype data in the context of the whole genome
- Statistical interpretation of these data
News
- HyperLasso code and documentation for PLoS Genetics paper - Simultaneous analysis of GWAs
- FREGENE - software to simulate sequence-like data in large genomic regions and large populations.
  - Program Download
  - Documentation
  - Datasets - info
    - Population A: panmictic population (21 K sequences), 1 Megabase simulation (size: 25 MB, uncompressed: 175 MB) - info
    - Population B: subdivided population (3 subpopulations each of 7 K sequences), 1 Megabase simulation (size: 31 MB, uncompressed: 199 MB) - info
    - Population C: worldwide human population, 1 Megabase simulation (size: 210 MB, uncompressed: 1.3 GB) - info
- invertFREGENE code and documentation for simulating inversion polymorphisms in population genetic data