ID   FN570746; SV 1; linear; genomic DNA; CON; MAM; 15464 BP.
XX
AC   FN570746;
XX
PR   Project:PRJEA31265;
XX
DT   18-NOV-2009 (Rel. 102, Created)
DT   18-NOV-2009 (Rel. 102, Last updated, Version 1)
XX
DE   Gorilla gorilla gorilla whole genome shotgun sequence assembly, supercontig
DE   chr16_32697597_15464
XX
KW   .
XX
OS   Gorilla gorilla gorilla (western lowland gorilla)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC   Gorilla.
XX
RN   [1]
RP   1-15464
RA   Rogers A.S.;
RT   ;
RL   Submitted (14-OCT-2009) to the INSDC.
RL   Rogers A.S., Informatics, Wellcome Trust Sanger Institute, Wellcome Trust
RL   Genome Campus, Hinxton, Cambridge, CB10 1HH, UNITED KINGDOM.
XX
RN   [2]
RA   Durbin R.M., Tyler-Smith C.;
RT   "Gorilla genome";
RL   Unpublished.
XX
DR   MD5; 9fb51aeb9e845cc16bf0284ef68f6783.
DR   ENA; CABD020000000; SET.
DR   ENA; CABD000000000; SET.
DR   ENA-CON; FR853087.
DR   BioSample; SAMEA2272491.
XX
CC   Raw data:
CC   - 2.1x WGS capillary sequenced read pairs.
CC   - 35x Solexa sequenced read pairs with insert sizes 150 bp and 450 bp.
CC   All sequence was derived from DNA sampled from a single female
CC   Western Lowland gorilla (Gorilla gorilla gorilla), Kamilah.
CC
CC   Assembly involved several phases, starting with a de novo
CC   assembly of the initial data using the ABySS and Phusion
CC   assemblers. Contigs from this 'seed' assembly were then grown by
CC   assembling and attaching Solexa read pairs from the initial data.
CC   To improve long-range structure, supercontig construction was
CC   guided by placing (where possible) seed contigs in accordance
CC   with their homologous locations on the human genome, breaking
CC   supercontigs wherever a potential break between human and gorilla
CC   was inferred in an alignment of all the Solexa data to human.
CC
CC   This draft assembly contains all the assembled supercontigs plus a
CC   mitochondrial sequence. Of the assembled supercontigs there are three
CC   types:
CC   1. Placed supercontigs, assembled from seeds (capillary read contigs)
CC   placed in accordance with human synteny.
CC   These are named e.g. chr1_12345_67890, meaning a supercontig of length
CC   67890 roughly syntenous with a region on human chromosome 1 starting at
CC   12345.
CC   2. Unplaced supercontigs, assembled from seeds which did not place on
CC   human. These are named e.g. unplaced1234_1_5678, meaning a supercontig of
CC   length 5678. There are about 47000 of these.
CC   3. Cut fragments identified as potential initial misassemblies when
CC   stitching together placed contigs. These are named e.g. cut_chr1_12345_678
CC   meaning a fragment of length 678, initially assembled from seeds placed
CC   near to 12345 on human chromosome 1. The mitochondrial sequence is
CC   derived from a gorilla MT sequence in the trace archive. Polymorphisms
CC   specific to Kamilah were incorporated during a final phase of the
CC   assembly.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..15464
FT                   /organism="Gorilla gorilla gorilla"
FT                   /sub_species="gorilla"
FT                   /mol_type="genomic DNA"
FT                   /note="supercontig chr16_32697597_15464"
FT                   /db_xref="taxon:9595"
XX
CO   join(CABD02136281.1:1..608,gap(12202),CABD02136282.1:1..2654)
//