Project: PRJNA625724
How novel genes and cellular functions evolve is a central question in biology. Exon shuffling represents a potent mechanism to assemble new protein architectures. Here we show that DNA transposons, which are mobile and pervasive in genomes, have provided a recurrent supply of both exons and splice sites to assemble novel protein-coding genes in vertebrates. We find that transposase domains have been captured, primarily via alternative splicing, to form new fusion proteins at least 99 times independently over ~350 million years of tetrapod evolution. Evolution favors fusion of transposase DNA-binding domains to host regulatory domains, especially the Krüppel-associated box (KRAB), suggesting that transposase capture frequently yields new transcriptional repressors. Consistent with this model, we show that four KRAB-transposase fusion proteins born independently in different mammalian lineages repress gene expression in a sequence-specific fashion. Genetic knockout and rescue of the bat-specific KRABINER fusion protein in cell culture demonstrate that it binds its cognate transposons genome-wide and controls a network of genes and cis-regulatory elements. Transposase capture is thus a powerful mechanism whereby transcription factors and their associated cis-regulatory networks can evolve by repurposing DNA transposon families, which provide both DNA-binding domains and pre-existing genomic binding sites.
Overall design: We sequenced nascent RNAs purified from run-on sequencing (PRO-seq) reactions performed in Myotis velifer embryonic fibroblasts that were either wild-type or knockout (KO) for the KRABINER gene, as well as in KRABINER KO cells rescued with a KRABINER transgene that was wild-type, mutated in the DNA-binding domain (DBD), or mutated in the KRAB domain.
General