- A little story behind Annotare
- Why submit to ArrayExpress?
- The team behind Annotare
- How to cite us?
- Contact Us
A little story behind Annotare
- Audience
Annotare is a webform tool for submitting functional genomics experiments to ArrayExpress. Its major target audience is wet-lab researchers who are novices in data submission and often have only one or two data sets to submit every few years.
- The problem we address
The tool was designed in response to our submitters' top complaint: preparing standardised spreadsheets in the correct format for submission is extremely tedious, even for experienced submitters using template spreadsheets. We needed a better system.
- The solution
Annotare removes the need to "learn" any spreadsheet format prior to submission. It guides a submitter through a series of webforms that capture the same kind of information as the spreadsheet. To preserve some features of spreadsheets that our submitters love, Annotare provides wizard-style functions such as "Fill down values" and "Import Values" (which works like copy-and-paste) to speed up the filling of online forms.
- Gatekeeper of reproducible research
In addition to editing features, we have incorporated a built-in validator in Annotare, so a submitter can more easily prepare an experiment submission that complies with community-recommended data standards, promoting reproducibility of research.
- Speed is key
Only validated experiments can be submitted, so every experiment is already in reasonably good shape at the time of submission. This in turn allows us to issue accession numbers and load experiments into ArrayExpress much more quickly. We believe a quick turnaround time is crucial for our submitters, who are often under time pressure to publish their work in journals.
- We collect and act on submitters' feedback
The needs of our submitters are constantly changing. In the last few years, submissions with larger sample sizes, from multi-omics studies, from new technologies (e.g. single-cell RNA-seq) or with complex designs have become increasingly common. We try our best to offer timely assistance, so Annotare has a built-in "Contact Us" button for submitters to send messages to a curator, and presents submitters with an optional feedback form upon successful submission. Submitters have surprised us by being very generous in sharing constructive criticism of their Annotare experience, which we then feed into our Annotare development plan. This ensures we focus on the features that matter most to our submitters.
Why submit to ArrayExpress?
1. Journal submission requirement
This is by far the most common reason why a submitter deposits data at ArrayExpress. Most journals now require functional genomics data sets to be deposited at a public database such as ArrayExpress or Gene Expression Omnibus at NCBI in compliance with MIAME / MINSEQE / MINSCE standards prior to manuscript publication.
2. Data archiving and management
Have you ever pondered the best way to keep the large raw data files from your microarray or sequencing experiment? DVD? External hard drive? What if the disc is misplaced or the hard drive is corrupted? And how about the experiment design, protocols and sample information, i.e. the meta-data that provides the experiment's context? Checking your colleague's lab books is one option, but what if your colleague has left the job, and a few years down the line you have a question about a cryptic acronym used to describe some samples?
That's where a public archive like ArrayExpress comes in. We help submitters store both meta-data and data files accurately and securely, giving you peace of mind!
3. Promote reproducible research
All experiments submitted to ArrayExpress via Annotare, without exception, are manually curated by trained bioinformaticians who have doctoral-level wet-lab research experience in different areas of genetics, including mammalian skin stem cells, epigenetic gene silencing in mammalian embryos, and the effect of gibberellins in citrus plant growth.
Curation involves not only correcting mistakes in submissions (e.g. typographic errors, inconsistent sample annotation) but also enhancing the record so it's more likely to be discovered by users' searches. For example, a data set may be annotated as being obtained from human "MCF-7" cells. To the untrained eye, that may suffice, but curators know that the cells are derived from breast cancer tissue, and will therefore also annotate the samples with the disease term "breast carcinoma", ensuring the data set is returned when someone searches for experiments related to the disease.
4. Increased public exposure of your work
ArrayExpress is accessed by about 1000 unique users every day on average, in both academic and commercial settings, with interests ranging from data-mining for meta-analyses and combining public data with their own, to populating third-party value-added databases. One of the biggest consumers of ArrayExpress experiments is its sister database at EMBL-EBI, Expression Atlas, which has about 800 unique users every day. ArrayExpress therefore provides a platform to showcase your work, complementing journal publications.
The team behind Annotare
Many people have contributed to the Annotare project over the years, either as developers building the software, or as curators interacting with submitters regularly and acting as the advocate for submitters.
As of September 2020, the primary software engineers are Haider Iqbal and Sandeep Reddy Kurri, and the curation team consists of Anja Füllgrabe, Nancy George and Silvie Fexova, who are all part of the Gene Expression team led by Irene Papatheodorou. Two senior software engineers, Alfonso Munoz-Pomer Fuentes and Awais Athar (from the Functional Genomics Development team led by Ugis Sarkans), support new developments.
How to cite us?
- To cite an ArrayExpress submission in your manuscript: please include your experiment accession number and the URL of the ArrayExpress home page, http://www.ebi.ac.uk/arrayexpress. For example: "Microarray data are available in the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-12345."
- To cite the ArrayExpress database or Annotare: please use the following publication: Athar A. et al., 2019. ArrayExpress update - from bulk to single-cell expression data. Nucleic Acids Res, doi: 10.1093/nar/gky964. PubMed ID 30357387.
Contact Us
Email: annotare@ebi.ac.uk
In your email, please include information such as the URL of the page you were on, what you're trying to do on the page and what failed, any error messages you've seen, or a screenshot of the page where you had a problem.
A curator will respond as soon as possible, usually within 3 working days. You can also tweet to @ArrayExpressEBI.