Processed data

The nature of processed data varies between submissions because “processing” can mean different things, ranging from background removal, log2 transformation and normalisation of data from single hybridisations, to fold-change values between two conditions. For sequencing experiments, several stages of the data analysis pipeline can generate processed data files, e.g. trimmed or filtered read sequences, reference genome alignments, and normalised “reads per kilobase of transcript per million mapped reads” (RPKM) values.
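To make two of these quantities concrete, the sketch below computes RPKM values and a log2 fold change using the standard definitions. The gene names, counts, lengths and the pseudocount are hypothetical values chosen for illustration only. A given submission may have used a different normalisation, so this is not necessarily how any particular processed file was produced.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(condition_a, condition_b, pseudocount=1.0):
    """log2 ratio of two expression values.

    The pseudocount is an assumption of this sketch; it avoids division
    by zero and log(0) when a value is 0.
    """
    return math.log2((condition_a + pseudocount) / (condition_b + pseudocount))


# Hypothetical example data: read counts and transcript lengths (bp).
counts = {"geneA": 500, "geneB": 1200}
lengths = {"geneA": 2000, "geneB": 4000}
total_mapped = 10_000_000  # hypothetical library size

rpkm_values = {gene: rpkm(counts[gene], lengths[gene], total_mapped)
               for gene in counts}

for gene, value in rpkm_values.items():
    print(f"{gene}\t{value:.2f}")
```

Because a processed file may already contain values like these, or log2-transformed versions of them, applying the same transformation again would distort the data.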

Before re-using these files, it is advisable to check the “normalisation data transformation” protocols to understand how the files were derived and what they represent (Figure 16).

Figure 16. Finding protocol information describing how the raw data was processed to derive the processed data matrix.

In the next section, you will find a short overview of how to open and process some of the most common data formats.