Curated databases
As described by Leonelli (2014), data curation includes:
- Selecting data for inclusion in a database
- Formatting data
- Classifying data into retrievable database categories
- 'Cleaning' data - correcting or removing data that do not meet quality control criteria
- Providing metadata (information about the data in the database), which gives details about a study or experiment that are not represented in the database itself
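The curation steps listed above can be sketched as a small pipeline. This is only a minimal illustration, not the procedure of any particular database: the record fields (`id`, `sequence`, `organism`, `quality`), the `MIN_QUALITY` threshold, the alphabet-based classification rule, and the `curate` function are all hypothetical names and assumptions introduced for the example.

```python
from dataclasses import dataclass, field

# Hypothetical quality-control threshold; real databases define their own criteria.
MIN_QUALITY = 0.8

# Assumed nucleotide alphabet used for the toy classification rule below.
NUCLEOTIDE_LETTERS = set("ACGTUN")


@dataclass
class CuratedRecord:
    identifier: str
    sequence: str
    organism: str
    quality: float
    category: str
    metadata: dict = field(default_factory=dict)


def curate(raw_records, study_info):
    """Apply the five curation steps to a list of raw record dicts."""
    curated = []
    for raw in raw_records:
        # Selecting: include only submissions that actually carry a sequence.
        if not raw.get("sequence", "").strip():
            continue

        # Formatting: normalise whitespace and letter case.
        sequence = raw["sequence"].strip().upper()
        organism = raw.get("organism", "").strip()

        # Cleaning: remove records that fail the quality control criterion.
        quality = float(raw.get("quality", 0.0))
        if quality < MIN_QUALITY:
            continue

        # Classifying: assign a retrievable category (a deliberately crude rule).
        if set(sequence) <= NUCLEOTIDE_LETTERS:
            category = "nucleotide"
        else:
            category = "protein"

        # Providing metadata: attach study details not stored in the record itself.
        curated.append(
            CuratedRecord(
                identifier=raw["id"],
                sequence=sequence,
                organism=organism,
                quality=quality,
                category=category,
                metadata=dict(study_info),
            )
        )
    return curated
```

Calling `curate(raw_records, {"study": "example"})` on a list of dictionaries returns only the records that pass selection and cleaning, each formatted, classified, and annotated with a copy of the study metadata.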
Examples of curated databases
- NCBI Reference Sequence (RefSeq) Database - database of genomic, transcript, and protein sequences selected from public sequence archives (Pruitt et al. 2012a).
- ChEMBL - bioactivity database for drug discovery (Gaulton et al. 2012).