Database Annotation in Molecular Biology. Principles and Practice

  • ID: 2171274
  • Book
  • 266 Pages
  • John Wiley and Sons Ltd
Two factors dominate current molecular biology: the amount of raw data is increasing very rapidly and successful applications in biomedical research require carefully curated and annotated databases.  The quality of the data – especially nucleic acid sequences – is satisfactory; however, annotations depend on features inferred from the data rather than measured directly, for instance the identification of genes in genome sequences.  It is essential that these inferences are as accurate as possible and this requires human intervention.

Many new sequences are emerging from genomics projects and many new protein structures are now being determined using X-ray crystallography, nuclear magnetic resonance spectroscopy and cryo-electron microscopy.  Without direct experimental evidence there is considerable difficulty in assigning function to proteins from their sequences or even from their structures.  This applies even to homologues of well-characterised proteins, because of the recruitment of similar proteins for divergent functions.  Furthermore, correct classification of sequences, structures and functions often requires sensitivity to very delicate features.  Computer programs can aid to some extent but cannot do the whole job reliably – again manual curation is essential.  Proteomics studies on spatial and temporal protein expression patterns provide additional streams of data that require human interpretation to resolve fine details.

With the recognition of the importance of accurate database annotation and the requirement for individuals with particular constellations of skills to carry it out, annotators are emerging as specialists within the profession of bioinformatics.  This book compiles information about annotation – its current status, what is required to improve it, what skills must be brought to bear on database curation and hence what is the proper training for annotators.

This book should be essential reading for all people working on biological databases, both biologists and computer scientists.  It will also be of interest to all users of such databases, including molecular biologists, geneticists, protein chemists, clinicians and drug developers.

Note: Product cover images may vary from those shown

List of Contributors.

1. Annotation and Databases: Status and Prospects (M. Hoebeke, H. Chiapello, J.-F. Gibrat, Ph. Bessieres and J. Garnier).


2. Survey of Sequence Databases: Archival Projects (M. Magrane, M. Garcia-Pastor and R. Apweiler).

3. Survey of Sequence Databases: Derived Databases (M. Pruess, N. Mulder and R. Apweiler).

4. Databanks of Macromolecular Structure (H.J. Bernstein and F.C. Bernstein).

5. Gene Expression Databases (H. Parkinson).


6. Taxonomy: a Moving Target for Sequence Data (M.I. Krichevsky).

7. Genomics and Proteomics: Design and Sources of Annotation (K. Mayer and G. Mannhaupt).

8. Annotation of Protein Sequences (W.C. Barker and C.H. Wu).

9. Issues in the Annotation of Protein Structures (G.J. Swaminathan, J. Tate, R. Newman, A. Hussain, J. Ionides, K. Henrick and S. Velankar).

10. Classification of Protein Function (A.M. Lesk, H. Parkinson and J.C. Whisstock).


11. Information Flow and Data Integration of Databanks (C.H. Wu and W.C. Barker).

12. Models of Database Interconnectivity (G.J.L. Kemp).

13. The European Bioinformatics Institute Macromolecular Structure Relational Database Technology (H. Boutselakis, D. Dimitriopoulos, K. Henrick, J. Ionides, M. John, P.A. Keller, P. McNeil, J. Pineda and A. Suarez-Uruena).


14. Looking Around, Looking Ahead (A.M. Lesk).


Arthur M. Lesk