From a bioinformatics point of view, the book illustrates how data analysis techniques can facilitate more comprehensive, user-friendly data visualisation tasks; how data visualisation methods may make data analysis a more meaningful and biologically relevant process; and how the proper statistical tools can be applied to the overabundance of data in genomic studies, where spurious associations often occur. The book describes how this synergy between analysis and visualisation may support integrative approaches to functional genomics.
The book will be of interest to all bioinformaticians, from students to researchers, as well as to many scientists working in genomics, proteomics, systems biology and related areas.
List of Contributors.
SECTION I: INTRODUCTION – DATA DIVERSITY AND INTEGRATION.
1. Integrative Data Analysis and Visualization: Introduction to Critical Problems, Goals and Challenges (Francisco Azuaje and Joaquín Dopazo).
1.1 Data Analysis and Visualization: An Integrative Approach.
1.2 Critical Design and Implementation Factors.
1.3 Overview of Contributions.
2. Biological Databases: Infrastructure, Content and Integration (Allyson L. Williams, Paul J. Kersey, Manuela Pruess and Rolf Apweiler).
2.2 Data Integration.
2.3 Review of Molecular Biology Databases.
3. Data and Predictive Model Integration: an Overview of Key Concepts, Problems and Solutions (Francisco Azuaje, Joaquín Dopazo and Haiying Wang).
3.1 Integrative Data Analysis and Visualization: Motivation and Approaches.
3.2 Integrating Informational Views and Complexity for Understanding Function.
3.3 Integrating Data Analysis Techniques for Supporting Functional Analysis.
3.4 Final Remarks.
SECTION II: INTEGRATIVE DATA MINING AND VISUALIZATION – EMPHASIS ON COMBINATION OF MULTIPLE DATA TYPES.
4. Applications of Text Mining in Molecular Biology, from Name Recognition to Protein Interaction Maps (Martin Krallinger and Alfonso Valencia).
4.2 Introduction to Text Mining and NLP.
4.3 Databases and Resources for Biomedical Text Mining.
4.4 Text Mining and Protein–Protein Interactions.
4.5 Other Text–Mining Applications in Genomics.
4.6 The Future of NLP in Biomedicine.
5. Protein Interaction Prediction by Integrating Genomic Features and Protein Interaction Network Analysis (Long J. Lu, Yu Xia, Haiyuan Yu, Alexander Rives, Haoxin Lu, Falk Schubert and Mark Gerstein).
5.2 Genomic Features in Protein Interaction Predictions.
5.3 Machine Learning on Protein–Protein Interactions.
5.4 The Missing Value Problem.
5.5 Network Analysis of Protein Interactions.
6. Integration of Genomic and Phenotypic Data (Amanda Clare).
6.2 Forward Genetics and QTL Analysis.
6.3 Reverse Genetics.
6.4 Prediction of Phenotype from Other Sources of Data.
6.5 Integrating Phenotype Data with Systems Biology.
6.6 Integration of Phenotype Data in Databases.
7. Ontologies and Functional Genomics (Fátima Al–Shahrour and Joaquín Dopazo).
7.1 Information Mining in Genome–Wide Functional Analysis.
7.2 Sources of Information: Free Text Versus Curated Repositories.
7.3 Bio–Ontologies and the Gene Ontology in Functional Genomics.
7.4 Using GO to Translate the Results of Functional Genomic Experiments into Biological Knowledge.
7.5 Statistical Approaches to Test Significant Biological Differences.
7.6 Using FatiGO to Find Significant Functional Associations in Clusters of Genes.
7.7 Other Tools.
7.8 Examples of Functional Analysis of Clusters of Genes.
7.9 Future Prospects.
8. The C. elegans Interactome: its Generation and Visualization (Alban Chesneau and Claude Sardet).
8.2 The ORFeome: the First Step Toward the Interactome of C. elegans.
8.3 Large–Scale High–Throughput Yeast Two–Hybrid Screens to Map the C. elegans Protein–Protein Interaction (Interactome) Network: Technical Aspects.
8.4 Visualization and Topology of Protein–Protein Interaction Networks.
8.5 Cross–Talk Between the C. elegans Interactome and other Large–Scale Genomics and Post–Genomics Data Sets.
8.6 Conclusion: From Interactions to Therapies.
SECTION III: INTEGRATIVE DATA MINING AND VISUALIZATION – EMPHASIS ON COMBINATION OF MULTIPLE PREDICTION MODELS AND METHODS.
9. Integrated Approaches for Bioinformatic Data Analysis and Visualization – Challenges, Opportunities and New Solutions (Steve R. Pettifer, James R. Sinnott and Teresa K. Attwood).
9.2 Sequence Analysis Methods and Databases.
9.3 A View Through a Portal.
9.4 Problems with Monolithic Approaches: One Size Does Not Fit All.
9.5 A Toolkit View.
9.6 Challenges and Opportunities.
9.7 Extending the Desktop Metaphor.
10. Advances in Cluster Analysis of Microarray Data (Qizheng Sheng, Yves Moreau, Frank De Smet, Kathleen Marchal and Bart De Moor).
10.2 Some Preliminaries.
10.3 Hierarchical Clustering.
10.4 k–Means Clustering.
10.5 Self–Organizing Maps.
10.6 A Wish List for Clustering Algorithms.
10.7 The Self–Organizing Tree Algorithm.
10.8 Quality–Based Clustering Algorithms.
10.9 Mixture Models.
10.10 Biclustering Algorithms.
10.11 Assessing Cluster Quality.
10.12 Open Horizons.
11. Unsupervised Machine Learning to Support Functional Characterization of Genes: Emphasis on Cluster Description and Class Discovery (Olga G. Troyanskaya).
11.1 Functional Genomics: Goals and Data Sources.
11.2 Functional Annotation by Unsupervised Analysis of Gene Expression Microarray Data.
11.3 Integration of Diverse Functional Data for Accurate Gene Function Prediction.
11.4 MAGIC – General Probabilistic Integration of Diverse Genomic Data.
12. Supervised Methods with Genomic Data: a Review and Cautionary View (Ramón Díaz–Uriarte).
12.1 Chapter Objectives.
12.2 Class Prediction and Class Comparison.
12.3 Class Comparison: Finding/Ranking Differentially Expressed Genes.
12.4 Class Prediction and Prognostic Prediction.
12.5 ROC Curves for Evaluating Predictors and Differential Expression.
12.6 Caveats and Admonitions.
12.7 Final Note: Source Code Should be Available.
13. A Guide to the Literature on Inferring Genetic Networks by Probabilistic Graphical Models (Pedro Larrañaga, Iñaki Inza and Jose L. Flores).
13.2 Genetic Networks.
13.3 Probabilistic Graphical Models.
13.4 Inferring Genetic Networks by Means of Probabilistic Graphical Models.
14. Integrative Models for the Prediction and Understanding of Protein Structure Patterns (Inge Jonassen).
14.2 Structure Prediction.
14.3 Classifications of Structures.
14.4 Comparing Protein Structures.
14.5 Methods for the Discovery of Structure Motifs.
14.6 Discussion and Conclusions.