Data Mining Techniques for the Life Sciences, 3rd Edition, by Oliviero Carugo and Frank Eisenhaber (ISBN 9781071620946, 1071620940)
Product details:
ISBN 10: 1071620940
ISBN 13: 9781071620946
Authors: Oliviero Carugo, Frank Eisenhaber
Data Mining Techniques for the Life Sciences 3rd Edition Table of contents:
Part I: Databases
Chapter 1: Data Mining, Quality and Management in the Life Sciences
1 Introduction
2 Data Resources at EMBL-EBI
2.1 Data Deposition Databases
2.2 Knowledgebases
2.3 Biology-Driven Portals
3 Data Mining, Quality, and Management
3.1 From ArrayExpress to BioStudies: Life Cycle of a Data Resource
4 Coordination of Data Resources Across Europe
5 The Value of Managed Data in the COVID-19 Pandemic
5.1 European COVID-19 Data Platform: An Ecosystem for Data Sharing
6 Conclusions
References
Chapter 2: IMEx Databases: Displaying Molecular Interactions into a Single, Standards-Compliant Dataset
1 Introduction
2 Public Resources Containing Molecular Interaction Data
3 The IMEx Databases
4 The IMEx Curation Process
5 The Molecular Interaction Standards
6 IMEx Data Content and Dataset
7 IMEx Data Structure
8 Accessing and Visualizing IMEx Data
9 Future Plans
References
Chapter 3: Three-dimensional Structure Databases of Biological Macromolecules
1 Introduction and Summary
Box 1 Structural Databases Support Our Understanding of the Molecular Basis of Health and Disease
2 Archival Databases
Box 2 Growth of the Protein Data Bank
2.1 Entries
2.1.1 The PDB Files
2.2 Deposition of a PDB Entry
2.3 Validation
2.3.1 PDBREPORT
2.3.2 PDBREDO
3 Structure of the Data, and the Surrounding Information Ecosystem
3.1 Selection and Retrieval of Entries
3.2 Software for Analysis
3.3 Formats of Individual Entries
4 Structure Classification Databases
4.1 The CATH Database
4.1.1 Hierarchical Groupings in CATH
4.2 SCOP and Related Databases (SCOPe, SCOP2)
4.2.1 Structural Classification Strategies
4.2.2 The Original SCOP Had a Tree-Like Classification
4.2.3 SCOP2
4.2.4 Classification in SCOP2
4.3 The ECOD Database
5 Other General Macromolecular Structure Databases
5.1 “Boutique” Databases
6 Sequences and Structures
6.1 Comparative Modeling
6.2 A Priori Prediction
7 Protein Interactions
8 Expected Developments
9 Conclusions
References
Part II: Prediction Methods
Chapter 4: Predicting Protein Conformational Disorder and Disordered Binding Sites
1 Introduction
2 Methods
2.1 Searching Databases Dedicated to IDPs
2.1.1 Database of Disordered Protein Prediction (D2P2)
2.1.2 MobiDB
2.1.3 DisProt
2.1.4 IDEAL
2.1.5 DescribePROT
2.1.6 PED (Protein Ensemble Database)
2.1.7 PDB (Protein Data Bank)
2.1.8 Other Databases
2.2 Running Disorder Predictions
2.2.1 Individual Disorder Predictors
Predictors Trained on Datasets of Disordered Proteins
Predictors that Have Not Been Trained on Disordered Proteins
Stand-Alone Disorder Predictors
Binary Disorder Predictors
Non-conventional Disorder Predictors
2.2.2 Metapredictors
2.2.3 Combining Predictors and Experimental Data
2.3 Identifying Disordered Regions Involved in Binding to Partners
2.4 General Procedure for Disorder Prediction
References
Chapter 5: Profiles of Natural and Designed Protein-Like Sequences Effectively Bridge Protein Sequence Gaps: Implications in D…
1 Introduction
2 Materials and Methods
2.1 Database Generation from “Natural” and “Designed” Sequence Profiles
2.2 Evaluation of Profile Database Searches Using Protein Families Associated with a Structural Fold
2.3 Identifying Structural Homologs for PFAM Families Without an Available Structure
2.4 Search Method and Metrics Used for Evaluation
2.5 Case Study: Functional Relationships Through Fold Associations
3 Results
3.1 The Profile Database of Natural and Designed Sequences
3.2 Evaluation of Approach Using Fold Associated Families
3.3 Fold Association for the Families with No Available Structure
3.4 Functional Studies Through Fold Associations in NrichD
3.5 Case Study: Functional Basis to the Fold Association for DUF4457
3.6 Case Study: Fold Association for DUF3788 and Functional Investigation
4 Conclusion
References
Chapter 6: Turning Failures into Applications: The Problem of Protein ΔΔG Prediction
1 Introduction
2 ΔG of the Protein Folding
3 ΔΔG of the Protein Variants
4 Databases of Protein Thermodynamic Data
5 Methods for Predicting ΔΔG
6 Critical Assessment of the Predictive Tools
7 Benchmarking of ΔΔG Prediction Methods
8 Benchmarking of Machine Learning and Knowledge-Based Methods on Multiple Variations
9 Conclusion and Perspectives
References
Chapter 7: Dissecting the Genome for Drug Response Prediction
1 Introduction
2 Bioinformatic Approaches to Improve the Drug Sensitivity Prediction in Cancer Cell Lines and Real Patients
2.1 Considering the Chemical Properties of Drugs Improves Drug Response Prediction
2.2 A Priori Feature Selection to Improve Drug Sensitivity Prediction Performances
2.3 Prediction of Clinical Drug Response of Cancer Patients Using In Vitro Experiments on Preclinical Cancer Cell Lines
3 Conclusion
References
Chapter 8: Prediction of the Effect of pH on the Aggregation and Conditional Folding of Intrinsically Disordered Proteins with…
1 Introduction
2 Materials
2.1 Modeling pH-Dependent Aggregation of Intrinsically Disordered Proteins with SolupHred
2.1.1 Algorithm Rationale
2.1.2 Pipeline
2.2 Predicting pH-Dependent Order-Disorder Transitions in Intrinsically Disordered Proteins with DispHred
2.2.1 Algorithm Rationale
2.2.2 Pipeline
3 Methods
3.1 SolupHred Web Server
3.1.1 Frontpage and Running Options
3.1.2 Output and Evaluation
3.2 DispHred Web Server
3.2.1 Frontpage and Running Options
3.2.2 Output and Evaluation
4 Concluding Remark
5 Notes
References
Chapter 9: Extracting the Dynamic Motion of Proteins Using Normal Mode Analysis
1 Introduction
2 Methods
2.1 On-line NMA
2.2 Projection of the Modes from One Structure onto Another Off-Line
2.3 Conventional All-Atom NMA
3 Notes
References
Part III: Data Quality
Chapter 10: Pre- and Post-publication Verification for Reproducible Data Mining in Macromolecular Crystallography
1 Introduction
2 Brief Historical Overview of Traditional Publishing Methods and Data Validation Tools
3 Openness of Articles and Their Data
3.1 What Can Be Done in Pre-publication Peer Review?
3.2 Post-publication Peer Review as a Trend
3.3 Preprints
4 Is Macromolecular Crystallography Ready for Artificial Intelligence (AI) and Machine Learning (ML)?
5 Data Mining: Which Data Depositions to Select for Further Study and Application?
5.1 General Aspects
5.2 Three Specific Examples
6 Conclusions Thus Far
7 Unanswered Questions
8 Forward Look
9 How Will Central Facilities' Raw Data Catalogs Be Mined?
10 Final Remarks
References
Chapter 11: Soft Statistical Mechanics for Biology
1 Introduction
2 The Middle Way and the Network Formalism
3 Multidimensional Statistics
4 Attractors and Bifurcations
5 Conclusion
References
Chapter 12: Uses and Abuses of the Atomic Displacement Parameters in Structural Biology
1 Introduction
2 Atom Flexibility in Crystals
3 B-Factors
4 Are B-Factors Portable?
5 B-Factor Validation
6 Flexibility Prediction
6.1 Protein Flexibility
6.2 Nucleic Acid Flexibility
7 B-Factor and Protein Engineering
8 Estimated Standard Errors of Atomic Positions
9 Anisotropic B Factors
10 B-Factors and Protein Sequence
11 The Issue of Occupancy
11.1 Radiation Damage
11.2 Intrinsically Disordered Regions
11.3 When B-Factors Are Too Large
12 Rigid Molecules with Large B-Factors
13 B-Factors Outlooks
References
Chapter 13: Optimizing the Parametrization of Homologue Classification in the Pan-Genome Computation for a Bacterial Species: …
1 Introduction
1.1 About the Pan-Genome Structure
1.2 Pan-Genome Analysis Software
2 About the Pan-Genome Development and the Computational Workflow
2.1 Preprocessing and Quality Control Workflow
2.2 Orthologous Analysis
2.3 Pan-Genome Development
2.4 Downstream Analysis and Application of Pan-Genome
3 Case Study: Determining the Pan-Genome of Streptococcus pyogenes
3.1 Effect of Identity and Coverage on the Pan-Genome Size
3.2 Similarity Evaluation Across Different Methods
3.3 Pan-Genome Size of Streptococcus pyogenes
3.4 Comparison with Existing Study on Streptococcus pyogenes
3.5 Protein Domain Architecture of Core Genome
3.6 Biological Interpretation of Core Genome
4 Future Application and Direction
5 Conclusions
References
Part IV: Big Data
Chapter 14: Computational Pipeline for Rational Drug Combination Screening in Patient-Derived Cells
1 Introduction
2 Materials
2.1 Isolation of Peripheral Blood Mononuclear Cells (PBMCs) from CLL Patient Samples (See Note 1)
2.2 Co-culture of CLL Cells with APRIL/BAFF/CD40L+ Fibroblasts (Day 1) (See Note 3)
2.3 Drug Sensitivity Testing in a 384-Well Plate Format (Day 2)
2.4 Measurement of Cell Viability with CellTiter-Glo (Day 5)
3 Methods
3.1 DECREASE Model for Drug Combination Dose-Response Matrix Prediction
3.2 Data Preparation for DECREASE
3.3 Prediction of Complete Drug-Combination Matrices with DECREASE (decrease.fimm.fi)
3.4 SynergyFinder for the Identification of Most Synergistic Drug Combinations
3.5 Data Preparation and User Options for SynergyFinder
3.6 Synergy Scoring Using SynergyFinder v2.0 (https://synergyfinder.fimm.fi)
3.7 SynToxProfiler for the Identification of Safe and Effective Combinations
3.8 Data Preparation and User Options for SynToxProfiler (https://syntoxprofiler.fimm.fi/)
3.9 Summary Results from SynToxProfiler
3.10 STE Score Scatterplot from SynToxProfiler
3.11 Combined 3D Landscape from SynToxProfiler
4 Notes
References
Chapter 15: Deep Mining from Omics Data
1 Introduction
2 Omics Data
2.1 Omics Data Resources
3 Deep Mining
3.1 Deep Learning
3.1.1 Auto-Encoder
3.1.2 Restricted Boltzmann Machines
3.1.3 Convolutional Neural Networks
3.1.4 Recurrent Neural Networks
4 Deep Feature Mining Techniques
4.1 Hidden Layer Visualization and Interpretation
4.2 Feature Importance and Impact Evaluation
4.3 Output Layer Gradient Analysis
5 Discussion
References
Index