A First Course in Machine Learning, 2nd Edition – Solutions Manual, by Simon Rogers. ISBN(s): 9781498738569, 1498738567
Product details:
- ISBN 10: 1498738567
- ISBN 13: 9781498738569
- Author: Simon Rogers
A First Course in Machine Learning
Table of contents:
SECTION I Basic Topics
CHAPTER 1 ■ Linear Modelling: A Least Squares Approach
1.1 LINEAR MODELLING
1.1.1 Defining the model
1.1.2 Modelling assumptions
1.1.3 Defining a good model
1.1.4 The least squares solution – a worked example
1.1.5 Worked example
1.1.6 Least squares fit to the Olympic data
1.1.7 Summary
1.2 MAKING PREDICTIONS
1.2.1 A second Olympic dataset
1.2.2 Summary
1.3 VECTOR/MATRIX NOTATION
1.3.1 Example
1.3.2 Numerical example
1.3.3 Making predictions
1.3.4 Summary
1.4 NON-LINEAR RESPONSE FROM A LINEAR MODEL
1.5 GENERALISATION AND OVER-FITTING
1.5.1 Validation data
1.5.2 Cross-validation
1.5.3 Computational scaling of K-fold cross-validation
1.6 REGULARISED LEAST SQUARES
1.7 EXERCISES
1.8 FURTHER READING
CHAPTER 2 ■ Linear Modelling: A Maximum Likelihood Approach
2.1 ERRORS AS NOISE
2.1.1 Thinking generatively
2.2 RANDOM VARIABLES AND PROBABILITY
2.2.1 Random variables
2.2.2 Probability and distributions
2.2.3 Adding probabilities
2.2.4 Conditional probabilities
2.2.5 Joint probabilities
2.2.6 Marginalisation
2.2.7 Aside – Bayes’ rule
2.2.8 Expectations
2.3 POPULAR DISCRETE DISTRIBUTIONS
2.3.1 Bernoulli distribution
2.3.2 Binomial distribution
2.3.3 Multinomial distribution
2.4 CONTINUOUS RANDOM VARIABLES – DENSITY FUNCTIONS
2.5 POPULAR CONTINUOUS DENSITY FUNCTIONS
2.5.1 The uniform density function
2.5.2 The beta density function
2.5.3 The Gaussian density function
2.5.4 Multivariate Gaussian
2.6 SUMMARY
2.7 THINKING GENERATIVELY… CONTINUED
2.8 LIKELIHOOD
2.8.1 Dataset likelihood
2.8.2 Maximum likelihood
2.8.3 Characteristics of the maximum likelihood solution
2.8.4 Maximum likelihood favours complex models
2.9 THE BIAS-VARIANCE TRADE-OFF
2.9.1 Summary
2.10 EFFECT OF NOISE ON PARAMETER ESTIMATES
2.10.1 Uncertainty in estimates
2.10.2 Comparison with empirical values
2.10.3 Variability in model parameters – Olympic data
2.11 VARIABILITY IN PREDICTIONS
2.11.1 Predictive variability – an example
2.11.2 Expected values of the estimators
2.12 CHAPTER SUMMARY
2.13 EXERCISES
2.14 FURTHER READING
CHAPTER 3 ■ The Bayesian Approach to Machine Learning
3.1 A COIN GAME
3.1.1 Counting heads
3.1.2 The Bayesian way
3.2 THE EXACT POSTERIOR
3.3 THE THREE SCENARIOS
3.3.1 No prior knowledge
3.3.2 The fair coin scenario
3.3.3 A biased coin
3.3.4 The three scenarios – a summary
3.3.5 Adding more data
3.4 MARGINAL LIKELIHOODS
3.4.1 Model comparison with the marginal likelihood
3.5 HYPERPARAMETERS
3.6 GRAPHICAL MODELS
3.7 SUMMARY
3.8 A BAYESIAN TREATMENT OF THE OLYMPIC 100m DATA
3.8.1 The model
3.8.2 The likelihood
3.8.3 The prior
3.8.4 The posterior
3.8.5 A first-order polynomial
3.8.6 Making predictions
3.9 MARGINAL LIKELIHOOD FOR POLYNOMIAL MODEL ORDER SELECTION
3.10 CHAPTER SUMMARY
3.11 EXERCISES
3.12 FURTHER READING
CHAPTER 4 ■ Bayesian Inference
4.1 NON-CONJUGATE MODELS
4.2 BINARY RESPONSES
4.2.1 A model for binary responses
4.3 A POINT ESTIMATE – THE MAP SOLUTION
4.4 THE LAPLACE APPROXIMATION
4.4.1 Laplace approximation example: Approximating a gamma density
4.4.2 Laplace approximation for the binary response model
4.5 SAMPLING TECHNIQUES
4.5.1 Playing darts
4.5.2 The Metropolis–Hastings algorithm
4.5.3 The art of sampling
4.6 CHAPTER SUMMARY
4.7 EXERCISES
4.8 FURTHER READING
CHAPTER 5 ■ Classification
5.1 THE GENERAL PROBLEM
5.2 PROBABILISTIC CLASSIFIERS
5.2.1 The Bayes classifier
5.2.1.1 Likelihood – class-conditional distributions
5.2.1.2 Prior class distribution
5.2.1.3 Example – Gaussian class-conditionals
5.2.1.4 Making predictions
5.2.1.5 The naive-Bayes assumption
5.2.1.6 Example – classifying text
5.2.1.7 Smoothing
5.2.2 Logistic regression
5.2.2.1 Motivation
5.2.2.2 Non-linear decision functions
5.2.2.3 Non-parametric models – the Gaussian process
5.3 NON-PROBABILISTIC CLASSIFIERS
5.3.1 K-nearest neighbours
5.3.1.1 Choosing K
5.3.2 Support vector machines and other kernel methods
5.3.2.1 The margin
5.3.2.2 Maximising the margin
5.3.2.3 Making predictions
5.3.2.4 Support vectors
5.3.2.5 Soft margins
5.3.2.6 Kernels
5.3.3 Summary
5.4 ASSESSING CLASSIFICATION PERFORMANCE
5.4.1 Accuracy – 0/1 loss
5.4.2 Sensitivity and specificity
5.4.3 The area under the ROC curve
5.4.4 Confusion matrices
5.5 DISCRIMINATIVE AND GENERATIVE CLASSIFIERS
5.6 CHAPTER SUMMARY
5.7 EXERCISES
5.8 FURTHER READING
CHAPTER 6 ■ Clustering
6.1 THE GENERAL PROBLEM
6.2 K-MEANS CLUSTERING
6.2.1 Choosing the number of clusters
6.2.2 Where K-means fails
6.2.3 Kernelised K-means
6.2.4 Summary
6.3 MIXTURE MODELS
6.3.1 A generative process
6.3.2 Mixture model likelihood
6.3.3 The EM algorithm
6.3.3.1 Updating π_k
6.3.3.2 Updating μ_k
6.3.3.3 Updating Σ_k
6.3.3.4 Updating q_{nk}
6.3.3.5 Some intuition
6.3.4 Example
6.3.5 EM finds local optima
6.3.6 Choosing the number of components
6.3.7 Other forms of mixture component
6.3.8 MAP estimates with EM
6.3.9 Bayesian mixture models
6.4 CHAPTER SUMMARY
6.5 EXERCISES
6.6 FURTHER READING
CHAPTER 7 ■ Principal Components Analysis and Latent Variable Models
7.1 THE GENERAL PROBLEM
7.1.1 Variance as a proxy for interest
7.2 PRINCIPAL COMPONENTS ANALYSIS
7.2.1 Choosing D
7.2.2 Limitations of PCA
7.3 LATENT VARIABLE MODELS
7.3.1 Mixture models as latent variable models
7.3.2 Summary
7.4 VARIATIONAL BAYES
7.4.1 Choosing Q(θ)
7.4.2 Optimising the bound
7.5 A PROBABILISTIC MODEL FOR PCA
7.5.1 Q_τ(τ)
7.5.2 Q_{x_n}(x_n)
7.5.3 Q_{w_m}(w_m)
7.5.4 The required expectations
7.5.5 The algorithm
7.5.6 An example
7.6 MISSING VALUES
7.6.1 Missing values as latent variables
7.6.2 Predicting missing values
7.7 NON-REAL-VALUED DATA
7.7.1 Probit PPCA
7.7.2 Visualising parliamentary data
7.7.2.1 Aside – relationship to classification
7.8 CHAPTER SUMMARY
7.9 EXERCISES
7.10 FURTHER READING
SECTION II Advanced Topics
CHAPTER 8 ■ Gaussian Processes
8.1 PROLOGUE – NON-PARAMETRIC MODELS
8.2 GAUSSIAN PROCESS REGRESSION
8.2.1 The Gaussian process prior
8.2.2 Noise-free regression
8.2.3 Noisy regression
8.2.4 Summary
8.2.5 Noisy regression – an alternative route
8.2.6 Alternative covariance functions
8.2.6.1 Linear
8.2.6.2 Polynomial
8.2.6.3 Neural network
8.2.7 ARD
8.2.8 Composite covariance functions
8.2.9 Summary
8.3 GAUSSIAN PROCESS CLASSIFICATION
8.3.1 A classification likelihood
8.3.2 A classification roadmap
8.3.3 The point estimate approximation
8.3.4 Propagating uncertainty through the sigmoid
8.3.5 The Laplace approximation
8.3.6 Summary
8.4 HYPERPARAMETER OPTIMISATION
8.5 EXTENSIONS
8.5.1 Non-zero mean
8.5.2 Multiclass classification
8.5.3 Other likelihood functions and models
8.5.4 Other inference schemes
8.6 CHAPTER SUMMARY
8.7 EXERCISES
8.8 FURTHER READING
CHAPTER 9 ■ Markov Chain Monte Carlo Sampling
9.1 GIBBS SAMPLING
9.2 EXAMPLE: GIBBS SAMPLING FOR GP CLASSIFICATION
9.2.1 Conditional densities for GP classification via Gibbs sampling
9.2.2 Summary
9.3 WHY DOES MCMC WORK?
9.4 SOME SAMPLING PROBLEMS AND SOLUTIONS
9.4.1 Burn-in and convergence
9.4.2 Autocorrelation
9.4.3 Summary
9.5 ADVANCED SAMPLING TECHNIQUES
9.5.1 Adaptive proposals and Hamiltonian Monte Carlo
9.5.2 Approximate Bayesian computation
9.5.3 Population MCMC and temperature schedules
9.5.4 Sequential Monte Carlo
9.6 CHAPTER SUMMARY
9.7 EXERCISES
9.8 FURTHER READING
CHAPTER 10 ■ Advanced Mixture Modelling
10.1 A GIBBS SAMPLER FOR MIXTURE MODELS
10.2 COLLAPSED GIBBS SAMPLING
10.3 AN INFINITE MIXTURE MODEL
10.3.1 The Chinese restaurant process
10.3.2 Inference in the infinite mixture model
10.3.3 Summary
10.4 DIRICHLET PROCESSES
10.4.1 Hierarchical Dirichlet processes
10.4.2 Summary
10.5 BEYOND STANDARD MIXTURES – TOPIC MODELS
10.6 CHAPTER SUMMARY
10.7 EXERCISES
10.8 FURTHER READING