Last updated: 2021-06-24 14:23:51
Cover Page
Title Page
Copyright and Credits
Mastering Machine Learning on AWS
Dedication
About Packt
Why subscribe?
Packt.com
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Machine Learning on AWS
Getting Started with Machine Learning for AWS
How AWS empowers data scientists
Using AWS tools for ML
Identifying candidate problems that can be solved using ML
The ML project life cycle
Data gathering
Evaluation metrics
Algorithm selection
Deploying models
Summary
Exercises
Section 2: Implementing Machine Learning Algorithms at Scale on AWS
Classifying Twitter Feeds with Naive Bayes
Classification algorithms
Feature types
Nominal features
Ordinal features
Continuous features
Naive Bayes classifier
Bayes' theorem
Posterior
Likelihood
Prior probability
Evidence
How the Naive Bayes algorithm works
Classifying text with language models
Collecting the tweets
Preparing the data
Building a Naive Bayes model through SageMaker notebooks
Naive Bayes model on SageMaker notebooks using Apache Spark
Using SageMaker's BlazingText built-in ML service
Naive Bayes – pros and cons
Predicting House Value with Regression Algorithms
Predicting the price of houses
Understanding linear regression
Linear least squares estimation
Maximum likelihood estimation
Gradient descent
Evaluating regression models
Mean absolute error
Mean squared error
Root mean squared error
R-squared
Implementing linear regression through scikit-learn
Implementing linear regression through Apache Spark
Implementing linear regression through SageMaker's Linear Learner
Understanding logistic regression
Logistic regression in Spark
Pros and cons of linear models
Predicting User Behavior with Tree-Based Methods
Understanding decision trees
Recursive splitting
Types of decision trees
Cost functions
Gini impurity
Information gain
Criteria to stop splitting trees
Understanding random forest algorithms
Understanding gradient-boosting algorithms
Predicting clicks on log streams
Introduction to Elastic MapReduce (EMR)
Training with Apache Spark on EMR
Getting the data
Categorical encoding
One-hot encoding
Training a model
Evaluating our model
Area under the ROC curve
Area under the precision-recall curve
Training tree ensembles on EMR
Training gradient-boosted trees with the SageMaker services
Training with SageMaker XGBoost
Applying and evaluating the model