Supervised Machine Learning: The Summary Way

Jennifer Boyles
2 min read · Dec 23, 2020

These past couple of weeks, our DSI cohort has been learning the intricacies of supervised machine learning, specifically the linear regression, logistic regression, and k-nearest neighbors algorithms. There are several kinds of machine learning: supervised, unsupervised, semi-supervised, and reinforcement.

In this post, I will be focusing on supervised machine learning. What exactly is supervised machine learning? Supervised machine learning occurs when a ‘supervisor’ helps train the algorithm. This supervisor is a training dataset in which the ground truth, or true values, are known. After training the algorithm on this dataset, the learned parameters (for linear and logistic regression, a coefficient for each predictor variable) are stored on the fitted model object as attributes. With these parameters, the model has what it needs to make predictions on other datasets, most importantly the holdout or testing dataset. The model is then evaluated on how well it performs on the testing data, which it did not see during training.
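To make the train-then-evaluate workflow above concrete, here is a minimal sketch in plain Python. It is not the code from our cohort: the class name `SimpleLinearRegression`, the synthetic data, and the 80/20 split are all assumptions made for illustration. The attribute names `coef_` and `intercept_` simply mirror the naming convention scikit-learn uses for fitted linear models.

```python
import random


class SimpleLinearRegression:
    """Hypothetical one-predictor least-squares model, for illustration only."""

    def fit(self, x, y):
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        # Closed-form least-squares estimates for slope and intercept.
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        # The learned parameters are stored on the object as attributes.
        self.coef_ = sxy / sxx
        self.intercept_ = mean_y - self.coef_ * mean_x
        return self

    def predict(self, x):
        return [self.intercept_ + self.coef_ * xi for xi in x]


# Synthetic data (an assumption): y = 3x + 2 plus Gaussian noise.
random.seed(42)
x_all = [i / 10 for i in range(100)]
y_all = [3.0 * xi + 2.0 + random.gauss(0, 0.5) for xi in x_all]

# Shuffle the row indices, then hold out 20% of the rows for testing.
indices = list(range(len(x_all)))
random.shuffle(indices)
train_idx, test_idx = indices[:80], indices[80:]
x_train = [x_all[i] for i in train_idx]
y_train = [y_all[i] for i in train_idx]
x_test = [x_all[i] for i in test_idx]
y_test = [y_all[i] for i in test_idx]

# The training set acts as the 'supervisor': it supplies the true y values.
model = SimpleLinearRegression().fit(x_train, y_train)

# Evaluate on the holdout rows the model never saw during training.
predictions = model.predict(x_test)
test_mse = sum((p - t) ** 2 for p, t in zip(predictions, y_test)) / len(y_test)

print("coefficient:", model.coef_)
print("intercept:", model.intercept_)
print("test MSE:", test_mse)
```

The fitted coefficient and intercept should land close to the 3 and 2 used to generate the data, and the test error should be close to the noise level, which suggests the model generalizes rather than memorizes.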

The two main types of supervised machine learning problems are classification and regression. Classification models are used to predict discrete variables: the model assigns each input to a particular class. Classification models are commonly scored on accuracy, the share of inputs they classify correctly. Regression models are used to predict continuous variables: the model outputs a continuous value from the input variables. Regression models can be evaluated with several different metrics, such as mean squared error, mean absolute error, root mean squared error, and R² score.
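The metrics named above can be written out in a few lines each. The sketch below uses plain Python and made-up labels and values (all assumptions for illustration); libraries such as scikit-learn ship their own versions of these metrics, so in practice you would rarely write them by hand.

```python
import math


def accuracy(y_true, y_pred):
    """Share of predictions that match the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Classification: discrete labels (made-up example values).
y_true_class = ["cat", "dog", "dog", "cat", "dog"]
y_pred_class = ["cat", "dog", "cat", "cat", "dog"]
acc = accuracy(y_true_class, y_pred_class)

# Regression: continuous targets (made-up example values).
y_true_reg = [3.0, 5.0, 7.0, 9.0]
y_pred_reg = [2.5, 5.0, 7.5, 10.0]
mse = mean_squared_error(y_true_reg, y_pred_reg)
mae = mean_absolute_error(y_true_reg, y_pred_reg)
rmse = math.sqrt(mse)  # root mean squared error, in the units of the target
r2 = r2_score(y_true_reg, y_pred_reg)

print("accuracy:", acc)  # 4 of 5 correct -> 0.8
print("MSE:", mse)       # 0.375
print("MAE:", mae)       # 0.5
print("RMSE:", rmse)
print("R2:", r2)
```

Lower is better for the three error metrics, while accuracy and R² are better the closer they are to 1.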

In summary, supervised machine learning builds a predictive model by training an algorithm on data that contains ground truth values, then judges that model by how well it predicts data it has not seen.

Data Scientist. Biology/Chemistry Nerd. Find me on LinkedIn — Jennifer Boyles!