# Back To The Basics

I’ve decided to compile the notes I’ve made over the course of my ML journey as a series of blog posts here on my website. I’ve revised machine learning, deep learning and natural language processing concepts a couple of times since beginning my ML journey almost 3 years ago, and found that each time I revised, I added a deeper level of understanding about these concepts.

You can view other topics in this series by clicking on the **ML NOTES** category in the article header above.

## Disclaimer

I’ve read through multiple sources: articles, documentation pages, research papers and textbooks. I was looking to maximise my understanding of the concepts and, previously, never intended to share these notes with the world, so I did not do a good job of documenting sources for later reference.

I’ll leave references to source materials if I have them saved. Please note that I’m not claiming sole authorship of these blog posts; these are just my personal notes and I’m sharing them here in the hopes that they’ll be helpful to you in your own ML journey.

Take these articles as a starting point to comprehend the concepts. If you spot any mistakes or errors in these articles or have suggestions for improvement, please feel free to share your thoughts with me through my LinkedIn.

We’ll take a look at the humble logistic regression model in this post.

# The Logistic Regression model

Logistic Regression is a simple algorithm for binary classification. Given a data sample, the model outputs a probability score between 0 and 1, which can be interpreted as the likelihood of the sample belonging to one of the two classes.

The logistic regression model computes the linear regression equation and passes the result through a sigmoid activation function, which squashes the regression output to the range (0, 1).

## Equations

The two calculations happening to provide the binary classification outcome are:

**1. Net Input function**, which is the linear regression equation, with a coefficient \(w_i\) for each feature \(x_i\) and a bias (intercept) term \(w_0\):

\[z = w_0 + w_1 x_1 + w_2 x_2 + \ldots + w_m x_m\]

\[z = w_0 + \sum_{i=1}^{m} w_i x_i\]

**2. Sigmoid Activation function**, which squashes the \(z\) value into the range (0, 1):

\[\Phi(z) = \frac{1}{1 + e^{-z}}\]
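Putting the two equations together, here is a minimal sketch in NumPy (the function names and example values are my own, chosen just for illustration):

```python
import numpy as np

def net_input(w0, w, x):
    # z = w_0 + sum_i w_i * x_i  (linear regression equation)
    return w0 + np.dot(w, x)

def sigmoid(z):
    # phi(z) = 1 / (1 + e^{-z}), maps any real z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Toy example: two features with hand-picked weights
w0 = 0.5
w = np.array([1.0, -2.0])
x = np.array([3.0, 1.0])

z = net_input(w0, w, x)   # 0.5 + (1.0 * 3.0) + (-2.0 * 1.0) = 1.5
prob = sigmoid(z)         # a probability in (0, 1)
```

The output `prob` is the model’s estimated probability that this sample belongs to the positive class.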

Logistic Regression is a linear, additive model as the output always depends on the sum of the inputs and parameters.

As the output does not depend on products (or quotients, exponents, etc.) of the features, the model does not account for any non-linear interactions between the features themselves.
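A standard workaround (not covered above, but worth knowing) is to engineer an interaction term as an extra feature yourself; the model remains additive in its inputs while still seeing the product of two features:

```python
import numpy as np

# The model is linear in its inputs: z = w0 + w1*x1 + w2*x2.
# To capture an interaction between x1 and x2, add their product
# as an engineered third feature. Weights here are illustrative.
x1, x2 = 2.0, 3.0
features = np.array([x1, x2, x1 * x2])  # last entry is the interaction term
weights = np.array([0.5, -0.25, 0.1])
w0 = 1.0

# Still an additive model: z is a weighted sum of the (engineered) features
z = w0 + np.dot(weights, features)
```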

## Some guidelines to keep in mind

Below are some points to keep in mind when deciding whether to use a logistic regression model:

**1. Linear relationship**: When the relationship between the independent features and the dependent output variable is linear, logistic regression can be a good model to start experimenting with.

**2. Low-Dimensional Spaces:** Logistic regression can be more interpretable and simpler to implement when the number of features in your data is low.

**3. Large Datasets:** Logistic regression can be more computationally efficient and easier to train on large datasets. Moreover, LR models also tend to improve in accuracy when they have a lot more samples from which to learn and optimise the feature coefficient values.

**4. If you need probability estimates** for the predictions, logistic regression naturally provides these, making it easier to interpret the likelihood of the classes.

**5. Sensitivity to Outliers:** Logistic regression models tend to be more sensitive to outliers, so if you have a lot of outliers in your dataset you may want to implement some outlier handling methods or go for a more robust model.

**6. Model Complexity + Training Time:** LR is generally faster to train and is less computationally intensive, making it a good choice for simpler models and faster iteration.

**7. Interpretability:** LR models are often more interpretable as you can interpret the importance of features to the outcome by analysing the feature coefficients.
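As a concrete illustration of points 4 and 7, here is a short sketch using scikit-learn’s `LogisticRegression` on a small synthetic dataset (the data is made up purely for demonstration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic dataset: the label depends only on feature 0;
# feature 1 is pure noise.
rng = np.random.RandomState(0)
X = rng.randn(200, 2)
y = (X[:, 0] > 0).astype(int)

model = LogisticRegression()
model.fit(X, y)

# Point 4: probability estimates, one (P(class 0), P(class 1)) pair per sample
probs = model.predict_proba(X[:3])

# Point 7: the learned coefficients hint at feature importance --
# the coefficient for feature 0 should dominate the noise feature's
coefs = model.coef_[0]
```

Each row of `probs` sums to 1, and comparing the magnitudes in `coefs` shows which feature drives the prediction.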