Deep Learning 101: Lesson 2: Linear Regression

Muneeb S. Ahmad
4 min read · Aug 28, 2024

This article is part of the “Deep Learning 101” series. Explore the full series for more insights and in-depth learning here.

Linear regression is a technique for finding a relationship between two things and then using that relationship to make predictions. For example, suppose we want to know whether there is a relationship between the amount of time a student studies for a test and the grade they get on that test. We can use linear regression to find out, and if a relationship exists, we can use it to predict the grade a student will get based on how much they study. As another example, say you want to predict how much money you will make based on how many hours you work. You would collect data and then use linear regression to find the line that best fits it. Once you have this line, you can use it to predict how much money you will make for any number of hours you work.

To demonstrate linear regression, let’s look at the following example of data for the variables x and y.

x: 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150

y: 7, 8, 8, 9, 9, 9, 10, 11, 14, 14, 15

Let’s assume that this data can be represented in the form of a line equation like the following:

y’ = ax + b

Here, y’ (or y-prime) represents the predicted values that we can compare with the actual values y.

Let’s say we initialize the line equation with some arbitrary parameter values of a and b, as follows:

a = 0.02

b = 8.78

With these values, the line equation becomes:

y’ = 0.02x + 8.78

Now let’s plot the x-y values as a 2D scatter plot, together with the line given by the equation above:

Figure 1: Linear Regression Example

In this graph, the input x-y data is shown as a scatter plot of dots, and the solid line shows the initial line, which does not yet fit the data closely.
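To make this concrete, here is a minimal sketch that reproduces this setup, assuming numpy and matplotlib are installed:

```python
import numpy as np
import matplotlib.pyplot as plt

# The example data from above
x = np.array([50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150])
y = np.array([7, 8, 8, 9, 9, 9, 10, 11, 14, 14, 15])

# Arbitrary initial parameter values
a, b = 0.02, 8.78
y_pred = a * x + b  # y' = 0.02x + 8.78

plt.scatter(x, y, label="actual data")
plt.plot(x, y_pred, label="initial line")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
```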

Our goal is to use some algorithm to find the values of a and b that best fit the data. This is where the linear regression technique comes in. The process of finding the most appropriate values of a and b is known as “fitting the model to the data”. In linear regression, this fitting is usually done using a method called “least squares”. This method minimizes the sum of the squares of the differences (called residuals) between the predicted and actual values in the data set.
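To see the quantity that least squares minimizes, the following sketch computes the residuals and their sum of squares for the arbitrary initial line, reusing the arrays defined above:

```python
# Residuals: actual minus predicted values for the initial line
residuals = y - (0.02 * x + 8.78)

# Sum of squared residuals, the quantity least squares minimizes
sse = np.sum(residuals ** 2)
print(f"Sum of squared residuals for the initial line: {sse:.2f}")  # about 48.05
```

A better choice of a and b should drive this number down.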

Least Squares Method

The least squares method calculates the optimal values of a (slope) and b (y-intercept) such that the sum of the squared differences between the predicted and actual values is minimized. Mathematically, it seeks the values of a and b that minimize:

Σᵢ₌₁ⁿ (yᵢ − (axᵢ + b))²

Where:

  • yᵢ and xᵢ are the actual observed values from the dataset,
  • n is the number of observations in the dataset,
  • axᵢ + b is the prediction made by our linear model.
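For simple linear regression, this minimization has a well-known closed-form solution. Here is a minimal sketch of the calculation, reusing the x and y arrays from before; up to rounding, the results should agree with the values quoted below:

```python
# Closed-form least squares solution for y' = ax + b
x_mean, y_mean = x.mean(), y.mean()

a = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
b = y_mean - a * x_mean
print(f"a = {a:.4f}, b = {b:.4f}")

# Cross-check with numpy's built-in polynomial fit
a_np, b_np = np.polyfit(x, y, deg=1)
print(f"polyfit: a = {a_np:.4f}, b = {b_np:.4f}")
```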

When we apply the least squares method to our data, we calculate the values for a and b that best fit our data points. After this calculation, our line equation looks like the following:

y’ = 0.08x + 2.47

The graph below shows how the above equation, shown as a solid line, appears to fit the input data, shown as dots.

Figure 2: Final Linear Regression Model with Best Fit Line

This example illustrates how linear regression is used to establish a relationship between two variables in order to make predictions. Specifically, it demonstrates the process of using least squares to fit a linear model to a data set, showing the initial arbitrary line equation and how the optimized parameters improve the fit to the observed data. This basic understanding allows for practical applications in predicting outcomes based on linear relationships.
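As a final usage sketch, the fitted line can be evaluated at any new input; the query value 115 below is just an illustrative choice:

```python
# Predict y for a new x using the fitted parameters from this article
def predict(x_new, a=0.08, b=2.47):
    return a * x_new + b

print(predict(115))  # 0.08 * 115 + 2.47 = 11.67
```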

Summary

Linear regression models the relationship between an input variable and an output variable with a straight line, y’ = ax + b. Fitting the model means finding the slope a and intercept b that best match the data, which the least squares method does by minimizing the sum of the squared residuals between the predicted and actual values. Once fitted, the line can be used to predict outcomes for new inputs, as the study-time and working-hours examples illustrate.

4 Ways to Learn

1. Read the article: Linear Regression

2. Play with the visual tool: Linear Regression

3. Watch the video: Linear Regression

4. Practice with the code: Linear Regression

Previous Article: Data Scaling
Next Article: Loss and Metric

Muneeb S. Ahmad

Muneeb Ahmad is a Senior Microservices Architect and Recognized Educator at IBM. He is pursuing his passion for ABC (AI, Blockchain, and Cloud).