Linear Regression

A visual introduction to linear regression and the perceptron

I'm currently looking for a new job and after interviewing at a few places I finally got my first offer! Yay!

But after looking at the salary, my first thought was "Is this considered a good offer?"

It’s not easy to tell without a frame of reference. So I asked friends who work in the same field and ended up with a few data points:

We can see that there's a correlation between years of experience and how much a person is getting paid.

One way to figure out whether the offer is good is to see if it's above or below the average for my years of experience.

How about we create a formula which looks at the average salary per year of experience?

Here's what our formula can look like:

This is a simple predictive model that takes an input (years of experience), does a calculation (multiplies it by the average salary per year of experience), and gives an output (predicted salary).
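Here's a minimal sketch of that model in Python. The function name `predict_salary` and the weight value are made up for illustration; they aren't taken from the real data points.

```python
def predict_salary(years_of_experience: float, weight: float) -> float:
    """Predict a salary as years of experience times a single weight."""
    return years_of_experience * weight


# Hypothetical weight: an invented "average salary per year of experience".
weight = 10_000.0

print(predict_salary(4, weight))  # 40000.0
```

The whole model is one multiplication; everything interesting is in how the weight gets chosen.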

Calculating the prediction is simple multiplication. But before that, we need to decide on the weight we’ll be multiplying by.

Here we started with an average. Later we’ll look at better algorithms that can scale as we get more inputs and more complicated models.

Finding the weight is the “training” stage. So whenever you hear of someone “training” a model, it just means finding the weights used to calculate the prediction.
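As a rough sketch, "training" by averaging could look like the code below. The data points are invented for illustration, and I'm assuming "average salary per year of experience" means the mean of salary divided by years for each person; other readings (such as total salary over total years) would give a different weight.

```python
# Hypothetical (years of experience, salary) pairs, invented for illustration.
data = [(1, 30_000), (2, 45_000), (5, 60_000), (10, 90_000)]


def train_average_weight(points):
    """Return the mean of salary / years over all data points.

    Assumes every point has years > 0, otherwise the division fails.
    """
    ratios = [salary / years for years, salary in points]
    return sum(ratios) / len(ratios)


weight = train_average_weight(data)
print(weight)  # 18375.0
```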

Now, let's go back to our salary and see what we think we should've gotten paid. Let's say I have 4 years of experience. We can put this into our formula to check:

Hmm, does that look right? Now I'm curious: is the average really the best number to look at here? Let's plot our data to find out.

And plot the prediction line:

This line doesn't look very accurate. But what does accuracy mean? One way of checking how good a job our model does is by measuring how "off" the predicted values are from the true values.

Here we can see the actual salary, the predicted salary, and the difference between them.

We call this our "Error".
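To make that concrete, here's a small sketch that computes the error for each data point. The data and the weight are invented for illustration. Squaring and averaging the errors into a single score (mean squared error) is one common choice and an assumption on my part, not the only way to measure how "off" a model is.

```python
# Hypothetical (years of experience, salary) pairs, invented for illustration.
data = [(1, 30_000), (2, 45_000), (5, 60_000), (10, 90_000)]
weight = 18_375.0  # hypothetical weight


def errors(points, weight):
    """Return predicted minus actual salary for each data point."""
    return [weight * years - salary for years, salary in points]


def mean_squared_error(points, weight):
    """Average of the squared errors; lower means a closer fit."""
    errs = errors(points, weight)
    return sum(e * e for e in errs) / len(errs)


print(errors(data, weight))  # [-11625.0, -8250.0, 31875.0, 93750.0]
print(mean_squared_error(data, weight))
```

Squaring keeps positive and negative errors from cancelling each other out and punishes big misses more than small ones.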

Now that we've defined our measuring stick for what makes a better model, let’s experiment with a couple more weight values and compare them with our average pick:

Oh, there's one thing we overlooked! When you start your first job, your salary actually goes from 0 (when you're in school) to the tens of thousands. So even with zero years of experience, the salary isn't zero.

Let's add this into our model as well:

In this context, we call this added constant a “bias”.
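In code, adding a bias just means adding a constant after the multiplication. This is a sketch with invented numbers; `predict_salary`, the weight, and the bias are all hypothetical.

```python
def predict_salary(years_of_experience: float, weight: float, bias: float) -> float:
    """Predict a salary as weight * years + bias."""
    return weight * years_of_experience + bias


# Hypothetical values: the bias plays the role of a starting salary.
weight = 6_000.0
bias = 30_000.0

print(predict_salary(0, weight, bias))  # 30000.0
print(predict_salary(4, weight, bias))  # 54000.0
```

With the bias in place, the prediction line no longer has to pass through zero.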

Let's try manually “training” our regression model. Minimize the error by tweaking the weight and bias dials:
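If you don't have the dials handy, the same experiment can be sketched in code: try a few weight and bias settings by hand and see which one gives the lowest error. The data and candidate values are invented for illustration, and mean squared error is an assumed choice of error measure.

```python
# Hypothetical (years of experience, salary) pairs, invented for illustration.
data = [(1, 30_000), (2, 45_000), (5, 60_000), (10, 90_000)]


def mean_squared_error(points, weight, bias):
    """Average squared gap between predicted and actual salary."""
    total = 0.0
    for years, salary in points:
        error = (weight * years + bias) - salary
        total += error ** 2
    return total / len(points)


# Hand-picked (weight, bias) settings to try, like turning the dials.
candidates = [(18_375, 0), (10_000, 0), (6_000, 30_000), (5_000, 35_000)]

for weight, bias in candidates:
    print(weight, bias, mean_squared_error(data, weight, bias))

# Keep whichever setting produced the smallest error.
best = min(candidates, key=lambda wb: mean_squared_error(data, *wb))
print(best)  # (6000, 30000)
```

This only compares the settings we thought to try, so the best one found isn't guaranteed to be the best one possible.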

Congratulations on manually training your first regression model!

Next, we can learn how machines do it with an algorithm called "Gradient Descent".