Imagine you've got two normally distributed random variables, x and y. Here is their relationship:
We can test this by fitting a linear model (y = ax + b) to the data (the red line). This procedure shows us that x accounts for less than 1% of the variance in y.
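A minimal sketch of this setup, assuming NumPy; the sample size and random seed here are arbitrary choices, not values from the post:

```python
import numpy as np

# Two independent, normally distributed variables.
rng = np.random.default_rng(0)  # arbitrary seed
n = 1000                        # arbitrary sample size
x = rng.normal(size=n)
y = rng.normal(size=n)

# Fit y = ax + b by least squares.
a, b = np.polyfit(x, y, 1)

# R^2: the fraction of the variance in y explained by the fit.
y_hat = a * x + b
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"a = {a:.3f}, b = {b:.3f}, R^2 = {r2:.4f}")
```

With this many points, R² comes out tiny, as it should for two unrelated variables.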
But let's imagine you can't collect all of the data. Instead, your sample includes only two of those data points. Again you try to fit a linear model and...
That is overfitting. With two points and two parameters (a and b), the line passes through both points exactly. When your model has too many parameters relative to the number of data points, you're prone to overestimate the utility of your model.
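The two-point case can be sketched the same way (the seed is again an arbitrary assumption):

```python
import numpy as np

# Sample only two points from two unrelated normal variables.
rng = np.random.default_rng(0)  # arbitrary seed
x = rng.normal(size=2)
y = rng.normal(size=2)

# Two parameters, two data points: the fitted line passes
# through both points exactly, so R^2 is (numerically) 1.
a, b = np.polyfit(x, y, 1)
y_hat = a * x + b
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"R^2 = {r2:.6f}")
```

A "perfect" fit here says nothing about the underlying relationship, which is the point.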
We can keep going with this:
Again, sampling one more data point:
And so on.
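The whole sampling exercise above can be condensed into one loop: starting from two points and adding one at a time, R² falls from a perfect 1 toward its true near-zero value. The sample sizes and seed below are my choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
x_all = rng.normal(size=50)
y_all = rng.normal(size=50)

def r_squared(x, y):
    """R^2 of a least-squares line fit y = ax + b."""
    a, b = np.polyfit(x, y, 1)
    resid = y - (a * x + b)
    return 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

# Grow the sample and watch the apparent fit evaporate.
for n in (2, 3, 5, 10, 50):
    print(f"n = {n:2d}  R^2 = {r_squared(x_all[:n], y_all[:n]):.3f}")
```

More data per parameter is the simplest defense against overfitting.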