Bias and Variance are two fundamental concepts for Machine Learning, and their intuition is just a little different from what you might have learned in your statistics class. Here I go through two examples that make these concepts super easy to understand.

For a complete index of all the StatQuest videos, check out:

statquest.org/video-index/

If you'd like to support StatQuest, please consider...

Patreon: www.patreon.com/statquest

...or...

BRvid Membership: brvid.net/show-UCtYLUTtgS3k1Fg4y5tAhLbwjoin

...a cool StatQuest t-shirt or sweatshirt (USA/Europe): teespring.com/stores/statquest

(everywhere):

www.redbubble.com/people/starmer/works/40421224-statquest-double-bam?asc=u&p=t-shirt

...buying one or two of my songs (or go large and get a whole album!)

joshuastarmer.bandcamp.com/

...or just donating to StatQuest!

www.paypal.me/statquest

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:

twitter.com/joshuastarmer

0:00 Awesome song and introduction

0:29 The data and the "true" model

1:23 Splitting the data into training and testing sets

1:40 Least Squares Regression fit to the training data

2:16 Definition of Bias

2:33 Squiggly Line fit to the training data

3:40 Model performance with the testing dataset

4:06 Definition of Variance

5:10 Definition of Overfit

#statquest #ML

17 Sep 2018

Acu 20 hours ago

That intro is the reason I love it

StatQuest with Josh Starmer 18 hours ago

Hooray!!! :)

Frank Zhang 10 days ago

Really clear!

StatQuest with Josh Starmer 9 days ago

Thanks! :)

Chicken Tikka Sauce 10 days ago

What an amazing fucking video. Bravo.

mammad mortaji 11 days ago

Thank you. This video was so intuitive.

StatQuest with Josh Starmer 10 days ago

BAM! :)

Aswin Tekur 12 days ago

TRIPLE BAM

StatQuest with Josh Starmer 12 days ago

YES! :)

Morty Smith 12 days ago

Notes for myself: Def. of Bias: the inability of a machine learning method to capture the true relationship is called bias. Def. of Variance: the difference in fits between datasets is called variance.
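
The two definitions in these notes can be made concrete with a tiny sketch. The data, model names, and numbers below are invented for illustration; the "squiggly line" is replaced by an extreme memorizing model to keep the code short:

```python
# Toy (weight, height) pairs, echoing the video's example.
train = [(1.0, 1.1), (2.0, 2.3), (3.0, 2.9)]
test = [(1.5, 1.6), (2.5, 2.7)]

def straight_line(x):
    # A simple, stiff model: it can't bend to hit every training
    # point, so it carries some bias.
    return x

def squiggly_line(x):
    # A "memorizing" model: perfect on points it has seen,
    # useless elsewhere (an extreme stand-in for overfitting).
    lookup = dict(train)
    return lookup.get(x, 0.0)

def sum_sq_error(model, data):
    # Sum of squared residuals between observed and predicted heights.
    return sum((y - model(x)) ** 2 for x, y in data)

# The squiggle has zero error on the training set (no bias there),
# but its testing error explodes: the fits differ wildly between
# datasets, which is what "variance" means in this context.
```

Here the straight line loses on the training data but wins on the testing data, which is the whole bias-variance story in miniature.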

BrandonSLockey 1 day ago

M-m-Morty huh? Learning some m-m-machine learning? Your grandpa rick would be p-p-proud of you **burp**, Morty.

Usman Chaudhri 17 days ago

I was on a quest to understand Bias and Variance for a long time until I saw StatQuest. Good work explaining, Josh.

StatQuest with Josh Starmer 17 days ago

BAM! :)

武逸仙 18 days ago

You are just like an expert on teaching!! (And the video is really friendly to foreigners like me! The speaking pace and captions eliminate the language barrier. I can totally understand every sentence you said!!) How lucky I am meeting you!!!!

StatQuest with Josh Starmer 18 days ago

Hooray!!! I'm so glad you like my videos! :)

武逸仙 18 days ago

Your video is just like a miracle!!💕

Kushal Kittu 18 days ago

Amazingly explained! Subscribed

StatQuest with Josh Starmer 18 days ago

Awesome, thank you!

Isuru Karunarathna 19 days ago

Using the regression function in Excel, we can find the best curve, can't we?

StatQuest with Josh Starmer 19 days ago

The regression function will find the line that minimizes the least squares of the residuals. Is this the best curve? I don't know.

Ashish Gambhir 20 days ago

What is the next step you recommend to get better at ML after your course?

StatQuest with Josh Starmer 20 days ago

I have a whole bunch of videos on ML and Statistics here: statquest.org/video-index/

Joe Bater 20 days ago

Best, most intuitively understood, explanation of this that I've ever seen!

StatQuest with Josh Starmer 20 days ago

BAM! :)

Zinc CYanide 21 days ago

Man you probably explain things better than Andrew Ng!

StatQuest with Josh Starmer 21 days ago

Thank you very much! :)

ankit sharma 23 days ago

Thanks sir. It gives me a detailed explanation

StatQuest with Josh Starmer 23 days ago

Thanks! :)

Abhinandan Choubey 23 days ago

I can seriously binge-watch this Channel!! Thanks, @JoshStarmer

StatQuest with Josh Starmer 23 days ago

BAM! :)

Koray Can Canut 25 days ago

Linear regression estimators are unbiased estimators. I don't understand what you mean by bias.

Koray Can Canut 25 days ago

@StatQuest with Josh Starmer thank you

StatQuest with Josh Starmer 25 days ago

In machine learning, bias refers to how accurately the model reflects the true process.

Chirag Palan 26 days ago

BAM!!!

ANKITKUMAR SINGH 29 days ago

BAM!!!

BAM!!!

BrandonSLockey 1 month ago

6:03 ahhh yes boosting, my favourite way to make money in League of Legends ;)

Milton Simões 1 month ago

Crazy good! Thank u!

StatQuest with Josh Starmer 1 month ago

Thanks! :)

chengqun liu 1 month ago

Nice!

StatQuest with Josh Starmer 1 month ago

Thanks!

Rahul Singh 1 month ago

Clear and precise. Bam!!

nur wani 1 month ago

very clear, no extra unnecessary "noise". I really enjoyed this lesson.

StatQuest with Josh Starmer 1 month ago

Awesome, thank you!

Basma Al-Ghali 1 month ago

Thank u

StatQuest with Josh Starmer 1 month ago

Thanks! :)

Bla Bla 1 month ago

StatQuest is TRIPLE BAMMMMM!

StatQuest with Josh Starmer 1 month ago

YES! :)

SHUBHAM SHAH 1 month ago

Fantastic explanation, this channel is on god mode!

StatQuest with Josh Starmer 1 month ago

:)

Ousseynou Dieng 1 month ago

Exactly what I was searching for

StatQuest with Josh Starmer 1 month ago

Hooray! :)

Aggelos Didachos 1 month ago

I like your humor too. It is deliberately slightly bad, which makes it even better! Also, after a lot of searching, I finally found a tangible explanation of bias and variance instead of abstract definitions. So, thank you sir!

StatQuest with Josh Starmer 1 month ago

Thank you very much! :)

Rakesh Ranjan 1 month ago

Dude, you are awesome. This is the first video I have seen from your channel; I plan on watching your other videos as well. Such great visualizations. Just wow.

StatQuest with Josh Starmer 1 month ago

Thank you very much! :)

sayaji jadhav 1 month ago

What is meant by "aka"?

StatQuest with Josh Starmer 1 month ago

aka = also known as

TOT 1 month ago

awesome baammmm

StatQuest with Josh Starmer 1 month ago

Thank you! :)

Peter J.M. Puyneers 1 month ago

Very good explanation, especially for someone who started studying statistics after getting his university degree 30 years ago

StatQuest with Josh Starmer 1 month ago

Thank you! Good luck with your studies! :)

Puff Vayne 1 month ago

I like the opening song a lot, thanks for your explanation!

StatQuest with Josh Starmer 1 month ago

Glad you like it! :)

Nishant Kumar Bundela 1 month ago

Low bias - overfitting. High variance - overfitting. BAM. My mind is confused.

StatQuest with Josh Starmer 1 month ago

Low bias doesn't necessarily mean overfitting. You can have a model that fits the training data well (low bias) AND the testing data (low variance). However, if you have high variance, then you have overfit.

Bhargav Potluri 1 month ago

Yes. Better than many online and paid videos. I have gone through almost all of your Machine Learning videos & started my first comment with your videos :). Thanks a lot. Can you please come up with a video where algorithms perform poorly as well, if you have time?

StatQuest with Josh Starmer 1 month ago

I'm glad you like the videos! :)

camila onofri 1 month ago

best explanation ever!!!!

StatQuest with Josh Starmer 1 month ago

Hooray! :)

Ayush Gupta 1 month ago

Best explanation ever

StatQuest with Josh Starmer 1 month ago

Thank you! :)

ExoPhantomFalcon 1 month ago

4:18 you misspelled "height".

Daniel A Esmaili 1 month ago

Short, to the point, and clear. Thanks

StatQuest with Josh Starmer 1 month ago

Thanks! :)

Tarak B 1 month ago

After this video, I played a song of yours and ended up listening to the whole album "Song of the Month". Serious contenders for Billboard, sir 🔥🔥🔥🔥

StatQuest with Josh Starmer 1 month ago

Thank you very much!! :)

Yan Nurlanl 1 month ago

Your "BAM" is absolutely "BAM"

StatQuest with Josh Starmer 1 month ago

Hooray! :)

Joseph Iles 1 month ago

What an outstandingly simple and intuitive explanation, bravo!

StatQuest with Josh Starmer 1 month ago

Thank you! :)

Lily Ha 1 month ago

Hi, how do you decide which data points are supposed to be in the training set and which in the test set? Are the training and test sets in this example the same as the "learning dataset" and the "test dataset"?

Lily Ha 1 month ago

@StatQuest with Josh Starmer Thank you Josh, you're amazing :)

StatQuest with Josh Starmer 1 month ago

We use cross validation to pick which data points go in the training and testing datasets. For details, see: brvid.net/video/video-fSytzGwwBVw.html

Noname Noname 1 month ago

Can someone tell me where the terms "bias" and "variance" come from? I guess "variance" is due to the high variability of the fit when a different training set is used. But what about "bias"? Why not use "underfit" instead? And is there an easy trick to remember that bias goes with underfitting and variance with overfitting?

Noname Noname 1 month ago

@StatQuest with Josh Starmer thanks for the exhaustive answer! I wasn't aware of the difference between bias in statistics and in ML

StatQuest with Josh Starmer 1 month ago

The term "bias" has different, but related, meanings in statistics and machine learning. Since a lot of people learn statistics before they learn machine learning, I thought I'd point out how to relate the statistical meaning to the machine learning meaning. However, regardless of what order you learn the concepts, here they are. In statistics, bias refers to consistently over estimating or consistently under estimating. A model with high bias will make predictions that are (consistently) way higher or (consistently) way lower than they should be. A model with low bias will only be off by a little bit in either direction. In machine learning, bias refers to how well the model fits the training data. A model with high bias will fit the data poorly, and its predictions will be way off - but maybe not way off in a consistent way like when we talk about things in a statistical sense. A model with low bias will fit the data pretty well and the predictions will only be off by a little bit. NOTE: "over estimating" is different than "over fitting". In fact, "over estimating" is more closely related to "under fitting". If we consistently over estimate something, then our model can not be over fitting the data.

Noname Noname 1 month ago

@StatQuest with Josh Starmer Thanks! I thought high bias always means "underfitting" and high variance means "overfitting". But you say high bias can also mean overfitting?

StatQuest with Josh Starmer 1 month ago

In statistics, the term "bias" means that "the model (or statistic) will tend to give over estimates" or "the model (or statistic) will tend to give under estimates". So, in machine learning, I think the bias is relative to the training dataset, in that when we increase bias, we make the model consistently over or under estimate the training dataset.

Somjit Mitra 1 month ago

Sir, I do have a doubt... in overfitting we fit a curve on datasets to find the line of best fit, so is it a method for a regressor? Or can it also be used for a classifier?

StatQuest with Josh Starmer 1 month ago

Anything can overfit the training data. Both regression and classification.

vysakh vm 1 month ago

All your intro music gives me a feeling that the concepts are easy to understand... thank you for building that confidence.

StatQuest with Josh Starmer 1 month ago

Hooray! :)

Buse Maden 2 months ago

Thanks for the amazing explanation

StatQuest with Josh Starmer 2 months ago

@Buse Maden YES! :)

Buse Maden 2 months ago

That's a "TRIPLE BAM" @StatQuest with Josh Starmer

StatQuest with Josh Starmer 2 months ago

Thank you! :)

lvstet 2 months ago

I think this might be your best intro song. You are doing a great job (with stats and songs), keep on doing it!

StatQuest with Josh Starmer 2 months ago

Thank you very much! :)

المبرمجة الصغيرة 2 months ago

you are great, thanks a lot

StatQuest with Josh Starmer 2 months ago

Thank you very much! :)

harsha dineth 2 months ago

Best ML course. Love the way you explain.

StatQuest with Josh Starmer 2 months ago

Thank you! :)

Nan 2 months ago

So is bias the same thing as an error function in the context of linear regression?

Nan 2 months ago

@StatQuest with Josh Starmer ah I see, thank you

StatQuest with Josh Starmer 2 months ago

@Nan They are both used in evaluation. You want to see how well the model predicts your training data (bias) and how well it predicts your testing data (variance).

Nan 2 months ago

@StatQuest with Josh Starmer How about variance - is it the same thing as evaluation? The difference between them is just where the evaluation of the model is estimated, right?

Nan 2 months ago

@StatQuest with Josh Starmer For better understanding: we know that the error function is part of the evaluation, so can we say bias is about evaluation? If we can, then to estimate the bias could we use accuracy as another option?

Nan 2 months ago

@StatQuest with Josh Starmer wow.. now that clears up everything that has been bothering me all this time, thanks

Tomas Ramilison 2 months ago

Hi there, cool videos! I'm just wondering why the "straight line" method is correctly referred to as linear regression, whereas the "squiggly line" method was just referred to as a squiggly line. Isn't it called logistic regression when we're using curves, or am I missing something..?

StatQuest with Josh Starmer 2 months ago

Logistic Regression has a different squiggle. Regardless, the point isn't to talk about specific machine learning methods, but machine learning methods in general. Some, like decision trees, overfit the data; others are not as prone to this.

Hamed Dadgour 2 months ago

Dude, people are copying your explanations on BRvid and posting them as their own :)

Hamed Dadgour 2 months ago

@StatQuest with Josh Starmer I know. That sucks. Good luck man!

StatQuest with Josh Starmer 2 months ago

Thanks for sending me the link. Wow! That guy stole my video! I put in a complaint to YouTube. Hopefully they will deal with it.

Hamed Dadgour 2 months ago

@StatQuest with Josh Starmer This guy's bias-variance video is awfully similar to yours. brvid.net/video/video-9YLdXNSVnHw.html

StatQuest with Josh Starmer 2 months ago

Can you send me the links?

shirkanbagira 2 months ago

Should it be variance or variability at 4:23? Really great video. Thank you!

StatQuest with Josh Starmer 2 months ago

The terms are interchangeable.

Cathal King 2 months ago

Is variance the difference in fits between the training and testing datasets, or between any two datasets?

StatQuest with Josh Starmer 2 months ago

Most commonly it is thought of as the difference between training and testing. However, it is also the difference between training and any other dataset that you might use your trained model on.

Gurunath Hari 2 months ago

Josh, you said nonchalantly, "we square the distances so that negative and positive distances don't cancel out". Very few teachers ever tell you that. They just state "we use the residual squared... blah blah" and move on, and dumb students like me are left wondering why; dumb because we don't know how to ask the question "Why do we square it?" :) After many years I got the answer. TRIPLE BAM for me, i.e. thank you so much. All this besides a masterly understanding of the concept of the bias-variance tradeoff in the bargain ;) BTW Josh, joshuastarmer.bandcamp.com/ says IP address not found... would love to contribute.
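
The point about squaring is easy to verify numerically. A two-line sketch (the numbers are made up, not from the video):

```python
# Residuals above and below the line cancel when simply summed...
residuals = [2.0, -2.0, 1.0, -1.0]
plain_sum = sum(residuals)                    # 0.0: says nothing about fit quality

# ...but squaring makes every distance count, whatever its sign.
squared_sum = sum(r ** 2 for r in residuals)  # 10.0
```

A model that misses every point by 2 in alternating directions would look "perfect" under the plain sum, which is exactly why least squares uses the squared version.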

StatQuest with Josh Starmer 2 months ago

Awesome! Thank you very much. :)

Gurunath Hari 2 months ago

@StatQuest with Josh Starmer Noted, mate. Will look to make a humble contribution. Cheers

StatQuest with Josh Starmer 2 months ago

Hooray!!! I'm glad the video was helpful. And I just tried the bandcamp link and it seems to be working again. :)

Danilo Amorim 2 months ago

One thing that is very confusing for me is the difference between *learning bias* (the one you explain) and regular *statistical bias*. I think the statistical bias of a model is the systematic error made in the predictions. For example, if a model is positively biased, on average the predictions will always be higher than the true value. Could you make a video on the difference between these two concepts?

StatQuest with Josh Starmer 2 months ago

I can put that on the to-do list.

Mary Street 2 months ago

I found this so clear and helpful, thank you! One question - can you explain the difference between "residuals" and "prediction errors"? I thought the vertical dashed lines from the data points to the regression line were the prediction errors (the difference between the predicted and observed values). But now I am understanding (I think) that the residuals are the difference between the observed values (from the training set) and the predicted values (a.k.a. the regression line), and the prediction errors are the difference between the observed values (from the test set) and the predicted values (a.k.a. the regression line). Is my understanding correct??

StatQuest with Josh Starmer 2 months ago

Unfortunately, "residual" and "error" are often used interchangeably in statistics, so there's no difference between those two - they both refer to distances between the observed values (training) and the predicted values. In Machine Learning lingo, however, the difference between the observed values (testing) and the predicted values is sometimes called "variance". Ultimately, we want to reduce variance (the differences between the observed values (testing) and the predicted values).

Ajay Vijayakumar 2 months ago

This is absolutely brilliant, m8 - crisp, clear and very concise. Well done!! You've got one more stats fan now!

StatQuest with Josh Starmer 2 months ago

Hooray! Thank you very much! :)

Darya Vorozheykina 2 months ago

Wow, you are amazing!!

StatQuest with Josh Starmer 2 months ago

Thank you very much! :)

Saad Salman 2 months ago

best video ever

StatQuest with Josh Starmer 2 months ago

Thank you very much! :)

kirtan desai 2 months ago

My favourite education channel

StatQuest with Josh Starmer 2 months ago

Hooray!! Thank you very much! :)

Jorge Andrés Franco 2 months ago

Thank you.

Trevor Cousins 2 months ago

Excellent description. Thanks!

StatQuest with Josh Starmer 2 months ago

Thanks! :)

wing tsang 2 months ago

I heard there's also a bias-variance tradeoff - could you please elaborate on it?

StatQuest with Josh Starmer 2 months ago

This entire video explains both the concepts of bias and variance and the bias-variance tradeoff. The bias-variance tradeoff is covered at 4:13.

Abhishek Sharma 3 months ago

Your introductory music is always awesome

StatQuest with Josh Starmer 3 months ago

You have great taste in music!!! :)

なみやたびと 3 months ago

The video is good. Thanks a lot. I also like the "Double Bams!!!"

StatQuest with Josh Starmer 3 months ago

Hooray! I'm glad you like the videos. :)

Luiz Cordolino 3 months ago

Your lessons are amazingly good.

StatQuest with Josh Starmer 3 months ago

Thank you! :)

BHARATH MUKKA 3 months ago

Can you suggest a good software package for practising and playing with datasets?

StatQuest with Josh Starmer 3 months ago

R is my favorite. Python is also good.

RahulEdvin 3 months ago

You are a legend, sir!

StatQuest with Josh Starmer 3 months ago

Thank you! :)

Siwaphat Boonbangyang 3 months ago

I subbed immediately after I heard the intro song. I like that.

StatQuest with Josh Starmer 3 months ago

Awesome! :)

Mohammadreza 3 months ago

Man, you're my boss :)

StatQuest with Josh Starmer 3 months ago

Thanks!

Hirdyansh Bhalla 3 months ago

Because of your wonderful humor you just earned a subscriber ☺

StatQuest with Josh Starmer 3 months ago

Hooray! Thank you very much! :)

sacha 3 months ago

I think we can use two more terms: flexibility and generalization - the flexibility of a model to fit the training dataset (bias), and the ability of the trained model to generalize to the test dataset (variance). The first one is tuned with hyperparameters and the second one with parameters.

sacha 3 months ago

@StatQuest with Josh Starmer BAM... I was wrong...!

StatQuest with Josh Starmer 3 months ago

Is that correct? When we do Ridge Regression, which reduces variance (at the expense of a small amount of bias), we use hyperparameters.
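
Since Ridge Regression comes up in this reply, here is a stripped-down sketch of what its penalty does. This is a 1-D regression through the origin with my own toy numbers, purely illustrative: increasing the hyperparameter λ shrinks the slope toward zero, trading a little bias for lower variance.

```python
def ridge_slope(xs, ys, lam):
    # 1-D ridge regression through the origin:
    # slope = Σxy / (Σx² + λ). With λ = 0 this is ordinary least
    # squares; larger λ shrinks the slope toward zero.
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs, ys = [1.0, 2.0], [2.0, 4.0]
ols = ridge_slope(xs, ys, 0.0)     # exact fit: slope 2.0
shrunk = ridge_slope(xs, ys, 5.0)  # penalized: slope 1.0
```

The shrunken slope fits the training points a bit worse (more bias) but is less sensitive to noise in any one training point (less variance), which is the tradeoff the reply refers to.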

Pallavi Dandamudi 3 months ago

This is the first video of yours I've seen & damn, this is awesome!

StatQuest with Josh Starmer 3 months ago

Thank you! :)

luzan fero 3 months ago

Wonderful!!!!

StatQuest with Josh Starmer 3 months ago

Thank you! :)

anil sarode 4 months ago

Thanks a lot for this wonderful explanation. I have one question: if the line or fit has zero bias, does that mean it is always an overfit? Second question: if so, does an overfit always have high variance?

StatQuest with Josh Starmer 4 months ago

Answer to question 1: bias is related to the difference between the model we are using and the true process that generated the data. If the bias is 0, that could be because we identified the process that generated the data and used that as our "model". So bias = 0 does not always mean that the data is overfit. However, usually we do not know everything there is to know about the original process, so the "model" is just an approximation, and a low bias value can indicate that the approximation is overfit. Answer to question 2: if the model is overfit, then it always has high variance.

Prerana Das 4 months ago

Hi Josh! You are the "God of ML and Stats". You really made me fall in love with these subjects. I had a query: according to you, if we cut the data into training and testing sets, what % should be assigned to the test set? I think it should vary with the amount of data, but is there a rule of thumb?

StatQuest with Josh Starmer 4 months ago

There are a handful of rules of thumb. One simple one: if you do 10-fold cross validation, you divide your data into 10 equally sized bins (see the StatQuest on cross validation: brvid.net/video/video-fSytzGwwBVw.html ). Another standard is to use 75% for training and 25% for testing. This is the default setting for Python's scikit-learn function train_test_split().
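
The 75/25 rule of thumb mentioned in this reply can be sketched in plain Python. This is a hand-rolled stand-in for scikit-learn's train_test_split, with names and details of my own choosing:

```python
import random

def split_data(data, test_fraction=0.25, seed=0):
    # Shuffle, then hold out test_fraction of the points for testing;
    # the rest become the training set.
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(100))
train, test = split_data(data)  # 75 training points, 25 testing points
```

Shuffling before splitting matters: if the data is ordered (say, by weight), a naive "first 75%" split would give training and testing sets drawn from different parts of the distribution.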

Otman Werfaly 4 months ago

May God protect you. From beloved Libya: well done, and thank you very much.

StatQuest with Josh Starmer 4 months ago

@Otman Werfaly Imagine we are building a decision tree (for information on decision trees, see: brvid.net/video/video-7VeUPuFGJHk.html ) that predicts whether a person is tall or short based on their weight. Typically, when we build that tree, we would test every possible weight threshold in the training dataset and see if it does a good job classifying people as "tall" or "short". If the dataset is relatively small, then we can test every option and that is fine; in this case we are guaranteed to find the best threshold. However, if the dataset is huge, then we might just test a random subset of the options. In this case we are not guaranteed to find the best threshold but, depending on how large the subset is, we may still find one that is pretty good. So we would make that subset by randomly selecting different thresholds and just testing those. This makes building the tree possible when the dataset is very big. Does that make sense?
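
The exhaustive-vs-randomized threshold search described in this reply can be sketched like this (the data, names, and sample size are invented for illustration):

```python
import random

# (weight, label) pairs for the tall/short example above.
data = [(55, "short"), (60, "short"), (70, "tall"), (80, "tall")]

def accuracy(threshold, data):
    # Classify anyone strictly heavier than the threshold as "tall".
    hits = sum(("tall" if w > threshold else "short") == label
               for w, label in data)
    return hits / len(data)

candidates = sorted(w for w, _ in data)

# Exhaustive search: guaranteed to find the best threshold.
best = max(candidates, key=lambda t: accuracy(t, data))

# Randomized search: test only a sample of thresholds. Faster on huge
# datasets, but the winner is only "pretty good", not guaranteed best.
rng = random.Random(1)
sampled = rng.sample(candidates, 2)
best_random = max(sampled, key=lambda t: accuracy(t, data))
```

On a dataset this small the exhaustive search is trivially cheap; the randomized version only pays off when the number of candidate thresholds is enormous.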

Otman Werfaly 4 months ago

@StatQuest with Josh Starmer Again, thank you very much for your reply. I read this somewhere: "Most algorithms can be randomized, e.g. greedy algorithms: pick from the N best options at random instead of always picking the best option, e.g. test selection in decision trees or rule learning", and I could not really understand what he means. If you could clarify this for me, that would be awesome. Thanks

StatQuest with Josh Starmer 4 months ago

@Otman Werfaly Can you be more specific, or give me some context for the term?

Otman Werfaly 4 months ago

@StatQuest with Josh Starmer I have a small question though: do you have any idea about randomization?

StatQuest with Josh Starmer 4 months ago

Thank you! :)

Yahya Kenoussi 4 months ago

Thanks, man!

StatQuest with Josh Starmer 4 months ago

:)
