Machine Learning Fundamentals: Bias and Variance

StatQuest with Josh Starmer
Views 167,406
98% 4 705 65

Bias and Variance are two fundamental concepts for Machine Learning, and their intuition is just a little different from what you might have learned in your statistics class. Here I go through two examples that make these concepts super easy to understand.
For a complete index of all the StatQuest videos, check out:
If you'd like to support StatQuest, please consider...
BRvid Membership:
...a cool StatQuest t-shirt or sweatshirt (USA/Europe):
...buying one or two of my songs (or go large and get a whole album!)
...or just donating to StatQuest!
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:

Published on 17 Sep 2018



Comments 315
Vikash Chauhan 22 hours ago
Such a simple and elegant explanation, thank you so much ....
StatQuest with Josh Starmer
Thank you! :)
Vivek Menon 2 days ago
Another great video. Thank you!!!! You couldn't have made it any simpler.
StatQuest with Josh Starmer
Thanks! :)
Long Nguyen-Vu 2 days ago
Clear and concise, extremely helpful for beginners before getting into the scary math behind
StatQuest with Josh Starmer
Thank you! :)
Rushi Raiyani 4 days ago
You, sir, are a legend!
StatQuest with Josh Starmer
Thanks! :)
Allan Kálnay 5 days ago
3:10: "they are squared so that negative distances do not cancel out the positive distances" -> why do we always do the squared errors instead of taking absolute values of the errors? These will be always positive as well. What's the difference, please?
Allan Kálnay 4 days ago
@StatQuest with Josh Starmer Okaay, I get it now! Thank you very much. Anyways, your videos helped me a lot with statistics already before and now again :)
StatQuest with Josh Starmer
The short answer is that it makes the math easier in the long run. Here's the longer answer: Often in machine learning and in statistics we want to minimize the distance between a predicted value and the actual/observed value. Like at 3:10 - we have measured the distance between the line and the data. If we want to find the straight line that minimizes those distances (not a squiggly line that fits them perfectly, but a straight line that's not perfect) then the easiest way to do that is to take the derivative of the total distance between the observed values and the values predicted by the line, and find where that derivative is zero. Anyway, taking the derivatives of squared values is way easier than taking the derivatives of absolute values, because the absolute value function has a sharp corner at 0 (it is continuous there, but not differentiable), and thus the derivative of an absolute value does not exist at 0.
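The point in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values and the helper names `fit_line_least_squares` and `one_sided_slopes` are hypothetical and do not come from the video. The first function uses the closed-form slope and intercept obtained by setting the derivatives of the sum of squared residuals to zero; the second compares the slope of a function just to the left and just to the right of 0.

```python
def fit_line_least_squares(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared vertical residuals.

    These formulas come from setting the derivatives of the squared-error
    sum with respect to slope and intercept equal to zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def one_sided_slopes(f, at=0.0, h=1e-6):
    """Approximate the slope of f just left and just right of `at`."""
    left = (f(at) - f(at - h)) / h
    right = (f(at + h) - f(at)) / h
    return left, right


# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
slope, intercept = fit_line_least_squares(xs, ys)

# For r**2 the two one-sided slopes agree (both near 0): a derivative exists.
sq_left, sq_right = one_sided_slopes(lambda r: r * r)

# For abs(r) they disagree (about -1 vs. +1): no derivative at 0.
abs_left, abs_right = one_sided_slopes(abs)

print(slope, intercept)
print(sq_left, sq_right)
print(abs_left, abs_right)
```

Minimizing absolute errors is still possible, but it generally needs an iterative or specialized method rather than a one-step formula like the one above.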
Jorge Leandro 6 days ago
Man, you're very didactic! For each statement, there is a 'because', so that your students never ends with a question mark in the head. Besides that, you don't mind to repeat the because's again and again in different ways, and that's what make things clearer. Why can't teachers, coaches, tutors realize that? Triple BAMMM!
StatQuest with Josh Starmer
Thank you very much!! :)
Ishita Saxena 9 days ago
thanks again my savior
StatQuest with Josh Starmer
Hooray! :)
sakunoful1 9 days ago
This is so very helpful but I can't help but cringe when he sings in the beginning or when he says bam x)
StatQuest with Josh Starmer
You can always skip over the intro. And the "bams" always follow something awesome. So when something awesome happens... skip the next 3 seconds. ;)
ZHIYUAN YAO 12 days ago
I cannot believe this channel only has 141K subs. Thank you for the great tutorials!
StatQuest with Josh Starmer
Thanks! :)
Timo Bohnstedt 15 days ago
I LOVE your videos!
StatQuest with Josh Starmer
Thank you! :)
Joel John J 19 days ago
Good Work 😀
StatQuest with Josh Starmer
Thanks! :)
queenforever 20 days ago
I went from BUMMED to DOUBLE BAM in six and a half minutes. God bless you!
StatQuest with Josh Starmer
Hooray! :)
Shivam Maheshwari 22 days ago
*Give this guy a medal*
StatQuest with Josh Starmer
yuthpati rathi 1 month ago
Amazing explanation
StatQuest with Josh Starmer
Thank you! :)
A Jawad 1 month ago
linear regression (aka least square) finally, now I can die in peace. you explain things in very nice way.
StatQuest with Josh Starmer
hari20001 1 month ago
Brilliant and clear and concise explanation: the best i have seen!!! Congrats and many thanks.
StatQuest with Josh Starmer
Thank you! :)
Payton Zhong 1 month ago
what an interesting nerd Lmaooo
Yusra Shaikh 1 month ago
MAN!!! i was reading about bias and variance trade off, but not a word got into my head...this video made it beyond clear!! thanks a ton!!
StatQuest with Josh Starmer
Hooray! I'm glad the video was helpful. :)
srikar goud 1 month ago
3:09 psst. I can listen to this all day.
SAMAR KHAN 1 month ago
Thank u soo much . Really liked the way u explained . I learnt n I njoyed it too. Plz make more videos like this on related topics.👍🏼👍🏼👍🏼👍🏼👍🏼
StatQuest with Josh Starmer
Thank you! Yes, I plan on making as many videos as possible.
João Pedro Voga 1 month ago
Amazing video, helped me a lot!
StatQuest with Josh Starmer
ati safarkhah 1 month ago
StatQuest with Josh Starmer
Chien-Hsun Lai 1 month ago
I like the way you say DOUBLE BAM!
StatQuest with Josh Starmer
Thank you! :)
sridevi A 1 month ago
Nice video. the concept was clearly explained with visualizations
StatQuest with Josh Starmer
Thank you!!! :)
Raman Jha 2 months ago
Man, you are amazing. I have listened to many so called self proclaimed educators who are nothing more than assholes. You are great !!!
StatQuest with Josh Starmer
Thanks! :)
Bhabesh Roy 2 months ago
Woah your original songs are beautiful too'
Bhabesh Roy 2 months ago
Awesome explanation
StatQuest with Josh Starmer
Thanks! :)
a a 2 months ago
I am a simple man: I see a StatQuest video - I give an upvote.
StatQuest with Josh Starmer
Nice! :)
Robert Smith 2 months ago
You should sell these videos as DVD sets. I bet a lot of educators would buy them.
Parijat Bandyopadhyay 2 months ago
Josh, Thanks for making such beautiful videos :) you rock man
StatQuest with Josh Starmer
@Pete Murphy Thanks! :)
Pete Murphy 2 months ago
Perfect statement, I absolutely agree, Josh is really a legend!!!!
StatQuest with Josh Starmer
Awesome! Thank you! :)
Sunny Dsouza 2 months ago
The BAMs are quite cringy
AdityaFingerstyle 2 months ago
At first yes but soon you'll crave to hear that. Double BAM !!
Tales Araujo 2 months ago
It's been a while I had sent a Portuguese translation to this video (b/c I really like your channel and I thought it was a great video to explain these "tricky" concepts - I wanted to show it to some awesome ppl who don't understand English properly). Why didn't you take it, though? =(
StatQuest with Josh Starmer
I am really, really sorry!!!! BRvid does not notify me when there is a translation, so I did not know that you did so much work. I just approved it (and about 15 other translations that I did not know about). Thank you so much for putting in the time and effort for the translation. I'm sorry it took so long for approval and from now on I will check for translations more frequently.
Daniel Rodrigues Pipa 2 months ago
The concepts of bias and variance are wrong. The correct definitions are bias = E(\hat{\theta} - \theta) and var = E[(\hat{\theta} - E(\hat{\theta}))^2]. They measure, respectively, how well the estimator performs on average (positive errors do cancel with negative ones!) and how spread out the outcomes of the estimator are.
Daniel Rodrigues Pipa 2 months ago
@StatQuest with Josh Starmer I checked the source and it's consistent with the statistical definition I gave, which is the only correct one. Machine Learning is just statistical inference in disguise and, thus, is supposed to use statistical terminology consistently.
StatQuest with Josh Starmer
For more information, check out page 33 in the Introduction to Statistical Learning in R (which is a free download):
StatQuest with Josh Starmer
You're talking about bias and variance in terms of statistics. This video describes how the terms are used in machine learning. They are related, but different.
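For readers following this thread, the statistical quantities being debated can be computed directly from repeated estimates of a known value. The sketch below is an illustration only: the estimate values, the true value `theta`, and the function name `bias_variance_mse` are hypothetical. It uses the standard definitions, in which bias is the average estimate minus the true value, variance is the spread of the estimates around their own average, and mean squared error is the spread around the true value, so that MSE = variance + bias².

```python
def bias_variance_mse(estimates, theta):
    """Return (bias, variance, mse) of a list of estimates of the true value theta."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    # Bias: signed errors are averaged, so positive and negative errors cancel.
    bias = mean_est - theta
    # Variance: spread of the estimates around their own mean.
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    # Mean squared error: spread of the estimates around the true value.
    mse = sum((e - theta) ** 2 for e in estimates) / n
    return bias, variance, mse


# Hypothetical estimates from five repeated experiments, with a known true value.
estimates = [4.8, 5.3, 5.1, 4.6, 5.2]
theta = 4.5
bias, variance, mse = bias_variance_mse(estimates, theta)

print(bias, variance, mse)
print(variance + bias ** 2)  # matches mse up to floating-point rounding
```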
Joel Vaz 2 months ago
Thanks man, i do not know what the start was about, but your video really helped me. Thanks
Ahmed 2 months ago
Extremely Great
StatQuest with Josh Starmer
Thank you! :)
Dropfire Music 2 months ago
I have understood not only the Bias and Variance, but also even more ML terminology that has been quite difficult for me to understand until this point! Keep it up brother! Very good job :)
StatQuest with Josh Starmer
Awesome!!! :)
Nelson Tovar 3 months ago
You have no idea how much you just helped me. I'm currently working with some real data, and I'm leaning towards a quadratic model instead of an exponential one. The first fits with R = 0.96 and the second with R = 0.90; however, the first model produces some negative values, and our response (Y) can't be negative. I was thinking of working with the absolute value of the quadratic model, but I'm not sure I should go that far just to keep a slightly better fit. I mean, R = 0.90 isn't bad either. I think this is the overfitting you just mentioned.
StatQuest with Josh Starmer
Awesome! Good luck with your models! :)
Raj Kiran Reddy Marri 3 months ago
Awesome and meticulous explanation , keep it up Josh !
StatQuest with Josh Starmer
Thank you! :)
Sepehr Alian 3 months ago
Thanks mate. That was great. Cheers
StatQuest with Josh Starmer
Thanks! :)
Tanishk Singh 3 months ago
Amazing explanation,you are Awesome!
StatQuest with Josh Starmer
Thank you! :)
superxp1412 3 months ago
Correct me if I'm wrong. At 2:56 the dot lines to show the distance to the line should be vertical toward the line.
superxp1412 3 months ago
@StatQuest with Josh Starmer Thanks for your reply. It makes sense.
StatQuest with Josh Starmer
@superxp1412 In machine learning (and Statistics), the distances from the data to the line that we use for prediction are measured by using a vertical line from the data and not a line perpendicular to the prediction line. The reason for this is that we want to use the values on the x-axis to predict values on the y-axis. Thus, a value on the x-axis corresponds to a value on the y-axis by way of the prediction line. If, instead, we measured the perpendicular distance between the data and the prediction line, then this relationship would be destroyed.
superxp1412 3 months ago
@StatQuest with Josh Starmer I mean the dotted lines on the left side should not be vertical. It should be perpendicular to the line to represent the distance between the dot and the line. Correct me If I'm wrong.
StatQuest with Josh Starmer
Ummmm. I'm not sure I understand your comment. The dotted lines on the left side are vertical. On the right side, the squiggly red dotted line fits the data perfectly, so the distance between it and the data is zero. Thus, there are no black dotted lines to draw on the right side.
Anonymous Noman 3 months ago
We are not actually calculating the distance between predicted points and the straight line (which is actually vertical /perpendicular on the line) Instead, to find the error you have to just calculate abs( predicted y - original_y ) (in this case it's actually parallel to Y axis)
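The difference between the two distances discussed in this thread can be made concrete with a small sketch. The line, the point, and the function names below are hypothetical and chosen only for illustration. The vertical residual is the quantity used when predicting y from x; the perpendicular distance is the ordinary geometric point-to-line distance.

```python
import math


def vertical_residual(x, y, slope, intercept):
    """Observed y minus the y predicted by the line at the same x."""
    return y - (slope * x + intercept)


def perpendicular_distance(x, y, slope, intercept):
    """Shortest geometric distance from the point (x, y) to the line."""
    return abs(slope * x - y + intercept) / math.sqrt(slope ** 2 + 1)


# Hypothetical line y = 1*x + 0 and a data point at (1, 3).
slope, intercept = 1.0, 0.0
x, y = 1.0, 3.0

vertical = vertical_residual(x, y, slope, intercept)
perpendicular = perpendicular_distance(x, y, slope, intercept)

print(vertical)       # error in the predicted y at x = 1
print(perpendicular)  # shorter, but not tied to any single x value
```

The vertical residual compares the observed and predicted y at the same x, which is why it is the one used for prediction error.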
SunkuSai 3 months ago
Question: What kind of bear is best?
valor36az 3 months ago
I wish you will write a book.
StatQuest with Josh Starmer
One day when I have more time... :)
Vaibhav Bisht 3 months ago
This is some quality educational content...Keep up the good work brother!! Definitely gonna buy some merch to support the channel!!
StatQuest with Josh Starmer
Awesome! Thank you! :)
Rohit Pingale 3 months ago
The videos of statsquest are sweet spots!
StatQuest with Josh Starmer
genie52 3 months ago
Wow this was so straight to the point with great visuals that I managed to figure out all in one go! Great stuff!
StatQuest with Josh Starmer
Awesome!!! :)
Maulik Naik 3 months ago
Also, can you tube customize the like button to BAM!! that would be Great.. ;)
StatQuest with Josh Starmer
That would be awesome! :)
Maulik Naik 3 months ago
I have watched many of your videos, and they have forced me to write a comment: StatQuest is AWESOME!! And @Josh Starmer, I am your fan. The way you begin your videos and go about explaining some of the most difficult concepts in Statistics and Machine Learning is GREAT. Many books and tutorials claim to make the complex simple, but rarely do so. This channel is not one of them; it truly makes things simple to understand. I have just one request (I think most of your followers would agree on this point): please write a book on Machine Learning and its application of various algorithms (maybe a series of books).
StatQuest with Josh Starmer
Thanks so much! If I ever have time, I'll write a book, but right now I only have time to do the videos.
George Carvalho 3 months ago
GateCrashers 4 months ago
Please explain what the term "bias" means in a linear regression formula. Please explain as simply as possible. Thank you.
Santhosh Murali 4 months ago
wow! you are the best!
StatQuest with Josh Starmer
Thanks! :)
Finn Janson 4 months ago
Thank goodness you exist... I've never ever understood why squaring the distances mattered until your foot note at 3:12
StatQuest with Josh Starmer
Nice! :)
Otonium 4 months ago
Double Bam.... so well narrated!
StatQuest with Josh Starmer
Thank you! :)
Lavneet Sharma 4 months ago
BAM!!!! I finally understood the idea
StatQuest with Josh Starmer
Hooray! :)
Marcus Cactus 4 months ago
Very instructive.
Bharath Shashanka Katkam
Thanks for the lovely explanation, Sir... Could we fit the squiggly line by using the Maximum Likelihood Estimation?
Bharath Shashanka Katkam
@Malini Aravindan Thank you, Ma'am. But what if there is a squiggly line with a large sample size? In that context of a large sample size, can't we go with Maximum Likelihood Estimation instead of REML?
Malini Aravindan 3 months ago
Use REML - restricted MLE
Divinity 4 months ago
Awesome thanks
Stefanos Moungkoulis 4 months ago
BAM. Subscribed.
yogurt1989 4 months ago
*Opens StatQuest Videos* -> Automatically clicks 'Like'
Abeer B 4 months ago
Awesome and very clear explanation!