Welcome to the third and final part of the “So You Want to Be a Data Scientist” series! In this concluding installment, I’ll share some of the best resources to help you take your skills to the next level—whether you’re just starting out or looking to deepen your expertise. While this wraps up the series, it’s far from the end of the conversation. I’ll continue exploring data science from different angles in future posts, diving deeper into topics that matter most to aspiring professionals and seasoned experts alike. Let’s get started!

Online Resources

You will want to supplement your formal education with online resources, such as a data science bootcamp on Udemy or Coursera. Andrew Ng’s Machine Learning course on Coursera is another excellent resource. These are especially helpful if your school did not offer these courses, but even if you took a formal class, I recommend at least skimming through one of these online courses to reinforce your knowledge. Keep in mind that these courses won’t make you an expert, but they are great for building a strong foundation.

Additionally, MIT’s OpenCourseWare is a fantastic study resource. I highly recommend Gilbert Strang’s Linear Algebra, which you can take online, ideally during or close to the time you’re enrolled in a linear algebra course.

1. Coursera

  • Home to Andrew Ng’s Machine Learning course, along with many other university-backed courses.

2. edX

  • Online platform with courses from top universities.

3. DataCamp

  • Focuses on data science and programming skills.

4. Udemy 

  • Marketplace for various courses. 

5. MIT OpenCourseWare

  • Free course materials from MIT, including Gilbert Strang’s Linear Algebra.

6. Khan Academy

  • Free foundational courses in math and science, great for beginners.

7. TrevTutor

  • Paid website with a variety of math and stats topics.

Recommended Books

These books provide a strong foundation in data science, covering linear algebra, statistics, machine learning, and proof-based reasoning. They will help you build practical skills, deepen theoretical knowledge, and master the mathematical and statistical foundations essential for tackling real-world problems and building robust models. Whether you’re starting out or filling gaps in your expertise, this collection is a reliable guide for your data science journey.

1. Introduction to Linear Algebra by Gilbert Strang (Strongly Recommended)

  • A foundational text that covers essential linear algebra concepts, critical for data science.

2. Categorical Data Analysis by Alan Agresti 

  • A comprehensive guide to analyzing categorical data, with practical applications.

3. Probability and Statistics for Engineering and the Sciences by Jay Devore  

  • An accessible introduction to probability and statistics, ideal for applied contexts.

4. Mathematical Proofs: A Transition to Advanced Mathematics by Gary Chartrand, Albert D. Polimeni, and Ping Zhang

  • Perfect for developing proof-writing skills and logical thinking.

5. Proofs That Really Count by Arthur T. Benjamin and Jennifer J. Quinn

  • A helpful resource for those struggling with combinatorial proofs.

6. Linear Models with R by Julian J. Faraway (Strongly Recommended)

  • Focused on linear modeling with practical implementation in R.

7. An Introduction to Statistical Learning by Gareth James et al.  

  • A beginner-friendly introduction to machine learning concepts and algorithms.

8. Statistical Inference by George Casella and Roger L. Berger  

  • A deeper dive into the theory behind statistical methods.

Free Resources

YouTube Channels

YouTube has many great, free resources if you are willing to sit through occasional ads. Keep in mind that these videos are not necessarily vetted for accuracy.

1. Sarada Herke

  • Focuses on graph theory, offering clear explanations for foundational concepts.

2. 3Blue1Brown

  • Provides exceptional visual explanations for linear algebra, calculus, and neural networks, making complex topics intuitive.

3. jbstatistics

  • Specializes in probability and statistics concepts, with a strong focus on regression, offering clear examples and applications.

Graphing Tools

These tools help you visualize functions in two and three dimensions.

1. Desmos

  • An intuitive and powerful tool for 2D graphing, ideal for visualizing functions and data.

2. Academo 3D Surface Plotter

  • A user-friendly platform for creating 3D visualizations, perfect for exploring multidimensional concepts.

Final Thoughts

Becoming a data scientist is as much about navigating challenges as it is about acquiring skills. Throughout this series, we’ve covered the essential knowledge, the critical mindset, and now the resources to help you succeed. But let’s not overlook the common pitfalls that can derail even the most dedicated learners. Whether it’s trying to master every tool at once, overlooking the importance of strong communication skills, or getting lost in theory without real-world application, these obstacles can slow your progress.

The key is to stay focused on the fundamentals, build a solid foundation, and approach learning with curiosity and patience. Mistakes and setbacks aren’t failures—they’re opportunities to refine your understanding and grow. As this series concludes, I hope it has given you the guidance and confidence to overcome these challenges and move forward with purpose.

While this wraps up the “So You Want to Be a Data Scientist” series, it’s far from the end of the discussion. I’ll continue exploring the field from different angles in future posts, offering insights to help you tackle new challenges and seize emerging opportunities. Keep learning, keep experimenting, and most importantly, keep asking questions. That’s the real secret to success in data science.