The Bare Minimum
Last updated
Ensure that you are code literate and familiar with all the concepts and tools presented in the above page.
While the basic code literacy resources point you to courses that get you started quickly with Python, the algorithmic thinking necessary for doing machine learning is more subtle. Understanding problem complexity and figuring out which parts of a problem are non-deterministic are critical to formulating good machine learning models. Follow the evolution of thinking presented in the following course to get a better handle on the mindset you need:
Most machine learning code in Python is anchored around a few standard libraries. Getting familiar with numpy and pandas is critical for being able to manage and manipulate data. Learning data preprocessing techniques such as standardization and label encoding is a critical skill that will force you to learn how to use these libraries. More often than not, we find ourselves with data that needs to be standardized before ML models can gain meaningful insights from it.
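As a rough illustration of the two preprocessing steps named above, here is a minimal sketch using only numpy and pandas. The toy table, its column names (`height_cm`, `species`) and its values are made up for the example; the standardization uses the population standard deviation (numpy's default), which is one common convention and not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are invented for illustration.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "species": ["cat", "dog", "cat", "bird", "dog"],
})

# Standardization: subtract the mean and divide by the standard deviation,
# so the column ends up with mean 0 and (population) standard deviation 1.
values = df["height_cm"].to_numpy()
df["height_std"] = (values - values.mean()) / values.std()

# Label encoding: map each distinct category to an integer code.
# pd.factorize assigns codes in order of first appearance.
codes, categories = pd.factorize(df["species"])
df["species_code"] = codes

print(df)
print(list(categories))
```

Doing this by hand once makes it easier to see what a library helper does when you later reach for one.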
Finally, being able to visualize results and data is critical. Familiarity with visualization libraries such as matplotlib and seaborn will ensure that you're able to communicate what you are learning. These are fundamental tools for communicating ideas, showcasing results, and debugging problems through a statistical lens.
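A first look at a dataset often needs nothing more than a histogram and a scatter plot. The sketch below uses matplotlib only, on synthetic data generated for the example; the output filename `quick_look.png` is an arbitrary choice, and the `Agg` backend is selected so the script also runs on a machine without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Synthetic data for illustration: y depends linearly on x, plus noise.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))

# Left panel: how is a single variable distributed?
ax_hist.hist(x, bins=20)
ax_hist.set_title("Distribution of x")
ax_hist.set_xlabel("x")
ax_hist.set_ylabel("count")

# Right panel: how do two variables relate to each other?
ax_scatter.scatter(x, y, s=8)
ax_scatter.set_title("y against x")
ax_scatter.set_xlabel("x")
ax_scatter.set_ylabel("y")

fig.tight_layout()
fig.savefig("quick_look.png")
```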
Traditional ML algorithms such as regression, SVMs, and decision trees are mathematical models. As one continues to use these methods to solve classification and prediction problems, one learns to think about the following:
Machine learning as a system for model fitting
The maturity to make tradeoffs between methods based on preliminary statistical analysis and the perceived complexity of the problem statement (e.g. would you use polynomial regression if the phenomenon you are trying to model is only a linear system?)
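The polynomial-versus-linear question can be explored with a small experiment. This is a sketch under stated assumptions: the data is synthetic (a line `y = 3x + 1` plus Gaussian noise), the degrees compared (1 and 5) are arbitrary, and `errors_for_degree` is a helper name invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a linear phenomenon observed with measurement noise.
x_train = np.linspace(0.0, 1.0, 12)
y_train = 3.0 * x_train + 1.0 + rng.normal(scale=0.2, size=x_train.size)

# Held-out points on the noise-free line, used to judge generalization.
x_test = np.linspace(0.0, 1.0, 200)
y_test = 3.0 * x_test + 1.0

def errors_for_degree(degree):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return float(train_mse), float(test_mse)

for degree in (1, 5):
    train_mse, test_mse = errors_for_degree(degree)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

Training error can only go down as the degree grows, so it cannot justify the more complex model by itself. Comparing against held-out points is what shows whether the extra flexibility is fitting the phenomenon or the noise.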
TBA: Key Realization Points
When you encounter techniques for clustering and dimensionality reduction, it's important to remember that we are trying to manipulate the pre-images and the images (Math Definition of Images). We are essentially applying methods like autoencoding, bagging, and boosting to transform the data so that it has a clearer pre-image.
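One way to make the image and pre-image vocabulary concrete is to treat a dimensionality reduction method as a function and look at what it maps to and from. The sketch below uses PCA computed with numpy's SVD as a stand-in; this is an illustrative choice, not a method the page prescribes. The data is synthetic, and `encode` and `decode` are names invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 points in 3-D that lie close to a 2-D plane.
latent = rng.normal(size=(100, 2))
mixing = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
X = latent @ mixing + rng.normal(scale=0.01, size=(100, 3))

# PCA via SVD of the centered data.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:2]  # top two principal directions, shape (2, 3)

def encode(points):
    """Map 3-D points to their 2-D image under the PCA projection."""
    return (points - mean) @ components.T

def decode(codes):
    """Pick one representative 3-D point from the pre-image of each 2-D code."""
    return codes @ components + mean

Z = encode(X)       # image of the dataset: shape (100, 2)
X_hat = decode(Z)   # points in 3-D that map back to the same codes
reconstruction_error = float(np.mean((X - X_hat) ** 2))
print(Z.shape, reconstruction_error)
```

Many different 3-D points share the same 2-D code, so the pre-image of a code is a whole line in the original space; `decode` returns the member of that line lying on the principal plane.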
TBA: Key Realization Points
Optimizers are at the heart of all machine learning techniques. A utilitarian understanding of the different kinds of optimizers and how they are used is a foundational part of machine learning. Here's a course that can help you develop an understanding of how optimizers work:
TBA: More relevant ML optimizers material.
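To get a feel for what an optimizer does, it helps to write the simplest one by hand. The sketch below runs plain gradient descent on a least-squares problem with numpy. The data, the learning rate of 0.1, and the 200 steps are assumptions chosen so that this small example converges; they are not general recommendations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical regression problem: y = X @ true_w + noise.
X = rng.normal(size=(200, 2))
true_w = np.array([2.0, -1.0])
y = X @ true_w + rng.normal(scale=0.1, size=200)

w = np.zeros(2)        # initial guess for the weights
learning_rate = 0.1    # step size, picked by hand for this toy problem

for step in range(200):
    residual = X @ w - y
    # Gradient of the mean squared error with respect to w.
    grad = 2.0 * X.T @ residual / len(y)
    # Move against the gradient to reduce the loss.
    w -= learning_rate * grad

print("estimated weights:", w)
```

Optimizers such as SGD with momentum or Adam change how the update step is computed from the gradient, but the loop keeps this same shape.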
Building neural networks is very easy today, so being fluent in PyTorch or TensorFlow (Aegion only sticks to PyTorch) will take you a long way. Take a look at these series, which get you from 0 -> 0.1.
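For a sense of how little code a small network takes, here is a minimal PyTorch sketch. The task (fitting `y = 2x - 1` from noisy samples), the layer sizes, the learning rate, and the number of epochs are all arbitrary choices made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy task: learn y = 2x - 1 from noisy samples.
x = torch.linspace(-1.0, 1.0, 64).unsqueeze(1)   # shape (64, 1)
y = 2.0 * x - 1.0 + 0.05 * torch.randn_like(x)

# A small fully connected network with one hidden layer.
model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

initial_loss = loss_fn(model(x), y).item()

for epoch in range(200):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = loss_fn(model(x), y)    # forward pass and loss
    loss.backward()                # backpropagate
    optimizer.step()               # update the parameters

final_loss = loss_fn(model(x), y).item()
print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

The four lines inside the loop are the pattern that recurs in almost every PyTorch training script, whatever the architecture.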
Start studying neural networks for discriminative deep learning, including ANNs, CNNs, RNNs, and other networks for sequence data processing, then follow this up by learning generative deep learning, including GANs, VAEs, and Transformers.
TBA: Key realizations, editorial on how to approach different classes of neural networks, utilizing formalism and learning to formalize.