With OpenAI, Midjourney, and all the other cool kids generating your fever dreams, let’s look at what the core of machine learning is.
Machine learning is a subset of artificial intelligence (AI) that focuses on developing systems capable of learning and improving from experience without being explicitly programmed. This learning process involves feeding large amounts of data to an algorithm and allowing it to adjust its behavior and improve its performance at making accurate predictions or decisions.
In essence, machine learning involves three core components:
- Model: This is the system or algorithm that makes predictions or identifications.
- Parameters: These are the factors considered by the model to make predictions. They are the “knowledge” the model learns from the training data.
- Learner: This is the system that adjusts the parameters and aspects of the model based on the feedback it gets from its predictions to improve the model’s performance.
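As a minimal sketch of how these three components fit together, here is a toy linear model in plain Python (the data and learning rate are made up for illustration): the model predicts, the parameters hold what it has learned, and the learner adjusts them based on feedback from its prediction errors.

```python
# Model: predicts y from x using two parameters (weight, bias).
def predict(x, w, b):
    return w * x + b

# Toy training data: points on the line y = 2x + 1.
data = [(0, 1), (1, 3), (2, 5), (3, 7)]

# Learner: nudges the parameters to shrink the prediction error.
w, b = 0.0, 0.0          # parameters start with no "knowledge"
learning_rate = 0.05
for _ in range(2000):
    for x, y in data:
        error = predict(x, w, b) - y
        w -= learning_rate * error * x  # gradient of squared error w.r.t. w
        b -= learning_rate * error      # gradient of squared error w.r.t. b

print(round(w, 2), round(b, 2))  # close to 2.0 and 1.0
```

After enough passes over the data, the learner recovers the line the data was drawn from; the "knowledge" is nothing more than the final values of `w` and `b`.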
There are several types of machine learning, including:
- Supervised Learning: The model is provided with labeled training data and the desired outputs. The goal is to learn a general rule that maps inputs to outputs. An example might be a spam detection system, where emails are labeled as “spam” or “not spam”.
- Unsupervised Learning: The model is given data without any specific instructions on what to do with it. The goal is to identify patterns and relationships within the data. An example might be customer segmentation in a dataset of customer behavior information.
- Reinforcement Learning: The model learns to make decisions by performing certain actions and receiving feedback in terms of rewards or punishments. An example might be a machine learning algorithm learning to play a game by playing many rounds and improving its strategy based on the results.
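To make the supervised case concrete, here is a toy spam detector in plain Python (the messages and labels are invented for illustration, and real spam filters are far more sophisticated). It "learns" word counts from labeled examples, which is exactly the general rule mapping inputs to outputs described above.

```python
from collections import Counter

# Labeled training data: (message, label) pairs are the supervised signal.
train = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting at noon", "not spam"),
    ("lunch at noon tomorrow", "not spam"),
]

# Learn per-word counts for each label.
counts = {"spam": Counter(), "not spam": Counter()}
for text, label in train:
    counts[label].update(text.split())

def classify(text):
    # Score each label by how often it has seen the message's words.
    scores = {
        label: sum(c[w] for w in text.split())
        for label, c in counts.items()
    }
    return max(scores, key=scores.get)

print(classify("win a free prize"))  # "spam"
print(classify("noon meeting"))      # "not spam"
```

An unsupervised version of the same problem would get the messages without any labels and would have to discover the two groups on its own.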
Machine learning is a rapidly evolving field and forms the backbone of many modern technologies, from recommendation systems (like those used by Netflix or Amazon) to autonomous vehicles.
Is it still on the cutting edge, in the age of generative AI?
Yes, machine learning is still considered cutting-edge, not least because generative AI is itself a subset of machine learning. Specifically, most generative AI models belong to a category of machine learning called deep learning.
Machine learning is a broad field that includes many different techniques and models. While generative AI models, such as GANs (Generative Adversarial Networks) or transformer models like GPT, have gained a lot of attention due to their impressive capabilities, they are only a part of the larger field of machine learning.
Machine learning includes areas like reinforcement learning, supervised learning, unsupervised learning, semi-supervised learning, and more, all of which continue to be areas of active research and development. Techniques such as decision trees, random forests, and support vector machines are still commonly used and developed, and new methods continue to be proposed and explored.
Moreover, machine learning as a field is always evolving, with new technologies and methodologies continually being developed to solve new and complex problems. For example, techniques for dealing with imbalanced datasets, handling missing data, model explainability, and privacy-preserving machine learning are just some of the areas that are considered cutting-edge.
So, while generative AI is an exciting part of machine learning, it’s just one part of a larger, dynamic, and continually evolving field.
What’s the connection between machine learning and LLMs like ChatGPT?
Language models like ChatGPT, which is built on GPT (Generative Pre-trained Transformer), are a type of machine learning model designed specifically to work with language data. These models fall under a subset of machine learning called deep learning, which is concerned with algorithms called artificial neural networks, inspired by the structure and function of the brain.
In the case of GPT, the underlying architecture is a transformer model, which is a type of artificial neural network that uses self-attention mechanisms. The model is trained on a large corpus of text data, and learns to predict the next word in a sentence given the previous words. This is often described as self-supervised learning (a form of unsupervised learning), because the training labels come from the text itself: the model learns the patterns and structures in the language data without being explicitly told what to learn.
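The "predict the next word" objective can be illustrated far more simply than a transformer with a bigram model (the corpus below is a toy example): for every position, the training label is just whatever word comes next in the text.

```python
from collections import defaultdict, Counter

corpus = "the cat sat on the mat the cat ran".split()

# Self-supervised targets: each word's "label" is the word that follows it.
next_words = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_words[prev][nxt] += 1

def predict_next(word):
    # Most frequent continuation seen during training.
    return next_words[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" (seen twice, vs "mat" once)
```

A transformer does conceptually the same thing, but instead of a lookup table it uses self-attention over the entire preceding context, so its prediction can depend on words far earlier in the text.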
During training, the model adjusts its internal parameters based on the differences between its predictions and the actual data. This process is driven by a technique called gradient descent, which is a common method used in machine learning to minimize the error of a model.
Once trained, the model can generate text by taking an input (which can be a single word, a sentence, or longer text), and predicting what comes next. This generation can be controlled to some degree by adjusting parameters such as temperature and maximum token length.
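The effect of temperature can be sketched with a small softmax sampler. The logits below are made up rather than taken from a real model, but the mechanism is the standard one: dividing logits by the temperature before the softmax sharpens or flattens the distribution the next token is drawn from.

```python
import math
import random

def sample_next(logits, temperature=1.0, rng=random):
    # Low temperature sharpens the distribution (more predictable output);
    # high temperature flattens it (more varied output).
    scaled = [l / temperature for l in logits.values()]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(list(logits), weights=probs, k=1)[0]

# Hypothetical next-token logits.
logits = {"cat": 2.0, "dog": 1.0, "pizza": -1.0}

print(sample_next(logits, temperature=0.1))  # almost always "cat"
print(sample_next(logits, temperature=5.0))  # much closer to a coin flip
```

Maximum token length is simpler still: generation just stops once that many tokens have been sampled.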
So, large language models (LLMs) like ChatGPT are a product of machine learning techniques, specifically deep learning, and they leverage self-supervised learning and neural networks to understand and generate human-like text.