What is Machine Learning?



Welcome to the "Machine Learning Fundamentals" series at Data Science Flow! In this blog post, we will be discussing the following topics:

  1. What is Machine Learning?

  2. Different types of machine learning, including:

  • Supervised Learning

  • Unsupervised Learning

  • Semisupervised Learning

  • Reinforcement Learning

  • Shallow Learning

  • Deep Learning

Now let us begin!


What is Machine Learning?


At its core, machine learning is a computational approach that enables computer programs to improve their performance on a task by learning from experience. More formally:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
-- Mitchell, 1997

The task T can be thought of as the problem we would like to solve, for instance, predicting house prices. The experience E is then the home sales data, and the performance measure P is the difference between the predicted price and the actual one. In this context, machine learning means training a computer program (a.k.a. a model) to make accurate house price predictions by learning from home sales data.
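To make T, E, and P concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the five home sales are made up, the model is a straight line, and training uses simple stochastic gradient descent. The point is only that the performance measure P (mean absolute error) gets better as the program sees more experience E (passes over the sales data).

```python
# Hypothetical home sales (experience E):
# (square footage in thousands, sale price in thousands of dollars).
sales = [(1.0, 200.0), (1.5, 290.0), (2.0, 410.0), (2.5, 500.0), (3.0, 590.0)]


def predict(weight, bias, sqft):
    """Task T: predict a house price from its square footage."""
    return weight * sqft + bias


def mean_absolute_error(weight, bias, data):
    """Performance measure P: average gap between predicted and actual price."""
    return sum(abs(predict(weight, bias, x) - y) for x, y in data) / len(data)


def train_one_pass(weight, bias, data, learning_rate=0.05):
    """One pass of stochastic gradient descent on the squared error."""
    for x, y in data:
        error = predict(weight, bias, x) - y
        weight -= learning_rate * error * x
        bias -= learning_rate * error
    return weight, bias


weight, bias = 0.0, 0.0
history = [mean_absolute_error(weight, bias, sales)]  # P before any learning
for _ in range(200):
    weight, bias = train_one_pass(weight, bias, sales)
    history.append(mean_absolute_error(weight, bias, sales))

print(f"P before training: {history[0]:.1f}")
print(f"P after training:  {history[-1]:.1f}")
```

The error after training is far smaller than the error before it, which is exactly what "learning" means in Mitchell's definition.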


Different Types of Machine Learning


There are several ways to categorize machine learning approaches based on different criteria. Let us explore some of these categories.


Supervised Learning


Supervised learning is a type of machine learning where the model is trained on labeled data. This means that the dataset contains both features (e.g., the square footage of a house) and a target (e.g., the house price). The goal of supervised learning is to learn a mapping from the features to the target so that the model can make accurate predictions on new, unseen data.


Supervised learning approaches can be further divided into two categories:

  • Regression: where the target is a continuous variable (such as a house price, which can take an infinite number of values)

  • Classification: where the target is a discrete variable (such as the outcome of a flu test, which can only take a finite number of values)
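The difference between the two is easiest to see side by side. The sketch below uses made-up numbers and two deliberately simple models chosen for illustration (a least-squares line for regression and a 1-nearest-neighbor rule for classification). Neither the data nor the helper names come from a real library.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def nearest_neighbor_label(xs, labels, query):
    """1-nearest-neighbor rule: return the label of the closest training sample."""
    closest = min(range(len(xs)), key=lambda i: abs(xs[i] - query))
    return labels[closest]


# Regression: the target (price) is continuous.
square_feet = [1000, 1500, 2000, 2500]
prices = [200_000, 300_000, 400_000, 500_000]
slope, intercept = fit_line(square_feet, prices)
predicted_price = slope * 1800 + intercept

# Classification: the target (flu test outcome) is discrete.
temperatures = [36.5, 36.8, 38.2, 39.0]  # body temperature in degrees Celsius
outcomes = ["negative", "negative", "positive", "positive"]
predicted_outcome = nearest_neighbor_label(temperatures, outcomes, 38.5)

print(predicted_price)    # a number on a continuous scale
print(predicted_outcome)  # one of a finite set of labels
```

In both cases the model learns from pairs of features and targets. Only the kind of target changes.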


Unsupervised Learning


In unsupervised learning, the model is trained on unlabeled data, meaning that it only has features but no target. The primary objective of unsupervised learning is to find structures or patterns within the data.


Some of the most important areas in unsupervised learning include:

  • Clustering Analysis: dividing the data into clusters

  • Anomaly Detection or Novelty Detection: identifying samples that differ from most of the data (anomaly detection, where the training data may itself contain outliers) or new samples that differ from everything seen during training (novelty detection)

  • Dimensionality Reduction: transforming high-dimensional data into low-dimensional data

  • Association Rule Learning: discovering rules that describe which feature values tend to occur together
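As an illustration of the first of these areas, here is a minimal k-means clustering sketch in plain Python. K-means is just one of many clustering algorithms. The six points and the choice of starting centroids are assumptions made for the example. Notice that the data contains only features and no target.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and moving centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest_centroid(point, centroids)].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    assignments = [nearest_centroid(point, centroids) for point in points]
    return centroids, assignments


# Unlabeled, made-up data: two visually obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, assignments = k_means(points, centroids=[points[0], points[3]])

print(assignments)  # cluster index for each point
print(centroids)    # learned cluster centers
```

The algorithm was never told which group each point belongs to. It recovers the two groups from the structure of the data alone.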


Semisupervised Learning


Semisupervised learning combines elements of both supervised and unsupervised learning. This approach involves training a model on a dataset that contains a mix of labeled and unlabeled samples, typically a few labeled samples and many unlabeled ones. The primary goal is to use the unlabeled data to make the model perform better than it would if trained on the labeled samples alone. We will come back to this when discussing autoencoders in later posts.
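One simple way to mix labeled and unlabeled data is self-training: fit a model on the labeled samples, let it label the unlabeled samples it is confident about, and refit. The sketch below is only an illustration of that idea under our own assumptions (one feature, a nearest-centroid classifier, a made-up confidence margin). It is not the autoencoder-based approach mentioned above.

```python
def centroids_by_label(samples):
    """Mean feature value per label, from a list of (value, label) pairs."""
    groups = {}
    for value, label in samples:
        groups.setdefault(label, []).append(value)
    return {label: sum(values) / len(values) for label, values in groups.items()}


def classify(centroids, value):
    """Nearest-centroid rule: pick the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def self_train(labeled, unlabeled, rounds=3, margin=1.0):
    """Repeatedly pseudo-label the unlabeled values the model is confident about."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    for _ in range(rounds):
        centroids = centroids_by_label(labeled)
        still_unlabeled = []
        for value in remaining:
            distances = sorted(abs(c - value) for c in centroids.values())
            # Confident only if one centroid is clearly closer than the runner-up.
            if distances[1] - distances[0] >= margin:
                labeled.append((value, classify(centroids, value)))
            else:
                still_unlabeled.append(value)
        remaining = still_unlabeled
    return centroids_by_label(labeled), remaining


# Made-up data: two labeled samples and six unlabeled ones.
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [1.5, 2.0, 2.5, 8.0, 8.5, 4.8]
final_centroids, undecided = self_train(labeled, unlabeled)

print(final_centroids)  # centroids refined with the pseudo-labeled samples
print(undecided)        # values the model was never confident about
```

The unlabeled samples shift the class centroids away from where the two labeled samples alone would put them, which is the sense in which unlabeled data contributes to the model.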


Reinforcement Learning


Reinforcement learning is an approach in which an agent interacts with an environment and learns which actions to take in order to maximize a cumulative reward. The agent learns from the feedback it receives from the environment, in the form of rewards (or penalties), for the actions it takes. Reinforcement learning has been applied to areas such as autonomous driving and game playing (e.g., AlphaGo Zero).
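The agent, environment, action, and reward loop can be sketched with tabular Q-learning, one classic reinforcement learning algorithm. The environment below is a hypothetical five-cell corridor invented for this example: the agent starts in cell 0 and receives a reward of 1.0 when it reaches cell 4. The hyperparameters are arbitrary but reasonable choices.

```python
import random

N_STATES = 5                 # cells 0..4 on a line; cell 4 is the goal
ACTIONS = ("left", "right")


def step(state, action):
    """Hypothetical environment: move one cell and reward reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: usually exploit the best known action, sometimes explore."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward plus discounted future value.
            future = 0.0 if done else gamma * max(q[next_state].values())
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning()
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)  # preferred action in each non-goal cell
```

Nobody tells the agent that moving right is correct. It discovers that policy from the rewards alone, which is what separates reinforcement learning from supervised learning.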


Shallow Learning and Deep Learning


Shallow learning refers to traditional machine learning techniques that do not rely on deep neural networks, such as linear models and decision trees. These methods are often applied to tabular data. Deep learning, on the other hand, is based on deep neural networks and is particularly well suited to tasks involving perceptual data such as text, images, audio, and video.


Takeaways


In this blog post, we have provided an overview of machine learning and its various types. Here are the key takeaways:

  • When the data has a target, we can do supervised learning

  • When the data has no target, we can do unsupervised learning

  • Shallow learning is more suitable for tabular data

  • Deep learning is more suitable for text, images, audio, and video

Thank you for reading. We will see you next time!


References

  • Mitchell, T. M., 1997. Machine Learning. McGraw Hill.



