Machine Learning (ML) is a fast-growing field because of its impact on our day-to-day activities. Read about ML fundamentals in my previous blog. In this article, I will be writing about the methods used to train ML models on data.
Supervised and unsupervised learning are subcategories of machine learning. The words themselves are probably familiar. For instance, we supervise (monitor) a project so we can keep track of its progress and maybe report on it. Likewise, when nobody monitors a project, we can say the project is unsupervised.
Supervised Learning
Supervised learning, also known as supervised machine learning, is a subcategory of ML. Supervised learning uses labeled datasets to train ML algorithms to classify data or predict outcomes accurately. Supervised learning is designed to learn by example.
The dataset above is the Titanic dataset, and it is an example of a labeled dataset. The target variable (output) in this dataset is "Survived", the column marked in red. Supervised learning in this case will use the input variables, the columns marked in yellow, to predict the output column.
Supervised learning makes the machine learn explicitly because data with a clear output is given.
In supervised machine learning, the machine learns under supervision.
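To make the idea of a labeled dataset concrete, here is a minimal sketch of how the inputs and the target are separated before training. Only the "Survived" column comes from the dataset described above. The other column names (`Pclass`, `Sex`, `Age`), the row values, and the helper `split_features_target` are assumptions made up for illustration.

```python
# Hypothetical rows in the spirit of the Titanic dataset.
# The values are made up for illustration, not taken from the real data.
passengers = [
    {"Pclass": 3, "Sex": "male", "Age": 30.0, "Survived": 0},
    {"Pclass": 1, "Sex": "female", "Age": 41.0, "Survived": 1},
    {"Pclass": 2, "Sex": "female", "Age": 19.0, "Survived": 1},
    {"Pclass": 1, "Sex": "male", "Age": 60.0, "Survived": 0},
]

TARGET = "Survived"  # the labeled output column


def split_features_target(rows, target):
    """Separate each row into input variables (X) and the label (y)."""
    X = [{k: v for k, v in row.items() if k != target} for row in rows]
    y = [row[target] for row in rows]
    return X, y


X, y = split_features_target(passengers, TARGET)
print(X[0])  # the input variables of the first passenger
print(y)     # the labels the model learns to predict
```

A supervised algorithm is then trained on `X` and `y` together, which is what "learning by example" means in practice.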
Supervised learning is used to solve two types of problems:
Classification
Regression
Classification
Classification is a supervised machine learning method where the model tries to correctly predict the label of the input data. The output of this kind of ML problem is a categorical value, e.g. Will it be hot or cold tomorrow? Is an email spam or not? Note that there can be more than two categories, e.g. predicting the grade of students in a class (Grade A, B, C, D).
Classification algorithms include
Decision trees
Support Vector Machines (SVM)
Logistic regression
Naive Bayes classifier etc.
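As a rough illustration of classification, the sketch below trains a tiny logistic regression (one of the algorithms listed above) with plain gradient descent on the spam example. The data, the single feature (a count of suspicious words), and the function names are assumptions for illustration. In practice you would normally reach for an ML library rather than write this by hand.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, lr=0.1, epochs=5000):
    """Fit a weight w and bias b for a single feature using gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b):
    """Return 1 (spam) if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical labeled data: suspicious-word count -> spam (1) or not spam (0)
counts = [0, 1, 2, 3, 6, 7, 8, 9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic_regression(counts, labels)
print(predict(1, w, b), predict(8, w, b))
```

The model outputs a category (spam or not spam), not a number on a continuous scale, which is what makes this a classification problem.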
Regression
Regression is a supervised machine learning method used to predict continuous numerical values, e.g. What is the temperature going to be tomorrow? What is the price of a house given its features?
The ultimate goal of a regression algorithm is to fit a best-fit line or curve through the data, so that it can predict the output for new inputs.
Regression algorithms include
Linear regression
Regression trees etc.
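To show what a best-fit line looks like in code, here is a minimal sketch of simple linear regression using the ordinary least squares formulas. The house sizes and prices are made-up numbers, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return the slope and intercept of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house size in square metres -> price in thousands
sizes = [50, 70, 90, 110]
prices = [150, 200, 250, 300]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 100 + intercept  # predicted price for a 100 square metre house
print(slope, intercept, predicted)
```

Unlike the classification case, the prediction here is a continuous number, so any value along the line is a valid output.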
Some Applications of Supervised Learning
You might be curious to know some areas where supervised learning is being used. Check them out below.
Risk Assessment: Supervised learning is used to assess risk in financial services in order to reduce the companies' risk exposure.
Image Classification: Image classification is one of the key areas where supervised machine learning is demonstrated. For example, Facebook can recognize your friend in a picture from an album of tagged photos.
Fraud Detection: Supervised learning is used to identify whether transactions are fraudulent or not.
Visual Recognition: The ability of a machine learning model to identify objects, places, people, actions, and images.
Unsupervised Learning
Unsupervised learning is where you have unlabeled data (no target variable) in the dataset. Unlike supervised learning, there are no correct answers and there is no teacher/supervisor. Algorithms are left on their own to discover and present the interesting structure in the data.
Unsupervised learning = learning without labels
Importance of Unsupervised Learning
In the real world, most data is available in an unstructured, unlabeled form, so it is difficult to draw insights from it. Unsupervised learning helps to find similarities in the data and separate it into groups, each of which can be given its own label. In this way, unstructured data can be converted into a structured format. Because of this ability to draw insights, deduce patterns from data, and learn from them, unsupervised learning is often compared with human intelligence.
Some Applications of Unsupervised Learning
Here is the fun part of unsupervised learning. Let's check out some areas where it is used.
Market Segmentation: This is the practice of dividing consumers into groups based on shared needs, desires, and preferences. It is an effective method used by e-commerce and physical stores to satisfy their customers.
Social Network Analysis (SNA): This involves mapping and measuring the relationships between people, groups, organizations, computers, URLs, and other connected entities. An example is Facebook, a social network site that uses SNA to identify and recommend potential friends based on friends-of-friends.
Astronomical Data Analysis: A simple visualization of complicated data makes the science behind it seem obvious. Unsupervised learning techniques help in clustering the galaxies based on certain characteristics.
Use Cases of Unsupervised Learning
Unsupervised learning is broadly used for two things in ML:
Clustering: This deals with finding a structure in a collection of unlabeled data.
Dimensionality Reduction: This includes techniques that reduce the number of input variables in a dataset.
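As one possible illustration of dimensionality reduction, the sketch below uses the idea behind principal component analysis (PCA), a technique I am choosing for the example, to squeeze two input variables into one. It only handles the two-variable case, and the points and the function name `pca_2d_to_1d` are assumptions for illustration.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto the direction along which they vary the most."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix
    sxx = sum(cx * cx for cx, _ in centered) / n
    syy = sum(cy * cy for _, cy in centered) / n
    sxy = sum(cx * cy for cx, cy in centered) / n

    # Angle of the main axis of that covariance matrix
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # One number per point instead of two
    reduced = [cx * direction[0] + cy * direction[1] for cx, cy in centered]
    return reduced, direction


# Hypothetical data: two strongly related input variables (y is always 2 * x)
points = [(1, 2), (2, 4), (3, 6), (4, 8)]
reduced, direction = pca_2d_to_1d(points)
print(direction)
print(reduced)
```

Because the two variables here move together perfectly, a single number per point keeps all of the variation, so nothing is lost by dropping from two inputs to one.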
K-means clustering, hierarchical clustering, and the Apriori algorithm are among the most commonly used unsupervised learning algorithms.
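Here is a minimal sketch of K-means clustering to show how groups can be found without any labels. The points are made up, and starting from the first k points is a simplification I am assuming to keep the example repeatable. Real implementations usually pick the starting centroids more carefully.

```python
def nearest(point, centroids):
    """Return the index of the centroid closest to the point."""
    distances = [(point[0] - c[0]) ** 2 + (point[1] - c[1]) ** 2 for c in centroids]
    return distances.index(min(distances))


def kmeans(points, k, iterations=10):
    """Group 2-D points into k clusters; returns the centroids and a label per point."""
    centroids = list(points[:k])  # naive start: the first k points
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Step 2: move each centroid to the mean of its assigned points
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its old centroid
                centroids[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical unlabeled data: two loose groups of points
points = [(1, 1), (8, 8), (1.5, 2), (9, 9), (1, 1.5), (8.5, 9.5)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

Notice that no target column is passed in. The algorithm only sees the points and discovers the two groups on its own.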
You stayed till the end! Awesome
Both supervised and unsupervised learning are useful in the world of machine learning. We just need to know when and how to apply them to solve problems.
That's a wrap! If you have thoughts on this, be sure to leave a comment.
If this article is helpful to you, kindly give some claps
Stay tuned for more beginner-friendly articles from me.