See the original version of this article on LinkedIn Pulse.
Machine learning. You have heard of it. Machines can learn now. They no longer just repeat the same task over and over again. And let’s be honest, it was about time computers became a bit more independent. Who wouldn’t like a super accurate auto-generated email response?
So, What is Machine Learning?
It is the ability of a machine to learn something using datasets and a basic set of algorithms. These datasets are often referred to as ‘experiences’, much like the experiences humans learn from. The algorithms add the ability to compare and contrast different objects, much as humans use their five senses to do the same. Machine learning is a solid step toward machines that learn like humans, in that it takes past experiences and uses them to process future ones.
There are three main types of machine learning: supervised, unsupervised, and reinforcement learning. We’re going to talk about them with respect to TurtleID, a hypothetical program that identifies turtles in images.
Supervised Machine Learning
Let’s think of a children’s book with animal photos. As you turn the pages, you see a picture of an animal and a caption that gives the animal’s name.
Seeing this picture for the first time and reading the caption “Turtle,” a child will form the initial connection between what a turtle looks like and what it is called. A few books later, and perhaps supported by real-life interactions where the child is exposed to turtles of all shapes and sizes, the connection between the form of the animal and its name is solid in the brain.
Supervised machine learning mimics this approach. We give the machine both the input and the resulting output. As a child you would see a picture of the turtle in a book, the input, with a caption that says “Turtle,” the output. For the TurtleID dataset we would have a large set of pictures, and the output we give it for each picture is either “This is a turtle” or “This is not a turtle.” From these examples TurtleID learns what a turtle looks like, using preset features such as length, width, color and any other useful features we decide on.
TurtleID can use past experiences in the dataset to have a good chance of predicting whether the next picture is or isn’t a turtle. Again, this works much like a child learning what a turtle is and using the five senses to identify turtles based on past experiences.
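To make the idea concrete, here is a minimal sketch of supervised learning in Python. It is not how a real image classifier works: it assumes each picture has already been reduced to two made-up features (length and width in centimeters), all of the numbers are invented, and the `predict` function is a hypothetical name for a simple nearest-neighbor rule that copies the label of the most similar labeled example.

```python
import math

# Hypothetical labeled "experiences": (length_cm, width_cm) -> label.
# Every number here is invented for illustration.
LABELED = [
    ((30.0, 22.0), "turtle"),
    ((25.0, 20.0), "turtle"),
    ((35.0, 27.0), "turtle"),
    ((450.0, 180.0), "not a turtle"),  # e.g. a car
    ((90.0, 30.0), "not a turtle"),    # e.g. a dog
    ((240.0, 60.0), "not a turtle"),   # e.g. a horse
]


def predict(features, labeled=LABELED):
    """Return the label of the closest labeled example (1-nearest-neighbor)."""
    _, label = min(labeled, key=lambda pair: math.dist(pair[0], features))
    return label


if __name__ == "__main__":
    print(predict((28.0, 21.0)))     # a small, roundish object
    print(predict((400.0, 170.0)))   # a large, car-sized object
```

The key point is that every example comes with its answer attached. The machine never invents the labels “turtle” or “not a turtle”; it only decides which known label fits a new input best.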
Unsupervised Machine Learning
Though unsupervised machine learning is used less often than supervised learning, it offers an exciting path for the future of machine learning. Unsupervised machine learning is when you give the machine only the input and let it work out its own output. This means it has to group the data into clusters based on similarities within it.
When you were too young to understand full words, you did the same thing. You could identify a turtle and place it in a group of ‘turtle-like things,’ but you never really knew what it was until you learned the word “turtle.”
So if we give TurtleID 20 pictures of turtles, it will notice similarities between those pictures and classify the dataset in its own language, since it doesn’t know the word “turtle.” Let’s call that dataset A1. Now we give it 20 more pictures consisting of turtles, cars and the ocean. It classifies the cars as A2, the ocean as A3, and puts the turtles under the original A1. The interesting part is that the machine just created new words in its own dictionary, and – just like you as a child – it won’t call A1 “turtle” unless it is told to do so.
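The grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each picture is reduced to two invented numeric features, the `Clusterer` class and its `threshold` parameter are hypothetical, and the technique is a simple incremental threshold clustering (join the nearest group if it is close enough, otherwise start a new one) rather than anything a production system would necessarily use.

```python
import math


class Clusterer:
    """Incremental threshold clustering with machine-made names A1, A2, ..."""

    def __init__(self, threshold):
        self.threshold = threshold  # maximum distance to join an existing group
        self.clusters = {}          # group name -> list of feature tuples

    @staticmethod
    def _centroid(points):
        """Average position of all points in a group."""
        count = len(points)
        return tuple(sum(p[i] for p in points) / count for i in range(len(points[0])))

    def add(self, point):
        """Place a point in the nearest group, or create a new group for it."""
        best_name, best_dist = None, float("inf")
        for name, points in self.clusters.items():
            dist = math.dist(self._centroid(points), point)
            if dist < best_dist:
                best_name, best_dist = name, dist
        if best_name is None or best_dist > self.threshold:
            best_name = f"A{len(self.clusters) + 1}"
            self.clusters[best_name] = []
        self.clusters[best_name].append(point)
        return best_name


if __name__ == "__main__":
    turtle_id = Clusterer(threshold=0.3)
    # First batch: only turtle pictures (invented feature values).
    first = [turtle_id.add(p) for p in [(0.20, 0.80), (0.25, 0.75), (0.18, 0.82)]]
    # Second batch: a car, the ocean, another turtle, another car.
    second = [turtle_id.add(p) for p in [(0.90, 0.30), (0.10, 0.10), (0.22, 0.78), (0.85, 0.35)]]
    print(first, second)
```

Notice that no label is ever supplied. The names A1, A2 and A3 are generated by the program itself, and nothing in the code knows that A1 means “turtle.”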
The exciting part about unsupervised machine learning is its ability to think differently. Let’s imagine we gave this machine a set of data with animals from different eras, where each animal is labeled with one of the 26 letters of the alphabet. Unsupervised machine learning can use its clustering approach to connect things that humans may not think of. It can identify animal X as a cousin of animal B because their snouts are a similar size, even though no human ever made that connection.
The possibilities for unsupervised machine learning are vast. Because we usually approach data expecting a definite output for each input, we can miss connections that fall outside those expectations. Unsupervised learning is given the space to discover the output on its own, so it may make discoveries in places we would never look.
Reinforcement (Semi-supervised) Machine Learning
The approach described in this section is a mix of supervised and unsupervised learning, and is more precisely called semi-supervised learning. Reinforcement learning in the strict sense is a different technique, in which a machine learns by trial and error from rewards and penalties. We give TurtleID three pictures: one of a turtle, one of a dog and one of a horse. We tell it that the first picture is ‘a turtle,’ while the next two are ‘not a turtle.’ Since we gave it these labels, it can use its algorithms to place any unlabeled data it is later given into the matching group by comparing characteristics of the images.
Now we give it a bunch of unlabeled data, meaning it doesn’t know whether each image is ‘a turtle’ or ‘not a turtle’; it just sees inputs. The algorithm used for both the supervised and unsupervised sides will be like the machine’s eyes, allowing it to discern shapes and colors. It will use the algorithm to say, “Oh, this unlabeled image’s shapes and colors are similar to the shapes and colors of this group of labeled images. Therefore, I will place it in the same category.”
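As a rough sketch of this semi-supervised idea, the Python below starts from three labeled pictures and spreads their labels to unlabeled ones. The assumptions are the same as before: pictures are reduced to two invented features (length and width in centimeters), and `spread_labels` is a hypothetical helper that uses a greedy nearest-neighbor rule. At each step it labels the unlabeled picture that sits closest to any already-labeled picture, then treats that picture as labeled too.

```python
import math


def spread_labels(labeled, unlabeled):
    """Greedily copy labels from labeled points to their nearest unlabeled points."""
    known = list(labeled)        # (features, label) pairs, grows as we go
    remaining = list(unlabeled)  # feature tuples still waiting for a label
    assigned = {}
    while remaining:
        # Find the closest (labeled, unlabeled) pair across everything we have.
        _, point, label = min(
            (
                (math.dist(features, candidate), candidate, known_label)
                for candidate in remaining
                for features, known_label in known
            ),
            key=lambda triple: triple[0],
        )
        remaining.remove(point)
        known.append((point, label))  # the newly labeled point can now pass on its label
        assigned[point] = label
    return assigned


if __name__ == "__main__":
    # Three hand-labeled pictures: a turtle, a dog and a horse (invented numbers).
    labeled = [
        ((30.0, 22.0), "a turtle"),
        ((90.0, 30.0), "not a turtle"),
        ((240.0, 60.0), "not a turtle"),
    ]
    unlabeled = [(28.0, 21.0), (36.0, 26.0), (100.0, 35.0), (220.0, 55.0)]
    for point, label in spread_labels(labeled, unlabeled).items():
        print(point, "->", label)
```

Only three labels were written by hand, yet every picture ends up in a category. That is the trade this method makes: a small amount of labeling effort, stretched across a much larger pile of data.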
This method is desirable for large datasets where you don’t want to spend all your time labeling data to find out more about it, but you also want to ensure that the machine can assign appropriate labels to each element of the dataset.
Why is Machine Learning growing now?
The reason for machine learning’s recent resurgence has to do with big data and better technologies to process it. A big portion of this data is gathered from Internet users and their online engagement, and we can use some of it as labeled data when analyzing large datasets. The second important factor is the affordable cloud computing power we have at our disposal and the platforms and frameworks that handle the heavy lifting for machine learning.
Limitations still exist on the road to human-level intelligence. We struggle to match the quality of the five human senses and the ability to associate them with information using a neural network as complex as the brain. However, the future possibilities of machine learning and our ability to integrate it into emerging technologies are immeasurable. It is already a lot more useful than identifying pictures of turtles, and we are just getting started.