Machine learning (ML) is a subcategory of artificial intelligence that refers to the process by which computers develop pattern recognition, or the ability to continuously learn from data, make predictions based on it, and then make adjustments without being explicitly programmed to do so.
Whether or not you’re excited by the idea of artificial neural networks one day growing sophisticated enough to replicate human consciousness, there are undeniable practical advantages to machine learning, namely:
Intelligent big data management – The sheer volume and variety of data being generated as humans and other environmental forces interact with technology would be impossible to process and draw insights from without the speed and sophistication of machine learning.
Smart devices – From wearable devices that track health and fitness goals to self-driving cars to "smart cities" with infrastructure that can automatically reduce wasted time and energy, the Internet of Things (IoT) holds great promise, and machine learning can help make sense of this significant increase in data.
Rich consumer experiences – Machine learning enables search engines, web apps, and other technology to customize results and recommendations to match user preferences, creating delightfully personalized experiences for consumers.
How does machine learning work?
Machine learning is complex, and how it works varies depending on the task and the algorithm used to accomplish it. However, at its core, a machine learning model is a computer looking at data and identifying patterns, and then using those insights to better complete its assigned task. Many tasks that rely upon a set of data points or rules can be automated using machine learning, even more complex tasks such as responding to customer service calls and reviewing resumes.
Depending on the situation, machine learning algorithms function with more or less human intervention or reinforcement. The four major approaches to machine learning are supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.
With supervised learning, the computer is provided with a labeled set of data that enables it to learn how to do a human task. This is generally considered the least complex approach, since the labels show the computer the correct answer for each example.
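As a minimal sketch of supervised learning, the example below uses a nearest-neighbour rule, which is only one of many possible algorithms. The training data, the feature meanings (height and weight) and the labels are invented for illustration.

```python
import math

# Hypothetical labeled training data: (height_cm, weight_kg) -> label.
training_data = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]

def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(training_data, key=lambda example: math.dist(example[0], features))
    return nearest[1]

print(predict((24.0, 4.2)))    # lies closest to the "cat" examples
print(predict((58.0, 28.0)))   # lies closest to the "dog" examples
```

The labels are what make this supervised: the computer never has to discover the categories itself, it only has to relate new input to examples whose answers are already known.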
With unsupervised learning, the computer is provided with unlabeled data and extracts previously unknown patterns/insights from it. There are many different ways machine learning algorithms do this, including:
Clustering, in which the computer finds similar data points within a data set and groups them accordingly (creating “clusters”).
Density estimation, in which the computer discovers insights by looking at how a data set is distributed.
Anomaly detection, in which the computer identifies data points within a data set that are significantly different from the rest of the data.
Principal component analysis (PCA), in which the computer analyzes a data set and summarizes it with a smaller number of variables that retain most of its variation, so that it can be used more easily for tasks such as making predictions.
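To make the first of these techniques concrete, here is a small sketch of clustering using a one-dimensional k-means procedure. The function name, the sample values and the choice of starting centroids (the smallest and largest value) are assumptions made for this illustration.

```python
def k_means_1d(values, centroids, iterations=10):
    """Assign each value to its nearest centroid, then move each centroid
    to the mean of the values assigned to it. Repeat a fixed number of times."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters, centroids

# Unlabeled data: nobody has told the computer which group a value belongs to.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
clusters, centroids = k_means_1d(values, centroids=[min(values), max(values)])
print(clusters)  # [[1.0, 1.2, 0.8], [9.0, 9.5, 8.7]]
```

No labels are involved anywhere: the two groups emerge purely from how close the values are to each other.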
With semi-supervised learning, the computer is provided with a set of partially labeled data and performs its task using the labeled data to understand the parameters for interpreting the unlabeled data.
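One simple way to picture this is self-training: the few labeled points are used to label the nearest unlabeled point, which then counts as labeled for the next round. The sketch below assumes one-dimensional data and invented labels ("low" and "high"); real semi-supervised methods are usually more careful about how much they trust such pseudo-labels.

```python
# Hypothetical partially labeled data set.
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [2.0, 3.0, 8.0, 7.0, 4.0]

def nearest_label(value, examples):
    """Return the label of the labeled example closest to value."""
    return min(examples, key=lambda example: abs(example[0] - value))[1]

remaining = list(unlabeled)
while remaining:
    # Pick the unlabeled point that lies closest to any labeled point ...
    value = min(remaining, key=lambda v: min(abs(v - x) for x, _ in labeled))
    # ... give it the label of its nearest labeled neighbour, and treat it as labeled.
    labeled.append((value, nearest_label(value, labeled)))
    remaining.remove(value)

labels = dict(labeled)
print(labels)
```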
With reinforcement learning, the computer observes its environment and uses that data to identify the ideal behavior that will minimize risk and/or maximize reward. This is an iterative approach that requires some kind of reinforcement signal to help the computer better identify its best action.
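The iterative loop described above can be sketched with an epsilon-greedy strategy, one common way of balancing exploration and exploitation. The action names, the reward values and the exploration rate are assumptions for this example, and the rewards are fixed rather than noisy to keep the run easy to follow.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical environment: each action yields a reward the agent does not know in advance.
hidden_rewards = {"left": 0.2, "right": 1.0}

estimates = {action: 0.0 for action in hidden_rewards}  # learned value of each action
counts = {action: 0 for action in hidden_rewards}
epsilon = 0.1  # fraction of steps spent exploring instead of exploiting

def update(action):
    """Take an action, observe the reward, and refine the value estimate."""
    reward = hidden_rewards[action]  # the reinforcement signal
    counts[action] += 1
    # Incremental mean of all rewards seen so far for this action.
    estimates[action] += (reward - estimates[action]) / counts[action]

for action in hidden_rewards:  # try every action once to get a first estimate
    update(action)

for _ in range(100):
    if random.random() < epsilon:
        action = random.choice(list(hidden_rewards))   # explore
    else:
        action = max(estimates, key=estimates.get)     # exploit the best known action
    update(action)

best_action = max(estimates, key=estimates.get)
print(best_action, estimates)
```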
How are deep learning and machine learning related?
Machine learning is the broader category of algorithms that are able to take a data set and use it to identify patterns, discover insights, and/or make predictions. Deep learning is a particular branch of machine learning that builds on ML’s functionality by using neural networks with many layers.
With machine learning in general, there is some human involvement in that engineers review an algorithm’s results and make adjustments to it based on their accuracy. Deep learning relies less on this review. Instead, a deep learning algorithm uses its neural network to measure the error in its results and then learn from it.
What is machine learning
To understand the basic principles of machine learning you do not need a PhD in computer science or a Master of Science (MSc) degree in a mathematical or technical field. Machine learning should be open and beneficial for everyone, so it is important that everyone can learn and understand its basics and underlying principles.
This section outlines commonly used terms within the machine learning field. If you are short on time and want to know what the machine learning buzz is all about, this is the section you should read!
Before introducing terms and definitions: be aware that no unified de facto definition of machine learning exists. When people write and talk about ‘machine learning’ they can be talking about totally different things and subjects. The machine learning (ML) label is often misused and intertwined with artificial intelligence (AI).
Investments in machine learning by large commercial companies are still growing. But a lot of the freely available documentation on machine learning, especially some documents created by commercial vendors, is biased. In the reference section of this book you will find a collection of open access resources for more in-depth study of various machine learning subjects. Be aware that open access publications are not free from commercial interest either, so they are not always objective and free from bias.
Attention
Be aware of facts and fads when reading machine learning papers and books. Always be critical.
This section outlines essential concepts surrounding machine learning in more depth.
Machine learning (ML) and artificial intelligence (AI) are terms that are crucial to know when creating machine learning driven solutions. The term NLP (natural language processing) is also crucial for understanding current machine learning applications created for speech or text, e.g. bots that you can converse with instead of humans.
So let’s start with a high-level separation of commonly used terms and their meaning:
AI (Artificial intelligence) is concerned with solving tasks that are easy for humans but hard for computers.
ML (Machine learning) is the science of getting computers to act without being explicitly programmed. Machine learning (ML) is basically learning through doing. Often machine learning is regarded as a subset of AI.
NLP (Natural language processing) is the part of machine learning that has to do with language (usually written). NLP concepts are outlined more in depth in another chapter of this book.
ML, AI and NLP
A clear distinction between AI and ML is hard to make. Discussions about drawing a clear distinction are often a waste of time and heavily biased. For this publication we use the term machine learning (ML), since machine learning can be brought down to tangible, hard mathematics, software implementations and tangible applications. Philosophical discussions on questions such as ‘what is intelligence?’ are mostly related to AI.
At its core, machine learning is simply a way of achieving AI. Machine learning can be seen as currently the only viable approach to building AI systems that can operate in complicated real-world environments.
A few other definitions of artificial intelligence:
A branch of computer science dealing with the simulation of intelligent behaviour in computers.
The capability of a machine to imitate intelligent human behaviour.
A computer system able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
There are a lot of ways to simulate human intelligence, and some methods are more intelligent than others. AI raises questions on the philosophical spectrum, like ‘What is intelligence?’, ‘How do we measure intelligence?’ AI also gives a lot of fuel for ethical discussions like:
Should AI driven machine learning be a legal entity?
How do we prevent AI machines from killing humans, given that AI machines may become ‘smarter’ than humans will ever be?
These ethical questions should not be neglected. In the section ‘ML in Business problems’ a deep dive into the ethical issues of applying machine learning to business use cases is given.
Machine learning is currently the most widely used application of AI. It is based around the idea that we should be able to give machines access to data and let them learn for themselves.
The paradigm shift: Creating smart software
To really understand machine learning, a new view on how software is created and how it works is needed. Most of our current computer programs are coded using requirements, logic and design principles for creating good software. For example, when you add an item to your shopping cart, you trigger an application component that stores an entry in a shopping cart database table. So humans create an algorithm to solve a problem. An algorithm is a sequence of computer instructions used to solve a problem.
Many real world problems aren’t easy to solve. A good solution requires knowledge of the context and a lot of domain knowledge built from experience. The domain knowledge needed is often difficult to identify exactly.
Determining the exact context of a car in traffic in order to decide within milliseconds whether to go left or right is a very hard programming challenge. It could take decades, and you would still never get it entirely right. This is why a paradigm shift in creating software for the next phase of automation is needed.
Programming computers the traditional way made it possible to put a man on the moon. Breaking new barriers in automation in our daily lives and in science requires new ways of thinking about creating intelligent software. Machine learning is a new way to ‘program’ computers. When a programming challenge is too large to solve with traditional programming methods (requirements collection, decision rules collection, etc.), a program for a computer should be ‘generated’, based on some known desired output types. But knowing all desired output types up front for a problem is often impossible, so your new machine learning ‘program’ will get it wrong sometimes. Large amounts of input data increase the quality of the generated prediction model, which takes the place of what the old traditional paradigm called ‘the program’.
ML vs traditional programming
Difference between general programming and (supervised) machine learning.
In essence machine learning makes computers learn the same way people learn: through experience. And just as with humans, algorithms exist that make it possible to use the learned experience of other computers to make your machine learning application faster and better.
The essence of machine learning is that a model is constructed based on so-called training data. In machine learning, learning algorithms, not computer programmers, create the rules.
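The contrast between a rule written by a programmer and a rule created by a learning algorithm can be sketched as follows. The spam scenario, the training examples and the function names are hypothetical, and the learning step is deliberately simplistic: it only searches for the threshold that makes the fewest mistakes on the training data.

```python
# Traditional programming: a person writes the rule.
def is_spam_handwritten(exclamation_marks):
    return exclamation_marks > 3  # threshold chosen by the programmer

# Machine learning: a learning algorithm derives the rule from training data.
# Hypothetical examples: (number of exclamation marks, is it spam?).
training_data = [(0, False), (1, False), (2, False), (5, True), (7, True), (9, True)]

def learn_threshold(examples):
    """Return the threshold that misclassifies the fewest training examples."""
    candidates = sorted({count for count, _ in examples})

    def errors(threshold):
        return sum((count > threshold) != label for count, label in examples)

    return min(candidates, key=errors)

threshold = learn_threshold(training_data)  # this single number is the trained model

def is_spam_learned(exclamation_marks):
    return exclamation_marks > threshold

print(threshold, is_spam_learned(8), is_spam_learned(1))
```

If the training data changes, the learned threshold changes with it, while the handwritten rule stays the same until a programmer edits it.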
The term machine learning model refers to the model artefact that is created by the training process. With this machine learning model it is now possible to create meaningful output based on new input. At least when the trained model is functioning as intended. The figure below gives another view of how machine learning works.
Machine learning working
What is a machine learning model
A machine learning model consists of numbers, most of the time a very large number of them. At the risk of getting into math: a machine learning model is a collection of numbers that are represented in a large multi-dimensional matrix.
A model in the machine learning world is no different from any other mathematical model that represents some knowledge or (trained) information. It is just a large collection of numbers, so you need an algorithm to use it.
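A tiny sketch may help to show this split between the numbers and the algorithm that uses them. The weights and bias below are made up rather than trained, and a real model would typically hold many thousands or millions of such values.

```python
# Hypothetical "trained" model: nothing but numbers.
model = {"weights": [2.0, -1.0], "bias": 0.5}

def predict(model, features):
    """The algorithm that turns the stored numbers into an output:
    a weighted sum of the input features plus a bias."""
    weighted_sum = sum(w * x for w, x in zip(model["weights"], features))
    return weighted_sum + model["bias"]

print(predict(model, [3.0, 4.0]))  # 2.0*3.0 - 1.0*4.0 + 0.5 = 2.5
```

Without `predict`, the dictionary of numbers says nothing on its own; without the numbers, `predict` has nothing to compute with.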
A model of data (plain numbers) can be used for any number of things. E.g.
To simply tell you about the behaviour of your data. For example, the mean is a model. If you imagine picking numbers at random from 1 to 10, the mean summarizes some useful information about your data. The same goes for the median and the variance. These are extremely lossy models, but they are models of your data.
To classify data. Say you’ve trained a classifier that classifies whether a photo contains a cat or not. That classifier concisely summarizes your data as “cat photo” or “non-cat photo.”
To represent data efficiently for some other task. For example, you might generate paraphrases of a document and model these as vector data. You can then use this model to identify the author of a text. So if you present a new document to this model using a simple machine learning algorithm, the model gives you a number that indicates whether this new document is from the same author or not.
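The first use above, describing the behaviour of your data with a few numbers, can be tried out with Python's standard `statistics` module. The data set here is simply the numbers 1 to 10, chosen for illustration, and the variance is the population variance.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Three numbers that "model" the ten data points.
summary = {
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "variance": statistics.pvariance(data),  # population variance
}
print(summary)
```

The summary is lossy in exactly the sense described: many different data sets share these three numbers, so the original values cannot be recovered from them.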
Statistics is not machine learning.
Statistics is not machine learning. So let me repeat this one more time: Statistics is not machine learning.
But the truth is that statistics and machine learning are intertwined and cannot be seen separately. So for a good understanding of machine learning, basic statistics knowledge is important.
The question ‘What’s the difference between Machine Learning and Statistics?’ is one that occurs often and leads to heavy discussion among scientists. To get it straight: a very clear separation between machine learning and statistics is hard to make. Machine learning is, however, more of a hybrid field than statistics. Some answers to this question are:
Machine learning is essentially a form of applied statistics.
Machine learning is glorified statistics.
Machine learning is statistics scaled up to big data.
Machine learning improves a model by learning from data, whereas a statistical model is not automatically improved by feeding it more data.
Statistics emphasizes inference, whereas machine learning emphasizes prediction.
Of course all answers are a bit true. With machine learning, insights improve when more data is used. With pure statistical models, learning and improving is not automatically guaranteed when more data is added. Statistical and machine learning methods, and the reasoning about data, have a large overlap, but the purpose of using statistics is often very different from the purpose of using machine learning.
Machine Learning can be defined as:
Machine learning is a field of computer science that uses statistical techniques to give computer systems the ability to “learn” with data, without being explicitly programmed. For example, a system can progressively improve its performance on a specific task based on the data it receives.
The underlying algorithms used for machine learning are essentially based on statistical methods. Machine learning is similar to the concepts around data mining: an algorithm attempts to find patterns in data to classify, predict, or uncover meaningful trends. Machine learning is often only useful if enough data is available and if the data has been prepared correctly. So despite the promises of machine learning, when you want to apply it you always have a data challenge. Getting large amounts of good data that are usable as input for a machine learning algorithm is not a simple problem to solve. Not only getting enough quality data, but also managing (storing, processing, etc.) the retrieved data is hard.
Even so, most of the time the storage and performance aspects are the easiest of these problems to solve.