### Understanding Buzzwords in Data Science

Below are some notes from my diary that I have found very useful, so I wanted to share them with you. When I started with Data Science, I had a lot of questions about jargon and buzzwords. These notes helped me get them straight.

You may not agree with me at some points. If so, please let me know in the comments, as this will be new learning for me and will help me find the correct answer. Please keep in mind that the notes below are just a basic introduction, as each of these terms is a vast topic.

- Artificial Intelligence (AI)
- Learning System or Algorithm
- How to decide which Learning Algorithm to use
- Machine Learning
- Machine Learning Algorithms
- Deep Learning
- Neural Network (NN)
- Artificial Neural Network (ANN)
- Machine Learning Vs Deep Learning
- AI Vs ML Vs DL
- Deep Learning Vs Neural Network
- Data Science
- Data Science Flow Chart
- Why Deep Learning and why not SVM?
- What is deep learning? Why is this a growing trend in machine learning? Why not use SVMs?
- Five main reasons why deep learning is so popular
- Explainable AI (XAI)
- References

**Artificial Intelligence (AI)**

According to John McCarthy, one of the godfathers of AI, it is **“The science and engineering of making intelligent machines, especially intelligent computer programs”.**

Artificial intelligence is the simulation of human intelligence processes by machines, especially computer systems. Somewhere I read an article (I missed the details) where D.J. Patil, from LinkedIn, says:

“Artificial intelligence is a broad term that simply means: any intelligence run by a computer. Machine learning is a subpart of this, but you don’t need machine learning to have an AI.”

Andrew Ng (one of my favorite Data Scientists) from Coursera tells us that AI is too big for any one person to understand: there are so many ways AI can be used and put into practice that it’s nearly impossible to document and understand them all.

Some of the activities computers with artificial intelligence are designed for include:

- Speech recognition
- Learning
- Planning
- Problem-solving

**Goals of AI**

- To Create Expert Systems − Systems that show intelligent behavior and can learn, demonstrate, explain, and advise their users.
- To Implement Human Intelligence in Machines − Building systems that understand, think, learn, and behave like people.

AI can be split into 2 branches, as below.

- Applied AI (Weak AI) -->
A machine which performs one specific task, such as Alexa, Google Assistant, email spam filtering, house price prediction, stock prediction, weather prediction etc.

- Generalized AI (Strong AI) -->
A machine which acts like a human and performs tasks like a human.

Machines which can take decisions and react instantly.

Robots and self-driving cars are often mentioned as examples, although the systems available today are still closer to Weak AI.

Usually, when people around us talk about Artificial Intelligence, they are mainly referring to Weak AI.

**Learning System or Algorithm**

There are 3 main types of learning systems (also called learning algorithms or techniques) used in Machine Learning, Deep Learning and Artificial Intelligence in general.

Please do refer to article https://datascience.foundation/datatalk/machine-learning-algorithm

**Supervised learning**

The meaning of supervised is "observe and direct the execution of (a task or activity)"; think of a “supervisor” or “teacher”.

- Supervised learning is a type of machine learning algorithm that uses a known dataset (called the training dataset) to make predictions.
- The training dataset includes input data (features / attributes / vectors / independent variables etc.) and response values (dependent variable / class / target etc.). From it, the supervised learning algorithm seeks to build a model that can make predictions of the response values for a new dataset. A test dataset is often used to validate the model. Using larger training datasets often yields models with higher predictive power that can generalize well to new datasets.
- Supervised learning is classified into two categories of algorithms.
- Classification: A classification problem is when the output variable is a category, such as “red” or “blue” / “disease” and “no disease”.
In short, it is for categorical response values, where the data can be separated into specific “classes”.

- Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”.
In short, it is for continuous response values.

- A few of the algorithms used in supervised learning, such as Linear Regression, Logistic Regression, Decision Trees and SVM, are described in the Machine Learning Algorithm section of these notes.
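
To make the train / predict / validate cycle concrete, here is a minimal sketch in plain Python. It uses a nearest-centroid classifier, which I picked only because it is short; the data values and the function names `fit_centroids` and `predict` are made up for illustration.

```python
# Toy labelled training set: (height_cm, weight_kg) -> class label.
# All values are invented for this example.
train_X = [(150, 50), (155, 55), (160, 58), (180, 85), (185, 90), (190, 95)]
train_y = ["small", "small", "small", "large", "large", "large"]


def fit_centroids(X, y):
    """Training: compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Prediction: return the class whose centroid is closest to the input."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


model = fit_centroids(train_X, train_y)

# Held-out test set, used only to validate the trained model.
test_X = [(152, 52), (188, 92)]
test_y = ["small", "large"]
predictions = [predict(model, x) for x in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

The important part is the shape of the workflow: the model only sees labelled training data, and the test set is kept aside to check how well it generalizes.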

**Unsupervised learning**

The meaning of unsupervised is “no supervisor” or “no teacher”: to act without anyone’s supervision or direction.

- Unsupervised learning means there is no training phase where we feed labelled data to the learning algorithm in order to train the model. Instead, the algorithm must figure things out by itself.
- Unsupervised learning finds hidden patterns or intrinsic structures in input data. It is used to draw inferences from datasets consisting of input data without labeled responses.
- In the unsupervised case, the goal is to discover patterns, gain deep insights, understand variation, find unknown subgroups (amongst the variables or observations), and so on. Unsupervised learning can be quite subjective compared to supervised learning.
- The two most commonly used techniques in unsupervised learning are Association and Clustering.
- Association algorithms identify relationships between variables. A frequently quoted example is that if we feed in sales data, the algorithm can identify patterns such as: people who bought item “A” have a probability of p% of buying item “B” too.
- Clustering algorithms group data into clusters based on similar patterns. For example, if you feed in many (say thousands or millions of) pictures of various animals, the clustering algorithm will group them into clusters such as cats, dogs etc.
- Another example of clustering: if a telecom company wants to optimize the locations where it builds cell phone towers, it can use machine learning to estimate the number of clusters of people relying on its towers. A phone can only talk to one tower at a time, so the team uses clustering algorithms to design the best placement of cell towers to optimize signal reception for groups, or clusters, of its customers.
- Some techniques or algorithms that are used in unsupervised learning are
- PCA
- One-class SVM (the standard SVM is a supervised algorithm)
- k-Means
- Anomaly detection algorithms
- Neural Networks (for example autoencoders), and
- Latent Variable Models
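
The association example ("people who bought A have a probability of p% of buying B") can be sketched in a few lines. This is only an illustration: the baskets are invented and `confidence` is a name I chose for the estimate of P(B | A).

```python
# Each set is one (made-up) shopping basket.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def confidence(baskets, a, b):
    """Estimate P(b | a): the share of baskets containing `a` that also contain `b`."""
    with_a = [basket for basket in baskets if a in basket]
    if not with_a:
        return 0.0
    return sum(b in basket for basket in with_a) / len(with_a)


p = confidence(baskets, "bread", "butter")
print(f"{p:.0%} of the baskets with bread also contain butter")
```

Notice that no labels are involved: the pattern comes purely from how often items occur together.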

**Reinforcement Learning**

Reinforcement learning is training by rewards and punishments.

- In reinforcement learning the system learns from the environment. When the system does something right, it is rewarded. When it does something wrong, it is not rewarded, or it is penalized.
- The system learns in a way similar to how a person learns from experience.
- In this type of machine learning, the machine itself learns how to behave in the environment by performing actions and observing the results.
- It is like the machine using trial and error to determine the best possible action based on experience.
- Reinforcement learning involves goal-oriented algorithms, which learn how to attain a complex goal over multiple steps.
- The aim of the game in reinforcement learning is to maximize the reward.
- There are many different types of reinforcement learning algorithms.
Two of the most common for the multi-armed bandit problem are upper confidence bound (UCB) and Thompson sampling.

- Neural networks are behind many of the solutions to complex problems in Artificial Intelligence, such as computer vision and machine translation. When neural networks are combined with reinforcement learning, even more complex problems can be tackled. This way of integrating neural networks with reinforcement learning is known as Deep Reinforcement Learning.
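
As a rough sketch of the trial-and-error idea, here is the UCB1 variant of the upper confidence bound rule on a simulated multi-armed bandit. The win probabilities are hypothetical and exist only to simulate the environment; the function name `ucb1` is my own.

```python
import math
import random


def ucb1(reward_probs, rounds, seed=0):
    """Play a simulated Bernoulli bandit with the UCB1 rule.

    reward_probs are the true (hidden) win probabilities of each arm.
    They are assumed values, used only to generate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms    # how often each arm was pulled
    totals = [0.0] * n_arms  # summed reward per arm

    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1      # pull every arm once to initialise
        else:
            # Average reward so far plus an exploration bonus
            # that shrinks the more often an arm has been tried.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


counts = ucb1([0.1, 0.5, 0.9], rounds=5000)
print(counts)  # pulls per arm; the best arm should dominate
```

The agent is never told which arm is best. It finds out by acting, collecting rewards, and gradually favouring the arm that pays off most.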

**How to decide which Learning Algorithm to use:**

There is no best method or one size fits all. Algorithm selection depends on the size and type of data you’re working with, the insights you want to get from the data, and how those insights will be used.

**Machine Learning**

Machine learning is simply how computers “think” through and execute a task without being explicitly programmed to. It is a subset of artificial intelligence that involves algorithms and models that can automatically analyze and learn from data to make inferences and decisions with minimal human intervention.

Tom Michael Mitchell, an American computer scientist and author of the book Machine Learning, gave a simple description of machine learning systems: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.”

Machine learning is an application of artificial intelligence (AI) that gives systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn a model for themselves, based on the most important features, which helps to predict the future.

Machine learning is a subset of AI. That is, all machine learning counts as AI, but not all AI counts as machine learning.

In 1959, Arthur Samuel, one of the pioneers of machine learning, defined machine learning as a “field of study that gives computers the ability to learn without being explicitly programmed.” That is, the behavior of a machine-learning program has not been explicitly coded by a programmer. Machine-learning programs, in a sense, adjust themselves in response to the data they’re exposed to (like a child that is born knowing nothing adjusts its understanding of the world in response to experience).

**Machine Learning Algorithm**

The most popular Machine Learning algorithms are as below.

- Linear Regression->
- Linear Regression is one of the most (or probably “the most”) popular Machine Learning algorithms. Remember when in high school you had to plot data points on a graph (given an X axis and a Y axis) and then find the line of best fit? That is a very simple Machine Learning algorithm.
- In more technical terms, linear regression attempts to establish a relationship between one or more independent variables (values on the X axis) and a numeric outcome or dependent variable (value on the Y axis) by fitting the equation of a line to observed data:
**y = mX + c**
- Here m is the slope of the line and c is the intercept.
- Please refer to article https://datascience.foundation/sciencewhitepaper/understanding-linear-regression-with-python-practical-guide-2 for more details on Linear Regression with some hands-on with python.
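
For a single input variable, the line of best fit has a closed-form least-squares solution. The sketch below computes m and c for a tiny made-up dataset; `fit_line` is just an illustrative name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + c with one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x  # the fitted line passes through the means
    return m, c


xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # generated from y = 2x + 1
m, c = fit_line(xs, ys)
print(m, c)
```

With real, noisy data the points will not sit exactly on the line, and m and c are the values that minimize the squared vertical distances.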

- Logistic Regression->
- Logistic Regression has the same main idea as linear regression. The difference is that this technique is used when the output or dependent variable is binary: the outcome can have only two possible values. For example, let’s say that we want to predict from a person’s age whether they will have a heart attack. In this case, our prediction only needs to be a “yes” or “no” answer, so there are only two possible values.
- In logistic regression, the line of best fit is not a straight line anymore. The prediction for the final output is transformed using a non-linear S-shaped function called the logistic function, g().
- This logistic function maps the intermediate outcome values into an outcome variable Y with values ranging from 0 to 1. These values between 0 and 1 can then be interpreted as the probability of occurrence of Y.
- Please refer to article https://datascience.foundation/sciencewhitepaper/understanding-logistic-regression-with-python-practical-guide-1 for more details on Logistic Regression with some hands-on with python.
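
Here is a small sketch of the logistic function g() applied to the age example. The `weight` and `bias` values are assumptions I made up; in practice they are learned from the data.

```python
import math


def sigmoid(z):
    """The logistic function g(z): squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(age, weight=0.1, bias=-5.0):
    """Probability of the 'yes' class for a given age.

    weight and bias are hypothetical, not fitted to any real data.
    """
    return sigmoid(weight * age + bias)


for age in (30, 50, 70):
    p = predict_proba(age)
    print(age, round(p, 3), "yes" if p >= 0.5 else "no")
```

The linear part (weight * age + bias) is the same as in linear regression. Only the final S-shaped squashing step turns it into a probability.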

- Decision Trees->
- Decision Trees belong to the category of supervised learning algorithms. They can be used for solving both regression and classification tasks.
- In this algorithm, the model learns to predict the value of the target variable by learning simple decision rules from the training data, represented as a tree.
- Please refer to article https://datascience.foundation/sciencewhitepaper/understanding-decision-trees-with-python for more details on Decision Tree with some hands-on with python.
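
The smallest possible decision tree is a single rule, sometimes called a decision stump. The sketch below learns one threshold from made-up data. A real decision tree repeats this kind of split recursively and usually scores splits with measures such as Gini impurity or entropy rather than a raw error count.

```python
def fit_stump(values, labels):
    """Find the threshold t for which `value > t -> 1, else 0` makes the fewest errors."""
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between neighbouring distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    def errors(t):
        return sum((v > t) != bool(y) for v, y in zip(values, labels))

    return min(candidates, key=errors)


# Hypothetical data: hours studied and whether the exam was passed (1) or not (0).
hours = [1, 2, 3, 4, 6, 7, 8, 9]
passed = [0, 0, 0, 0, 1, 1, 1, 1]
threshold = fit_stump(hours, passed)


def predict(hours_studied):
    """Apply the learned rule to a new observation."""
    return int(hours_studied > threshold)


print(threshold, predict(2.5), predict(7.5))
```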

- Random Forest->
- Random Forest is one of the most popular and powerful Machine Learning algorithms. It is a type of ensemble algorithm.
- In Random Forest, we have an ensemble of decision trees.
- The underlying idea of ensemble learning is that the collective opinion of many is more likely to be accurate than that of one. The outcomes of the individual models are combined to make the final prediction.
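
The "collective opinion" step can be sketched as a simple majority vote. The three "trees" here are hand-written rules that I invented. A real Random Forest would learn each tree from a random sample of the data and features, which this sketch leaves out.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most of the models."""
    return Counter(predictions).most_common(1)[0][0]


# Three hypothetical "trees", each a hand-written rule standing in for a trained tree.
trees = [
    lambda x: "yes" if x["age"] > 30 else "no",
    lambda x: "yes" if x["income"] > 50000 else "no",
    lambda x: "yes" if x["age"] > 40 and x["income"] > 30000 else "no",
]


def forest_predict(x):
    """Ask every tree for its prediction and combine them by majority vote."""
    return majority_vote([tree(x) for tree in trees])


print(forest_predict({"age": 35, "income": 60000}))
print(forest_predict({"age": 25, "income": 60000}))
```

Even when one tree gets an example wrong, the other two can outvote it.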

- Naive Bayes->
- Naive Bayes is a simple and widely used Machine Learning algorithm based on the Bayes Theorem.
- It is called naive because the classifier assumes that the input variables are independent of each other (a strong and unrealistic assumption for real data).
- The Bayes theorem is given by the equation below, where P(A|B) is the probability of A given that B has occurred:
**P(A|B) = P(B|A) × P(A) / P(B)**
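
Here is a small worked example of the Bayes theorem, P(A|B) = P(B|A) × P(A) / P(B), for a spam filter. All the probabilities are invented numbers, chosen only to make the arithmetic easy to follow.

```python
def posterior(prior, likelihood, evidence):
    """Bayes theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Hypothetical numbers for a spam filter looking at one word.
p_spam = 0.2              # P(spam): 20% of all mail is spam
p_word_given_spam = 0.6   # P(word | spam)
p_word_given_ham = 0.05   # P(word | not spam)

# P(word), via the law of total probability.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = posterior(p_spam, p_word_given_spam, p_word)
print(round(p_spam_given_word, 2))
```

A Naive Bayes classifier repeats this for every word and multiplies the per-word likelihoods, which is exactly where the independence assumption comes in.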

- Support Vector Machines (SVM)->
- Support vector machines (SVM) is a supervised machine learning algorithm which can be used for classification or regression problems.
- It uses a technique called the kernel trick to transform your data and then based on these transformations it finds an optimal boundary between the possible outputs. The word "kernel" is used in mathematics to denote a weighting function for a weighted sum or integral. SVM is a discriminative classifier formally defined by a separating hyperplane.
- The optimization problem of SVM is convex, which is guaranteed to have a global optimal solution.
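
To give a feel for what a kernel is, here are two common kernel functions written out by hand. This is not an SVM implementation, only the similarity functions an SVM could use. The `gamma` default is an arbitrary choice of mine.

```python
import math


def linear_kernel(x, y):
    """Plain dot product: no transformation of the data."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: close to 1 for nearby points, close to 0 for distant ones."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


print(linear_kernel((1, 2), (3, 4)))
print(rbf_kernel((1, 2), (1, 2)), rbf_kernel((1, 2), (5, 6)))
```

The "trick" is that the algorithm only ever needs these pairwise similarity values, so it never has to compute the transformed data explicitly.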

- K-Nearest Neighbors (KNN or k-NN) ->
- The KNN algorithm is a very simple and popular technique.
- k-NN is a supervised algorithm, most often used for classification.
- The goal is to classify a new data point/observation into one of multiple existing categories. So, a number of neighbors ‘k’ is selected (k = 5 is a common starting point), the k closest data points are identified (using either Euclidean or Manhattan distance), and the new point is assigned to the category that is most common among them.
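
A minimal k-NN sketch with Euclidean distance and a majority vote among the neighbors. The points and labels are made up, and I use k = 3 here only because the toy dataset is tiny.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train is a list of (point, label) pairs; query is a point (tuple of numbers)."""
    # Sort the training points by Euclidean distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # The most common label among those neighbors is the prediction.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

print(knn_predict(train, (2.0, 2.0)))
print(knn_predict(train, (6.0, 7.0)))
```

There is no real training step: the "model" is simply the stored training data, and all the work happens at prediction time.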

- K-Means->
- k-Means is an unsupervised algorithm used for clustering. By unsupervised we mean that we don’t have any labeled data upfront to train the model. Hence the algorithm relies only on the structure of the input features: it groups the data points into k clusters around k center points (centroids), and a new data point is assigned to the cluster with the nearest centroid.
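
The k-Means loop alternates between two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. Below is a bare-bones sketch on invented 2D points. Taking the first k points as starting centroids is a simplification of mine; libraries use smarter initialisation.

```python
import math


def k_means(points, k, iterations=10):
    """Very small k-Means: returns the final centroids and the clusters."""
    centroids = list(points[:k])  # naive initialisation: the first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: put every point into the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters


points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

No labels are used anywhere. The two groups emerge only from how close the points are to each other.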

**Deep Learning**

Deep Learning is the next evolution of Machine Learning.

Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised.

Deep learning architectures such as

- Deep Neural Networks (DNN)
- Deep Belief Networks (DBN)
- Recurrent Neural Networks (RNN)
- Convolutional Neural Networks (CNN)
- Artificial Neural Network (ANN)

have been applied to fields including computer vision, machine vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design, medical image analysis, material inspection and board game programs, where they have produced results comparable to and in some cases surpassing human expert performance.

Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called Artificial Neural Networks.

Usually, when people use the term deep learning, they are referring to deep artificial neural networks.

Deep learning excels on problem domains where the inputs (and even output) are analog. Meaning, they are not a few quantities in a tabular format but instead are images of pixel data, documents of text data or files of audio data.

Deep is a technical term. It refers to the number of layers in a neural network. A shallow network has one so-called hidden layer, and a deep network has more than one. Multiple hidden layers allow deep neural networks to learn features of the data in a so-called feature hierarchy.

**Neural Network (NN)**

A Neural Network (NN), also called an Artificial Neural Network (ANN), is named after its artificial representation of the workings of a human being’s nervous system. Remember the diagram of a neuron that most of us were taught in high school?

A (biological) neural network is the network of neurons in the brain, with all of its connections and chemical reactions. An artificial neural network, on the other hand, is a computational model created for classification and prediction problems, and it comes under Deep Learning concepts.

In deep learning, artificial neural networks come in several types:

- Perceptron ANN
- Convolution ANN
- Recurrent ANN
- GANs (Generative Adversarial Networks)

There are three major classes of artificial neural networks. They are feed forward, feedback and statistical neural networks. The way the neurons are interconnected and the way the weights get adjusted (neural network training) are two major criteria for this classification.

The details do not stop here. There are many other similar names which keep confusing us; at times they are used interchangeably, but there are certainly some differences between them.

Ohhh… so confusing.

First, let’s understand what a network is.

A network is a system of nodes (connection points) and connections between nodes. Another way to look at it is a bunch of simple components that come together to create a complex system.

Typically, networks are made with the purpose of transmitting and receiving information. Nodes are capable of taking inputs and processing them to produce an output. Connections are responsible for information flow between nodes, which could be unidirectional (one direction only) or bidirectional (information can flow back and forth).

The cool part about networks is that the global behavior of the overall network is emergent. That is, the network has more powerful abilities than its individual components, despite being composed of those components. Each individual component and interaction compounds to create a larger and more effective entity.

An artificial neural network is essentially a computational network based on biological neural networks. These models aim to imitate the complex network of neurons in our brains.

So, this time, the nodes are programmed to behave like actual neurons. They are really artificial neurons that try to behave like real ones, hence the name “artificial neural network.”

**Artificial Neural Network (ANN)**

A single perceptron (or neuron) can be imagined as a Logistic Regression. An Artificial Neural Network, or ANN, is a group of multiple perceptrons/neurons at each layer.

ANN is also known as a Feed-Forward Neural network because inputs are processed only in the forward direction.
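
To see what "processed only in the forward direction" means, here is a tiny feed-forward pass with one hidden layer. Each neuron is a weighted sum followed by a sigmoid, just like a logistic regression. The weights and biases are arbitrary numbers picked for illustration, not trained values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer: every neuron is a weighted sum passed through a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def forward(inputs):
    """Input -> hidden layer (2 neurons) -> output layer (1 neuron)."""
    # Hypothetical, untrained weights and biases.
    hidden = layer(inputs, weights=[[0.5, -0.5], [1.0, 1.0]], biases=[0.0, -1.0])
    output = layer(hidden, weights=[[1.0, -1.0]], biases=[0.0])
    return output[0]


print(forward([1.0, 1.0]))
```

The data flows from the inputs through the hidden layer to the output and never loops back, which is the feed-forward property.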

ANN can be used to solve problems related to:

- Tabular data
- Image data
- Text data

**Machine Learning Vs Deep Learning**

Both deep learning and classical machine learning are methods of teaching AI to perform tasks. Some people use these terms interchangeably, but they’re not the same.

Let’s see the differences between deep learning and machine learning.

- Machine Learning is a method to implement artificial intelligence. Machine learning algorithms parse data, learn from it, and then apply that learning to make informed decisions.
- Unlike classical machine learning algorithms, which often break a problem down into different parts and solve them individually, deep learning solves a problem from end to end.
- A deep learning technique is capable of learning categories incrementally via its hidden layer architecture. Probably the biggest advantage of using this technique is that the more data you feed deep learning algorithms, the better they get at solving a task.
- Deep learning is inspired by the way the human brain works. It needs high-end machines and huge amounts of data to provide optimum performance.
- Deep learning is a subset of machine learning, and machine learning is a subset of AI, which is an umbrella term for any computer program that does something smart. In other words, all machine learning is AI, but not all AI is machine learning, and so forth.

Now let’s see the differences between both based on some important points.

**Data dependencies**

The biggest difference between machine learning and deep learning lies in their performance as the volume of data increases. Classical machine learning algorithms usually perform well even if the dataset is small. Deep learning algorithms, on the other hand, require a massive amount of data to perform well.

**Decision boundary**

Every Machine Learning algorithm learns the mapping from an input to an output.

In case of parametric models, the algorithm learns a function with a few sets of weights.

In the case of classification problems, the algorithm learns the function that separates 2 classes – this is known as a Decision boundary.

A decision boundary helps us in determining whether a given data point belongs to a positive class or a negative class.
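
As a small illustration, the sketch below uses a straight line as the decision boundary. The weights and bias are assumed values that describe the line x1 + x2 = 10. In a real model they would be learned from the training data.

```python
def classify(point, weights=(1.0, 1.0), bias=-10.0):
    """Linear decision boundary: weights . point + bias = 0.

    The default weights and bias are hypothetical, not learned.
    """
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return "positive" if score > 0 else "negative"


print(classify((7, 6)))  # lies above the line x1 + x2 = 10
print(classify((2, 3)))  # lies below the line
```

Points on one side of the line get a positive score and points on the other side a negative one. A model that can only draw a straight line cannot separate classes that need a curved boundary.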

Not every Machine Learning algorithm is capable of learning every function. This limits the problems involving complex relationships that ML algorithms can solve.

Image classification, for example, is a problem that is hard to solve with classical ML algorithms, whereas it is handled much better by the various techniques used in Deep Learning.

**Feature engineering**

Feature Engineering is a key step in any model training process.

Feature engineering refers to the process of putting domain knowledge into the design of feature extractors, to lower the complexity of the data and make the patterns more visible to the learning algorithm. It covers both feature extraction and feature selection.

Consider an image classification problem: extracting the features manually from an image needs strong domain knowledge. This process is expensive and difficult in terms of expertise and time. In deep learning, by comparison, this process is automated and taken care of internally by the neurons and layers which we define, which means the network learns features layer by layer. Hence, deep learning reduces the work of creating a new feature extractor for each problem.

**Hardware Dependencies**

Deep Learning performs a very large number of matrix operations, so machines need high-end configurations such as GPUs; training on a CPU alone is slow. Machine Learning model training, on the other hand, can usually be done on low-end machines.

**Execution Time**

Deep Learning uses multiple layers and has many parameters to work on. It also uses forward and backward propagation and iterates many times to minimize the loss. All these processes take much more time to execute than training classical Machine Learning models.

**AI Vs ML Vs DL**

I will try to differentiate each in one line.

AI -> Human intelligence exhibited by machines

ML -> An approach to achieve AI

DL -> A technique for implementing ML.

**Deep Learning Vs Neural Network**

Neural networks: a beautiful biologically inspired programming paradigm which enables a computer to learn from observational data.

Deep learning: a powerful set of techniques for learning in neural networks.

Neural Networks are a superset of Deep Learning. All Deep Learning uses Neural Networks, but not vice versa.

The “deep” in deep learning refers to the depth of layers in a neural network.

If you are interested in learning more details about Deep Learning and Neural Networks, I recommend the eBook by Michael Nielsen, http://neuralnetworksanddeeplearning.com/ , which explains all the concepts nicely.

Long story short: the easiest way to think about artificial intelligence, machine learning, neural networks, and deep learning is to think of them like Russian nesting dolls. Each is essentially a component of the prior term.

**Data Science**

Data Science has an intersection with AI but is not a subset of AI.

Data Science is the Art and Science of drawing actionable insights from the data.

Applications where Data Science is used:

- Retail;
- Bank;
- e-Commerce;
- HealthCare;
- Telecom etc.

**Data Science Flow Chart**

The concepts, systems, architectures and algorithms mentioned above can be built and put into production using Data Science together with other technologies.

Below is the basic flow chart which data science follows. Again, there is no restriction on the choice of technology or algorithm. One can pick any, but the flow will remain almost the same. Some steps might be iterative.

If you are somewhat familiar with Machine Learning algorithms and have worked with SVM (Support Vector Machine), then you might well ask: why Deep Learning and why not SVM?

**Why Deep Learning and why not SVM?**

**What is deep learning? Why is this a growing trend in machine learning? Why not use SVMs?**

On Quora I found a wonderful explanation of this from Varsha Lal, Master's Computer Networking, Florida State University.

Thanks to Varsha Lal for explaining this. I will just copy-paste the details here.

What is SVM?

Support vector machines (SVM) is a supervised machine learning algorithm which can be used for classification or regression problems. It uses a technique called the kernel trick to transform your data and then based on these transformations it finds an optimal boundary between the possible outputs. The word "kernel" is used in mathematics to denote a weighting function for a weighted sum or integral. SVM is a discriminative classifier formally defined by a separating hyperplane.

The optimization problem of SVM is convex, which is guaranteed to have a global optimal solution.

Limitations of SVM:

SVMs are non-parametric models, hence, the complexity grows as the number of training samples increases. The computational cost grows linearly with the number of classes.

Deep Learning:

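Until 2006, SVMs were widely considered the best general-purpose algorithm for machine learning. In 2006, Hinton and Salakhutdinov showed how deep neural networks could be trained effectively, which revived interest in neural nets and deep learning.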

[G. E. Hinton, R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, Vol. 313. no. 5786, pp. 504 - 507, 2006.]

Generally, SVM struggles with a dataset where the number of features is much larger than the number of observations. Deep learning can overcome those limitations.

The "deep" in "deep learning" refers to the number of layers through which the data is transformed.

Deep learning models have a large number of layers compared to classical neural networks. More layers capture more statistical invariances. Moreover, deep Boltzmann machines are universal approximators. In deep learning, each level learns to transform its input data into a slightly more abstract and composite representation.

Deep learning neural networks are usually trained by using iterative, gradient-based optimizers that merely drive the cost function to a very low value, rather than the linear equation solvers used to train linear regression models or the convex optimization algorithms with global convergence guarantees used to train logistic regression or SVMs.

The classical approach to training neural networks is to minimize a (regularized) loss using backpropagation, a gradient descent method specialized to neural networks. Modern versions of backpropagation rely on stochastic gradient descent (SGD) to efficiently approximate the gradient for massive datasets.
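
The core loop (forward pass, compute gradients, update weights, repeat) can be shown on the simplest possible "network": a single linear neuron trained with plain full-batch gradient descent. This is a simplified sketch with made-up data. SGD would compute each step from a random subset of the data, and backpropagation applies the same gradient idea through many layers.

```python
def gradient_descent(xs, ys, learning_rate=0.05, epochs=1000):
    """Fit y = w*x + b by minimising the mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Forward pass: predictions with the current weights.
        preds = [w * x + b for x in xs]
        # Backward pass: gradients of the mean squared error w.r.t. w and b.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Update: take a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))
```

The loop never solves for the answer directly. It only keeps nudging the weights in the direction that reduces the loss.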

Restricted Boltzmann Machines (RBMs), which belong to the so-called Energy-Based Models, have been used as building blocks for deep neural networks. RBMs achieved state-of-the-art performance in collaborative filtering. In these architectures, the idea of energy is used as a metric for measuring the model’s quality.

Why Deep Learning is a growing trend?

The dramatic 2012 breakthrough in solving the ImageNet Challenge is widely considered to be the beginning of the deep learning revolution of the 2010s. The ImageNet project is a large visual database designed for use in visual object recognition software research.

In 2012, AlexNet, a Deep Learning network, won the ImageNet challenge, and Deep Learning algorithms became more popular.

AlexNet is a deep CNN that was trained on ImageNet and outperformed all other entries that year. The network was made up of 5 conv layers, max-pooling layers, dropout layers, and 3 fully connected layers at the end. AlexNet used ReLU as the nonlinearity, which its authors found to decrease training time because networks with ReLUs train much faster than those using tanh.
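
For reference, the two nonlinearities mentioned here look like this in code. This is only a sketch of the functions themselves, not of AlexNet.

```python
import math


def relu(z):
    """Rectified Linear Unit: passes positive values through, zeroes out the rest."""
    return max(0.0, z)


def tanh(z):
    """Hyperbolic tangent: squashes values into (-1, 1) and flattens for large |z|."""
    return math.tanh(z)


for z in (-2.0, 0.5, 3.5):
    print(z, relu(z), round(tanh(z), 3))
```

ReLU is just a comparison, and it does not flatten out for large positive inputs the way tanh does.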

In 2013, Deep Learning won the MICCAI 2013 Grand Challenge on Mitosis Detection.

In 2015, Microsoft’s ResNet (deep residual network), which has 152 layers, won the ImageNet challenge.

**Five main reasons why deep learning is so popular**

- The deep learning networks can be efficiently implemented on massively parallel graphics processing units (GPUs).
- With modern frameworks, they are relatively easy to implement.
- Deep learning networks can handle huge amounts of data
- Deep learning networks can perform feature extraction and classification in one model.
- As more and more data and computation power become available the use of deep learning will increase.

**Explainable AI (XAI)**

Explainable AI (XAI) refers to methods and techniques in the application of artificial intelligence technology (AI) such that the results of the solution can be understood by humans.

It contrasts with the concept of the "black box" in machine learning where even their designers cannot explain why the AI arrived at a specific decision.

XAI is an implementation of the social right to explanation.

I hope this helps you in some way to get familiar with the buzzwords we hear nowadays, with a bit of information on each.

Please do your own research and share with us.

**References**

- https://www.quora.com/
- https://en.wikipedia.org/wiki/
- https://arxiv.org/pdf/1404.7828v4.pdf
- http://www.deeplearningbook.org/
- https://www.analyticsvidhya.com/
- https://bernardmarr.com/default.asp?contentID=1789
- https://machinelearningmastery.com/what-is-deep-learning/
- https://www.investopedia.com/terms/d/deep-learning.asp
- https://www.mygreatlearning.com/blog/types-of-neural-networks/
- https://www.mygreatlearning.com/blog/what-is-artificial-intelligence/
- http://neuralnetworksanddeeplearning.com/chap1.html
- https://www.quora.com/What-is-deep-learning-Why-is-this-a-growing-trend-in-machine-learning-Why-not-use-SVMs
- https://pathmind.com/wiki/ai-vs-machine-learning-vs-deep-learning
- https://www.datapine.com/blog/business-intelligence-buzzwords/
- http://neuralnetworksanddeeplearning.com/

Michael Baron

24 Dec 2020 11:22:51 AM