            Trịnh Tấn Đạt
Khoa CNTT – Đại Học Sài Gòn
Email: [email protected]
Website: https://sites.google.com/site/ttdat88/
Contents
 Introduction
 Perceptron
 Neural Network
 Backpropagation Algorithm
Introduction
❖ What are artificial neural networks?
 A neuron receives a signal, processes it, and propagates the signal (or not)
 The brain comprises around 100 billion neurons, each connected to ~10,000 others: about 10^15 synaptic connections
 ANNs are a simplistic imitation of the brain: a dense network of simple structures
 Origins: algorithms that try to mimic the brain
 Very widely used in the 80s and early 90s; popularity diminished in the late 90s
 Recent resurgence: state-of-the-art technique for many applications
Comparison of computing power
 Neural networks are designed to be massively parallel
 Although individual neurons are far slower than transistors, the brain's massive parallelism makes it effectively a billion times faster
Applications of neural networks
Medical Imaging 
Fake Videos
Conceptual mathematical model
 Receives inputs from multiple sources
 Computes a weighted sum
 Passes the sum through an activation function
 Sends the signal to m succeeding neurons
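A minimal sketch of this model in Python; the sigmoid activation and all numeric values are illustrative assumptions, not from the slides:

import math

def neuron(inputs, weights, bias):
    # weighted sum of the inputs plus bias
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # pass the sum through a sigmoid activation function
    return 1.0 / (1.0 + math.exp(-z))

# example: a neuron receiving signals from 3 sources (values are hypothetical)
print(neuron([0.5, -1.0, 2.0], weights=[0.8, 0.2, -0.5], bias=0.1))  # ~0.33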
Artificial Neural Network
 Organized into layers of neurons
 Typically 3 or more layers: input, hidden, and output
 Neural networks are made up of nodes or units, connected by links
 Each link has an associated weight, and each unit applies an activation function
Perceptron
 A simplified (binary) artificial neuron, presented first without weights and then with a weight attached to each input
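A minimal sketch of such a perceptron in Python, assuming a step activation with threshold 0 and illustrative weights:

def perceptron(inputs, weights, threshold=0.0):
    # weighted sum of the inputs
    z = sum(w * x for w, x in zip(weights, inputs))
    # binary step activation: fire (1) only if the sum exceeds the threshold
    return 1 if z > threshold else 0

print(perceptron([1, 0, 1], weights=[0.5, -0.6, 0.4]))  # 0.9 > 0 -> 1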
Introducing Bias
 The perceptron also needs to take a bias into account:
o Bias is just like the intercept added in a linear equation.
o It is an additional parameter in the neural network, used to adjust the output along with the weighted sum of the inputs to the neuron.
o Bias acts like a constant which helps the model fit the given data.
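A small sketch of the effect, assuming illustrative weights and bias values:

def perceptron_with_bias(inputs, weights, bias):
    # the bias shifts the decision boundary, like an intercept in a linear equation
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0

# same inputs and weights, different bias -> different output (values hypothetical)
print(perceptron_with_bias([1, 1], [0.5, 0.5], bias=-0.2))  # z = 0.8  -> 1
print(perceptron_with_bias([1, 1], [0.5, 0.5], bias=-1.5))  # z = -0.5 -> 0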
Sigmoid Neuron
 The more common artificial neuron: instead of a hard threshold, it applies the sigmoid σ(z) = 1 / (1 + e^(-z)) to the weighted sum
Sigmoid Neuron
 In effect, a bias value allows you to shift the activation function to the left or right, which may be critical for successful learning.
 Consider this 1-input, 1-output network that has no bias:
 Here is the function that this network computes, for various values of w0:
Sigmoid Neuron
 If we add a bias to that network, like so:
Having a weight of -5 for w1 shifts the curve to the right, which allows us to have a network that outputs 0 when x is 2.
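A sketch of that point in Python, assuming (as the plotted curve suggests) an input weight w0 = 1 for the shifted example; these values are assumptions:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def network(x, w0, w1):
    # w0 weighs the input x; w1 weighs a constant bias input of 1
    return sigmoid(w0 * x + w1 * 1.0)

print(network(2.0, w0=1.0, w1=0.0))   # ~0.88: without bias, output is well above 0 at x = 2
print(network(2.0, w0=1.0, w1=-5.0))  # ~0.05: the bias shifts the curve right, output ~0 at x = 2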
Simplified Two-Layer ANN
One hidden layer 
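A minimal sketch of the forward pass through such a network; the layer sizes, weights, and sigmoid activation are illustrative assumptions:

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # one output per neuron: sigmoid of (weighted sum of inputs + bias)
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

x = [0.5, -1.0]  # 2 inputs (hypothetical)
hidden = layer(x, [[0.1, 0.4], [-0.3, 0.2], [0.5, -0.5]], [0.0, 0.1, -0.1])  # 3 hidden units
output = layer(hidden, [[0.3, -0.2, 0.6]], [0.05])                           # 1 output unit
print(output)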
Optimization Primer
 Cost function
 Calculate its derivative
Gradient Descent
Gradient Descent Optimization
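As a worked illustration, gradient descent on a one-parameter squared-error cost J(w) = (w - 3)^2; the toy cost and learning rate are assumptions, not from the slides:

# minimize J(w) = (w - 3)^2 by repeatedly stepping against the gradient
def grad(w):
    # derivative of J: dJ/dw = 2 * (w - 3)
    return 2.0 * (w - 3.0)

w = 0.0   # initial guess
lr = 0.1  # learning rate
for _ in range(100):
    w -= lr * grad(w)  # update rule: w <- w - lr * dJ/dw
print(w)  # converges toward the minimum at w = 3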
Backpropagation
Activation functions
 The binary threshold activation function was proposed first
 Sigmoid and tanh introduce non-linearity, with different codomains
 ReLU is one of the more popular ones because it is simple to compute and very robust to noisy inputs (the three are sketched in code after the ReLU discussion below)
Sigmoid function
 The sigmoid non-linearity squashes real numbers into the range [0, 1]
 Historically it had a nice interpretation as a neuron's firing rate (from not firing at all to fully saturated firing)
 Currently it is used less: for inputs of large magnitude the output saturates near 0 or 1, where the gradient is close to 0, which stalls learning
Tanh function
 The tanh function squashes real numbers into the range [-1, 1]
 It has the same problem as the sigmoid: its activations saturate, killing gradients
 But it is zero-centered, minimizing the zig-zagging dynamics during gradient descent
 It is currently preferred over the sigmoid nonlinearity
ReLU: Rectified Linear Unit
 ReLU's activation is thresholded at zero: f(x) = max(0, x)
 Quite popular over the last few years
 Speeds up Stochastic Gradient Descent (SGD) convergence
 Easier to implement due to its simpler mathematical form
 Sensitive to high learning rates during training, which can produce "dead" neurons (i.e. neurons that will never activate across the entire dataset)
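A side-by-side sketch of the three activations (NumPy is assumed for convenience):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # codomain (0, 1)

def tanh(z):
    return np.tanh(z)                # codomain (-1, 1), zero-centered

def relu(z):
    return np.maximum(0.0, z)        # thresholded at zero: max(0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(z), tanh(z), relu(z), sep="\n")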
Neuron Modeling: Logistic Unit
ANNs
 1 hidden layer
Modeling
Other Network Architectures
Example
 Image Recognition: 4 classes (one-hot encoding)
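A small sketch of one-hot encoding for 4 classes; the class names are hypothetical:

# hypothetical classes for a 4-way image recognition task
classes = ["pedestrian", "car", "motorcycle", "truck"]

def one_hot(label):
    # a vector of zeros with a single 1 at the class index
    v = [0] * len(classes)
    v[classes.index(label)] = 1
    return v

print(one_hot("car"))  # -> [0, 1, 0, 0]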
Example
Neural Network Classification
Example: Perceptron - Representing Boolean Functions
 Combining Representations to Create Non-Linear Functions
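A sketch of this combination, using the standard textbook weight choices (the values ±20 and biases -30/-10/+10 are assumptions here) to build the non-linear XNOR function out of linear perceptrons:

def step(z):
    return 1 if z >= 0 else 0

# single perceptrons computing AND, OR, and (NOT x1) AND (NOT x2)
def AND(x1, x2):  return step(20 * x1 + 20 * x2 - 30)
def OR(x1, x2):   return step(20 * x1 + 20 * x2 - 10)
def NOR(x1, x2):  return step(-20 * x1 - 20 * x2 + 10)

# combining them creates the non-linear XNOR: true exactly when x1 == x2
def XNOR(x1, x2):
    return OR(AND(x1, x2), NOR(x1, x2))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, XNOR(a, b))  # 1 when a == b, else 0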
Example: MNIST data
Neural Network Learning
Perceptron Learning Rule
Batch Perceptron
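A minimal sketch of the perceptron learning rule, w <- w + lr*(y - y_hat)*x, in its online form on a toy OR dataset (a batch version would sum the updates over all examples before applying them); the data, learning rate, and epoch count are assumptions:

def predict(x, w, b):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# toy linearly separable data: the OR function
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b, lr = [0.0, 0.0], 0.0, 0.1

for _ in range(20):                  # epochs
    for x, y in data:
        err = y - predict(x, w, b)   # perceptron learning rule:
        w = [wi + lr * err * xi for wi, xi in zip(w, x)]  # w <- w + lr*(y - y_hat)*x
        b += lr * err                # b <- b + lr*(y - y_hat)

print([predict(x, w, b) for x, _ in data])  # -> [0, 1, 1, 1]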
Learning in NN: Backpropagation
Cost Function
Optimizing the Neural Network
Forward Propagation
Backpropagation Intuition
Backpropagation: Gradient Computation
Backpropagation
Training
 Training a Neural Network via Gradient Descent with Backpropagation
Training a Neural Network
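Putting the pieces together, a minimal sketch of the full loop: forward propagation, backpropagation of a squared-error cost, and gradient descent updates. NumPy, the XOR toy data, the layer sizes, and the learning rate are all illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # toy inputs (XOR)
y = np.array([[0], [1], [1], [0]], dtype=float)              # targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)  # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

for _ in range(5000):
    # forward propagation
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backpropagation of the squared-error cost J = 0.5 * sum((out - y)^2)
    d_out = (out - y) * out * (1 - out)  # delta at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)   # delta propagated back to the hidden layer
    # gradient descent updates
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())  # approaches [0, 1, 1, 0] as training succeeds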
Homework
1) Implement iris flower classification using a neural network
• Hint:
- Use MLPClassifier from the sklearn module:
https://www.python-course.eu/neural_networks_with_scikit.php
- Or a Keras model:
https://gist.github.com/NiharG15/cd8272c9639941cf8f481a7c4478d525
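A minimal starting-point sketch for the first hint; the hyperparameters (hidden_layer_sizes, max_iter) are illustrative assumptions:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# a small multilayer perceptron: one hidden layer of 10 units (sizes are assumptions)
clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # test accuracy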