Advanced Machine Learning - Chapter 6: Artificial Neural Networks
Trịnh Tấn Đạt
Faculty of Information Technology – Saigon University
Email: trinhtandat@sgu.edu.vn
Website: https://sites.google.com/site/ttdat88/
Contents
Introduction
Perceptron
Neural Network
Backpropagation Algorithm
Introduction
❖ What are artificial neural networks?
A neuron receives a signal, processes it, and
propagates the signal (or not)
The brain comprises around 100 billion
neurons, each connected to ~10k other neurons:
about 10^15 synaptic connections
ANNs are a simplistic imitation of the brain:
a dense network of simple structures
Origins: algorithms that try to mimic the brain
Very widely used in the 1980s and early 1990s; popularity
diminished in the late 1990s
Recent resurgence: state-of-the-art technique for
many applications
Comparison of computing power
Neural networks are designed to be massively parallel
Individual neurons are slow (switching in about a millisecond, versus
nanoseconds for transistors), but thanks to massive parallelism the brain is
effectively a billion times faster at the tasks it is suited to
Applications of neural networks
Medical Imaging
Fake Videos
Conceptual mathematical model
Receives input from sources
Computes weighted sum
Passes through an activation function
Sends the signal to m succeeding neurons (a minimal code sketch of this model follows)
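As an illustration, here is a minimal sketch of this conceptual model in Python with NumPy; the step activation and the input/weight values are made-up for the example:

    import numpy as np

    def neuron(inputs, weights, activation):
        # Conceptual neuron: weighted sum of inputs passed through an activation
        z = np.dot(weights, inputs)   # weighted sum
        return activation(z)          # propagate the signal (or not)

    step = lambda z: 1.0 if z > 0 else 0.0   # simple threshold activation
    x = np.array([0.5, -1.0, 2.0])           # inputs from n = 3 sources
    w = np.array([0.4, 0.3, 0.6])            # one weight per input
    print(neuron(x, w, step))  # 1.0, since 0.4*0.5 + 0.3*(-1.0) + 0.6*2.0 = 1.1 > 0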
Artificial Neural Network
Organized into layers of neurons
Typically 3 or more: input, hidden and output
Neural networks are made up of nodes or units, connected by links
Each link has an associated weight; each unit applies an activation function to its weighted input
Perceptron
Simplified (binary) artificial neuron: first without weights, then with weights added
Introducing Bias
The perceptron needs to take the bias into account:
o Bias is just like an intercept added in a linear equation.
o It is an additional parameter in the neural network, used to adjust the
output along with the weighted sum of the inputs to the neuron.
o Bias acts like a constant which helps the model fit the given data (see the sketch below).
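A minimal sketch of a perceptron with a bias term in Python (the weights and bias values below are illustrative):

    import numpy as np

    def perceptron(x, w, b):
        # Binary perceptron: fires iff the weighted sum plus bias exceeds 0
        return 1 if np.dot(w, x) + b > 0 else 0

    # The bias acts like the intercept of a line: it shifts the decision
    # boundary w.x + b = 0 away from the origin without changing its slope.
    x = np.array([1.0, 1.0])
    print(perceptron(x, np.array([1.0, 1.0]), b=-1.5))  # 1: 1 + 1 - 1.5 > 0
    print(perceptron(x, np.array([1.0, 1.0]), b=-2.5))  # 0: same weights, shifted boundary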
Sigmoid Neuron
The more common artificial neuron
Sigmoid Neuron
In effect, a bias value allows you to
shift the activation function to the left
or right, which may be critical for
successful learning.
Consider this 1-input, 1-output network that has no bias: it computes sigmoid(w0·x)
Varying w0 changes the steepness of the sigmoid, but the output is always 0.5
at x = 0; steepness alone cannot shift the curve left or right
Sigmoid Neuron
If we add a bias input (held at 1) with weight w1, the network computes
sigmoid(w0·x + w1)
Having a weight of -5 for w1 shifts the curve to the right, which allows us to
have a network that outputs approximately 0 when x is 2 (verified in the sketch below)
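A small numeric check of the shift (assuming w0 = 1 for illustration):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    w0, w1 = 1.0, -5.0
    for x in [0.0, 2.0, 5.0]:
        no_bias   = sigmoid(w0 * x)        # always 0.5 at x = 0; steepness alone cannot shift it
        with_bias = sigmoid(w0 * x + w1)   # bias weight w1 = -5 shifts the curve right by 5
        print(f"x={x}: no bias {no_bias:.3f}, with bias {with_bias:.3f}")
    # At x = 2 the biased network outputs sigmoid(-3) = 0.047, i.e. approximately 0.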
Simplified Two-Layer ANN
One hidden layer
Optimization Primer
Define a cost function J(θ) and calculate its derivative (gradient) ∇J(θ)
Gradient Descent: repeatedly update θ ← θ − α ∇J(θ), where α is the learning rate
Gradient Descent Optimization (a code sketch follows below)
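A minimal sketch of gradient descent on a toy one-parameter cost (the cost function and learning rate are illustrative):

    # Gradient descent on J(theta) = (theta - 3)^2, with dJ/dtheta = 2 * (theta - 3)
    theta = 0.0   # initial guess
    alpha = 0.1   # learning rate
    for step in range(100):
        grad = 2 * (theta - 3)   # derivative of the cost at the current theta
        theta -= alpha * grad    # step in the direction of steepest descent
    print(theta)  # converges toward the minimizer theta = 3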
Backpropagation
Activation functions
The binary (threshold) activation function was proposed first
Sigmoid and tanh introduce non-linearity with different codomains
ReLU is one of the more popular ones because it is simple to compute and very
robust to noisy inputs
Sigmoid function
Sigmoid non-linearity squashes real numbers into the range [0, 1]
Historically popular because of its nice interpretation as a neuron's firing rate
(from not firing at all to fully saturated firing)
Currently used less: for inputs of large magnitude the output saturates near 0 or 1,
so the gradient becomes nearly 0 and learning stalls (vanishing gradients)
Tanh function
The tanh function squashes real numbers into the range [-1, 1]
Same problem as sigmoid: its activations saturate, killing gradients
But it is zero-centered, minimizing the zig-zagging dynamics during gradient descent
Currently preferred over the sigmoid nonlinearity
ReLU: Rectified Linear Unit
ReLU's activation is thresholded at zero: f(x) = max(0, x)
Quite popular over the last few years
Speeds up stochastic gradient descent (SGD) convergence
Easier to implement due to simpler mathematical operations
Sensitive to high learning rates during training, which can result in "dead" neurons
(i.e. neurons that will not activate across the entire dataset); all three
activations are sketched in code below
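As referenced above, a short NumPy sketch of the three activations:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))   # codomain (0, 1)

    def tanh(z):
        return np.tanh(z)                 # codomain (-1, 1), zero-centered

    def relu(z):
        return np.maximum(0.0, z)         # max(0, z): thresholded at zero

    z = np.array([-2.0, 0.0, 2.0])
    print(sigmoid(z))  # [0.119 0.5   0.881]   -> saturates for large |z|
    print(tanh(z))     # [-0.964  0.     0.964] -> saturates, but zero-centered
    print(relu(z))     # [0. 0. 2.]             -> no saturation for positive inputs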
Neuron Modeling: Logistic Unit
ANNs
1 hidden layer
Modeling
Other Network Architectures
Example
Image Recognition: 4 classes (one-hot encoding, sketched below)
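A minimal sketch of one-hot encoding for 4 classes (the class names below are made-up for illustration):

    import numpy as np

    classes = ["pedestrian", "car", "motorcycle", "truck"]  # illustrative labels

    def one_hot(class_index, num_classes=4):
        # Exactly one 1 at the position of the class; 0 everywhere else
        y = np.zeros(num_classes)
        y[class_index] = 1.0
        return y

    print(one_hot(1))  # [0. 1. 0. 0.] -> "car"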
Example
Neural Network Classification
Example: Perceptron - Representing Boolean Functions
Combining Representations to Create Non-Linear Functions
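A sketch of these two examples in code: single perceptrons with one standard (not unique) choice of weights implement AND, OR and NAND, and combining them in two layers yields the non-linear XOR, which no single perceptron can represent:

    import numpy as np

    def perceptron(x, w, b):
        return 1 if np.dot(w, x) + b > 0 else 0

    AND  = lambda x: perceptron(x, np.array([1.0, 1.0]), -1.5)
    OR   = lambda x: perceptron(x, np.array([1.0, 1.0]), -0.5)
    NAND = lambda x: perceptron(x, np.array([-1.0, -1.0]), 1.5)

    # XOR is not linearly separable; a two-layer combination computes it:
    # XOR(a, b) = AND(OR(a, b), NAND(a, b))
    XOR = lambda x: AND(np.array([OR(x), NAND(x)]))

    for a in (0, 1):
        for b in (0, 1):
            x = np.array([float(a), float(b)])
            print(a, b, "-> AND:", AND(x), "OR:", OR(x), "XOR:", XOR(x))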
Example: MNIST data
Neural Network Learning
Perceptron Learning Rule
Batch Perceptron
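A minimal sketch of the (online) perceptron learning rule, w ← w + η (y − ŷ) x, with the bias folded in as an extra always-1 input; the batch variant accumulates the same updates over the whole dataset before applying them:

    import numpy as np

    def train_perceptron(X, y, lr=0.1, epochs=20):
        X = np.hstack([X, np.ones((len(X), 1))])  # append constant 1 for the bias
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                y_hat = 1 if np.dot(w, xi) > 0 else 0
                w += lr * (yi - y_hat) * xi       # update only on mistakes
        return w

    # Learn the OR function from its truth table:
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 1])
    print(train_perceptron(X, y))  # weights of a separating line for OR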
Learning in NN: Backpropagation
Cost Function
Optimizing the Neural Network
Forward Propagation
Backpropagation Intuition
Backpropagation: Gradient Computation
Backpropagation
Training
Training a Neural Network via Gradient Descent with Backpropagation
Training a Neural Network
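Putting the pieces together, a toy end-to-end sketch: a one-hidden-layer network trained on XOR with a squared-error cost via gradient descent and backpropagation (layer sizes, seed and learning rate are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden
    W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output
    lr = 1.0

    for epoch in range(5000):
        # Forward propagation
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)

        # Backpropagation: apply the chain rule layer by layer (sigmoid' = s(1 - s))
        d_out = (out - y) * out * (1 - out)   # error signal at the output layer
        d_h = (d_out @ W2.T) * h * (1 - h)    # error propagated back to the hidden layer

        # Gradient descent updates
        W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

    print(out.round(2).ravel())  # should approach [0, 1, 1, 0]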
Homework
1) Implement iris flower classification using a neural network
• Hint:
- Use MLPClassifier from the sklearn module
https://www.python-course.eu/neural_networks_with_scikit.php
- Keras model
https://gist.github.com/NiharG15/cd8272c9639941cf8f481a7c4478d525
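For reference, one possible starting point with sklearn (the hyperparameters below are illustrative, not prescribed by the assignment):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42)

    # One hidden layer with 10 units; iterate until convergence
    clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=42)
    clf.fit(X_train, y_train)
    print("Test accuracy:", clf.score(X_test, y_test))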