Advanced Machine Learning Lecture - Chapter 7: Deep Learning, an Introduction - Trịnh Tấn Đạt


Trịnh Tấn Đạt
Faculty of Information Technology, Saigon University
Email: trinhtandat@sgu.edu.vn
Website: https://sites.google.com/site/ttdat88/

Contents
• Introduction
• Applications
• Convolutional Neural Networks vs. Recurrent Neural Networks
• Hardware and Software

Introduction to Deep Learning

Why Deep Learning?
• Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed.
• It covers methods that can learn from data and make predictions on data.
• Can we learn the underlying features directly from data?
• ML vs. Deep Learning: most machine learning methods work well because of human-designed representations and input features. ML then becomes just optimizing weights to best make a final prediction.
• Challenges of ML:
  ❖ Relevant data acquisition
  ❖ Data preprocessing
  ❖ Feature selection
  ❖ Model selection: simplicity versus complexity
  ❖ Result interpretation

What is Deep Learning (DL)?
• A machine learning subfield concerned with learning representations of data. Exceptionally effective at learning patterns.
• Deep learning algorithms attempt to learn (multiple levels of) representation by using a hierarchy of multiple layers.
• If you provide the system with tons of information, it begins to understand it and respond in useful ways.

Why is DL useful?
• Manually designed features are often over-specified, incomplete, and take a long time to design and validate.
• Learned features are easy to adapt and fast to learn.
• Deep learning provides a very flexible, (almost?) universal, learnable framework for representing world, visual and linguistic information.
• It can learn in both unsupervised and supervised settings.
• It makes use of large amounts of training data.
In ~2010 DL started outperforming other ML techniques, first in speech and vision, then in NLP.

Why Now?
• The Perceptron: Forward Propagation
• Neural Network Architectures
• Back Propagation for Weight Update

Importance of Activation Functions
• The purpose of activation functions is to introduce non-linearities into the network.

Introduction to Deep Learning
• Activation function
• Neural Network Adjustments
• How do I know what architecture to use?
• Don't be a hero.
  ❖ Take whatever works best.
  ❖ Download a pretrained model.
  ❖ Add/delete some parts of it.
  ❖ Finetune it on your application.

The Problem of Overfitting

Handling Overfitting
• Reduce the network's capacity by removing layers or reducing the number of units in the hidden layers.
• Apply regularization, which comes down to adding a cost to the loss function for large weights.
• Use Dropout layers, which randomly remove certain features by setting them to zero.

Dropout
• During training, randomly set some activations to 0.
• Typically 'drop' 50% of the activations in a layer.
• This forces the network not to rely on any single node.

Early Stopping

Applications
• DeepDream
• Object Detection: R-CNN, Fast R-CNN, Faster R-CNN, YOLO, SSD, RetinaNet
• Instance Segmentation
  ❖ Mask R-CNN: Very Good Results!

Applications: Open Source Frameworks
• Lots of good implementations on GitHub!
• TensorFlow Detection API: https://github.com/tensorflow/models/tree/master/research/object_detection (Faster RCNN, SSD, RFCN, Mask R-CNN)

Applications: Generative Adversarial Networks (GANs)
• Super Resolution: 8K 65 inch QLED TV Q900R with 8K AI Upscaling, https://www.samsung.com/uk/tvs/qledtv-q900r/QE65Q900RATXXU/
• Photo Inpainting

Convolutional Neural Networks
• History:
  ❖ Gradient-based learning applied to document recognition [LeCun, Bottou, Bengio, Haffner 1998]: LeNet
  ❖ ImageNet Classification with Deep Convolutional Neural Networks [Krizhevsky, Sutskever, Hinton, 2012]: AlexNet
• Fast-forward to today: ConvNets are everywhere
• Fully Connected Layer
• Convolution Layer
• [ConvNetJS demo: training on CIFAR-10]

State of the art
• Today: CNN Architectures
  ❖ VGG
  ❖ GoogLeNet
  ❖ ResNet
  ❖ DenseNet
  ❖ MobileNets
  ❖ SENet
  ❖ Wide ResNet
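To make the convolution layer listed above more concrete, here is a minimal pure-Python sketch of what a single filter computes: the kernel slides over the input and produces one multiply-and-sum per position. The function name `conv2d_valid`, the 4x4 input and the 2x2 kernel are illustrative assumptions and do not come from the slides. Real layers also add a bias, use many filters and channels, and learn the kernel values by backpropagation.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` without padding and return the feature map.

    As in most deep learning libraries, this is cross-correlation:
    the kernel is not flipped before the multiply-and-sum.
    """
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (img_h - k_h) // stride + 1
    out_w = (img_w - k_w) // stride + 1

    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            # Multiply the kernel with the image patch under it and sum up.
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 single-channel input (values chosen for illustration only).
image = [
    [1, 2, 0, 1],
    [0, 1, 3, 2],
    [2, 0, 1, 1],
    [1, 2, 0, 3],
]
# Hypothetical 2x2 kernel: each pixel minus its lower-right neighbour.
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

The same four kernel weights are reused at every position, so this layer has 4 parameters regardless of the image size, whereas a fully connected layer would need one weight per input pixel for every output unit.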
Recurrent Neural Networks
• Image Captioning
• Long Short Term Memory (LSTM) [Hochreiter et al., 1997]
  ❖ Speech Recognition
  ❖ Optical Character Recognition (OCR)
• Gated Recurrent Units (GRU)

Hardware and Software
• Deep learning hardware: CPU, GPU, TPU
• Deep learning software: TensorFlow, Keras, PyTorch, MXNet

Software
• TensorFlow
• Keras: High-Level
• TensorFlow: Pretrained Models
  ❖ tf.keras: (https://www.tensorflow.org/api_docs/python/tf/keras/applications)
  ❖ TF-Slim: (https://github.com/tensorflow/models/tree/master/research/slim)

Exercise
• 1) Implement a demo program for MNIST image classification using a convolutional neural network (CNN).

MNIST - image classification
• Add DATA: Kaggle MNIST dataset
• MNIST dataset
• Model - Example
• Training loss / Validation loss
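The training and validation loss curves from the exercise are what early stopping, listed earlier among the remedies for overfitting, keeps an eye on. The sketch below shows one common rule under stated assumptions: stop once the validation loss has not improved for `patience` consecutive epochs, and keep the weights from the best epoch. The function name `early_stopping_epoch`, its parameters and the loss values are hypothetical and chosen for illustration; they are not results from the lecture.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch) as 0-based epoch indices.

    Training stops at the first epoch where the validation loss has failed
    to improve by more than `min_delta` for `patience` epochs in a row.
    If that never happens, training runs to the last epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    return len(val_losses) - 1, best_epoch


# Made-up validation losses: they fall, then rise as the model overfits.
valid_loss = [0.90, 0.55, 0.40, 0.38, 0.41, 0.45, 0.52]

stop_epoch, best_epoch = early_stopping_epoch(valid_loss, patience=2)
print("stop at epoch", stop_epoch, "and restore weights from epoch", best_epoch)
```

In practice a framework callback usually handles this; Keras, for example, provides an `EarlyStopping` callback with similar `patience` and `min_delta` settings. A larger `patience` tolerates noisy validation curves at the cost of a few extra epochs of training.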