
DS407BKK

Neural Networks and Computer Vision

Bangkok Campus
Jan 08, 2024 - Jan 26, 2024
During this course, students will go over the fundamentals of Deep Learning and its main frameworks, and explore neural architectures related to basic computer vision problems.

Faculty Profiles

Sergey Nikolenko

Chief Research Officer, Neuromation; Head of AI Lab, PDMI RAS

Aleksei Shabanov

Applied Data Scientist / Machine Learning Engineer

Course length: 3 weeks
Duration: 3 hours per day
Total hours: 45 hours
Credits: 6 ECTS
Language: English
Course type: Offline
Fee for single course: €1500
Fee for degree students: €750

Skills you’ll learn

Computer Vision, Data Science, Deep Learning basics, Basic neural networks, Fundamentals of Generative Models

Overview

Deep learning, i.e., training multilayered neural architectures, is one of the oldest tools in machine learning, yet it has revolutionized the industry over the last decade. In this course, we begin with the fundamentals of deep learning and then proceed to modern architectures for basic computer vision problems: image classification, object detection, segmentation, and others.

Modern computer vision is almost entirely based on deep convolutional neural networks, so this is a natural fit: it lets us explore interesting architectures while staying focused, without attempting too broad a survey of the entire field of deep learning. Computer vision is also a key element in robotics: vision systems are necessary for navigation, localization and mapping, and scene understanding, all of which are central problems in creating industrial and home robots.

Learning highlights

  • Learn to apply Deep Learning techniques in practice
  • Understand the theory behind Deep Learning from basics to state-of-the-art approaches
  • Learn how to train various deep neural architectures
  • Understand a wide variety of neural architectures suited for real-life computer vision problems
  • Gain essential experience with main Deep Learning frameworks

Course outline

15 classes

Dive into the details of the course and get a sense of what each class will cover.
Class 1 (Monday): Neural network basics

Neural networks: history and basic idea. The perceptron: basic construction, training, activation functions.

Practice: Tensors in PyTorch, the computational graph, functions, autograd.
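
For a taste of this first practice, here is a minimal autograd sketch in PyTorch (illustrative only, not the course notebook; the tiny function and values are arbitrary):

```python
import torch

# Two scalar tensors that require gradients, so PyTorch records the
# operations applied to them in a computational graph.
x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(3.0, requires_grad=True)

# A one-neuron "network": y = sigmoid(w * x).
y = torch.sigmoid(w * x)

# Backpropagate through the recorded graph.
y.backward()

# dy/dx = w * s * (1 - s) and dy/dw = x * s * (1 - s), where s = y.
print(x.grad, w.grad)
```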

Class 2 (Tuesday): Feedforward neural networks

Feedforward neural networks. Gradient descent basics. The computation graph and computing gradients on it (backpropagation).

Practice: PyTorch Modules, their parameters, eval/train modes, built-in optimizers.
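
As a minimal sketch of the nn.Module pattern (layer sizes here are arbitrary placeholders, not course settings):

```python
import torch
from torch import nn

# A small feedforward network; submodules assigned in __init__ are
# registered automatically and exposed through .parameters().
class MLP(nn.Module):
    def __init__(self, d_in=784, d_hidden=128, d_out=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_out),
        )

    def forward(self, x):
        return self.net(x)

model = MLP()
model.train()  # training mode (matters for dropout/batch norm, if present)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)  # built-in optimizer
```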

Class 3 (Wednesday): Optimization in neural networks

Gradient descent: motivation, problems. Modifications, ideas: momentum, Nesterov’s momentum, Adagrad, RMSProp, Adam. Second-order methods.

Practice: PyTorch: losses, datasets, first training loop, collate_fn.
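
A first training loop compresses these pieces into a few lines. The sketch below uses a toy random dataset as a stand-in; hyperparameters are arbitrary:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy data: 256 random feature vectors with random class labels.
data = TensorDataset(torch.randn(256, 784), torch.randint(0, 10, (256,)))
loader = DataLoader(data, batch_size=32, shuffle=True)

model = nn.Linear(784, 10)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(3):
    for inputs, targets in loader:
        optimizer.zero_grad()                     # clear old gradients
        loss = criterion(model(inputs), targets)  # forward pass + loss
        loss.backward()                           # backpropagation
        optimizer.step()                          # parameter update
```

(A custom collate_fn would be passed to the DataLoader to control how individual samples are stacked into a batch.)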

Class 4 (Thursday): Regularization in neural networks

Regularization: L1, L2, early stopping. Dropout. Data augmentation.

Practice: Implementing different optimizers.
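
As a flavour of the optimizer practice, here is one possible hand-rolled SGD-with-momentum step; the function name and signature are illustrative, not from the course materials:

```python
import torch

def sgd_momentum_step(params, velocities, lr=0.01, mu=0.9, weight_decay=0.0):
    """One update: v <- mu * v + g; p <- p - lr * v.
    weight_decay adds the L2-regularization gradient (lambda * p)."""
    with torch.no_grad():
        for p, v in zip(params, velocities):
            if p.grad is None:
                continue
            g = p.grad + weight_decay * p
            v.mul_(mu).add_(g)   # accumulate momentum in place
            p.sub_(lr * v)       # gradient step in place

# Usage: velocities = [torch.zeros_like(p) for p in model.parameters()],
# then call sgd_momentum_step(...) after each loss.backward().
```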

Class 5 (Friday): Weight initialization and batch norm

Weight initialization: supervised pre-training idea, why straightforward random init fails, Xavier initialization. Covariate shift and batch normalization.

Practice: Components of training neural networks: learning rate and its scheduling, optimizers, early stopping, batch size, troubleshooting.
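
These ideas map directly onto a few PyTorch calls; an illustrative sketch with arbitrary layer sizes and schedule parameters:

```python
import torch
from torch import nn

layer = nn.Linear(256, 256)

# Xavier (Glorot) initialization keeps activation variance roughly
# constant across layers, so gradients neither vanish nor explode early.
nn.init.xavier_uniform_(layer.weight)
nn.init.zeros_(layer.bias)

# Batch normalization re-centers and re-scales activations per batch.
block = nn.Sequential(layer, nn.BatchNorm1d(256), nn.ReLU())

# A common learning-rate schedule: multiply lr by 0.1 every 30 epochs.
opt = torch.optim.SGD(block.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(opt, step_size=30, gamma=0.1)
```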

Class 6 (Monday): Convolutional neural networks

Convolutional architectures: idea and structure. Modern convolutional architectures: AlexNet, VGG, network-in-network, GoogLeNet, ResNet, EfficientNet.

Practice: Fine-tuning an image classifier (ResNet), augmentations, working with GPUs.
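
Fine-tuning usually means freezing a pretrained backbone and training a fresh head. A minimal sketch, assuming a recent torchvision with the weights API (the 10-class head is a placeholder):

```python
import torch
from torch import nn
from torchvision import models

# ImageNet-pretrained ResNet-18 with its classifier head replaced.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                     # freeze the backbone
model.fc = nn.Linear(model.fc.in_features, 10)  # new trainable head

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```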

Class 7 (Tuesday): Object detection and segmentation

Object detection: the R-CNN family, the YOLO family. Image segmentation: FCNs, U-Net, Mask R-CNN.

Practice: Working with pre-trained object detectors, precision-recall curves, mAP.
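
Running a pretrained detector takes only a few lines. The sketch below assumes a recent torchvision and feeds a random tensor as a stand-in for a real image:

```python
import torch
from torchvision import models

# An off-the-shelf Faster R-CNN detector pretrained on COCO.
detector = models.detection.fasterrcnn_resnet50_fpn(
    weights=models.detection.FasterRCNN_ResNet50_FPN_Weights.DEFAULT)
detector.eval()

# Detection models take a list of [C, H, W] images scaled to [0, 1].
images = [torch.rand(3, 480, 640)]
with torch.no_grad():
    predictions = detector(images)

# Each prediction holds bounding boxes, class labels, and scores:
# the raw ingredients of precision-recall curves and mAP.
print(predictions[0]["boxes"].shape, predictions[0]["labels"][:5])
```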

Class 8 (Wednesday): Generative models in deep learning

Generative models and neural networks. Types of generative models. Autoregressive deep learning models, WaveNet.

Practice: Training a semantic segmentation model.
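
For the segmentation practice, inference with an off-the-shelf model looks roughly like this (assuming a recent torchvision; the input is a random stand-in for a real batch):

```python
import torch
from torchvision import models

# Pretrained DeepLabV3: per-pixel class scores over 21 classes.
seg = models.segmentation.deeplabv3_resnet50(
    weights=models.segmentation.DeepLabV3_ResNet50_Weights.DEFAULT)
seg.eval()

batch = torch.rand(1, 3, 256, 256)
with torch.no_grad():
    logits = seg(batch)["out"]   # shape: [1, 21, 256, 256]
mask = logits.argmax(dim=1)      # per-pixel predicted class
```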

Class 9 (Thursday): Mid-term test

Class 10 (Friday): Generative adversarial networks

Generative adversarial networks: idea, DCGAN, AAE, conditional GANs. Wasserstein GANs. Various loss functions in GANs. GANs for image generation.

Practice: DCGAN on CIFAR-10: evaluation and analysis, Inception score, the dataset memorization problem.
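
The adversarial loop alternates discriminator and generator updates. A deliberately tiny sketch (the linear G and D below stand in for real DCGAN convolutional networks; sizes are arbitrary):

```python
import torch
from torch import nn

G = nn.Sequential(nn.Linear(100, 784), nn.Tanh())   # toy generator
D = nn.Sequential(nn.Linear(784, 1))                # toy discriminator
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.randn(64, 784)    # stand-in for a batch of real images
z = torch.randn(64, 100)       # latent noise
fake = G(z)

# Discriminator: push real toward 1, fake toward 0 (fake is detached
# so this step does not touch the generator).
loss_d = (bce(D(real), torch.ones(64, 1)) +
          bce(D(fake.detach()), torch.zeros(64, 1)))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator: try to make the discriminator output 1 on fakes.
loss_g = bce(D(fake), torch.ones(64, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```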

Class 11 (Monday): Variational autoencoders

Variational autoencoders: ideas, construction, derivation.

Practice: DCGAN: training on Fashion-MNIST.
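
The VAE objective from the lecture, the negative ELBO with its closed-form KL term for a diagonal Gaussian encoder, fits in a few lines; a sketch assuming a Bernoulli (sigmoid-output) decoder:

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, log_var):
    """z = mu + sigma * eps keeps sampling differentiable w.r.t. mu, sigma."""
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * log_var) * eps

def vae_loss(x, x_recon, mu, log_var):
    """Negative ELBO = reconstruction term + KL(q(z|x) || N(0, I))."""
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl
```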

Class 12 (Tuesday): Transformers

Another machine learning revolution: the Transformer architecture. Idea, formal description, applications. BERT and GPT families.

Practice: Transformers in practice: attention, multi-head attention, ViT. Self-supervised learning.
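
At the heart of the Transformer is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V; a minimal self-contained sketch:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """softmax(Q K^T / sqrt(d_k)) V, with optional masking."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

# Self-attention example: 2 sequences of 5 tokens, 64-dim per head.
q = k = v = torch.randn(2, 5, 64)
out = scaled_dot_product_attention(q, k, v)   # shape: [2, 5, 64]
```

Multi-head attention runs several such maps in parallel on learned projections of Q, K, and V and concatenates the results.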

Class 13 (Wednesday): Transformers for images and video

Vision Transformers (ViT). Transformers for video processing: problems and solutions.

Practice: Image retrieval: task setup, benchmarks, metrics, representation power of ViT vs ResNet.
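
Retrieval with a pretrained encoder boils down to comparing embeddings. An illustrative sketch with a ResNet-18 backbone as the encoder (a ViT would slot in the same way); all tensors are random stand-ins:

```python
import torch
from torch import nn
from torchvision import models

# A pretrained ResNet-18 minus its classifier serves as an image encoder.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc head
encoder.eval()

gallery_imgs = torch.rand(8, 3, 224, 224)   # stand-in gallery
query_img = torch.rand(1, 3, 224, 224)      # stand-in query

with torch.no_grad():
    gallery = encoder(gallery_imgs).flatten(1)   # [8, 512] embeddings
    query = encoder(query_img).flatten(1)        # [1, 512]

# Rank gallery images by cosine similarity to the query.
sims = torch.nn.functional.cosine_similarity(query, gallery)
ranking = sims.argsort(descending=True)
```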

Class 14 (Thursday): Case study: video retrieval

Multimodal Transformers: CLIP and BLIP. Transformers for video retrieval.

Practice: Training image retrieval models. Angular losses, contrastive losses. Sampling and mining for contrastive losses.
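
Contrastive-style objectives make the embedding geometry explicit: pull matching pairs together, push non-matching ones apart. A sketch of one common variant, the triplet loss on L2-normalized embeddings (the margin value is arbitrary):

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Require the anchor-positive distance to beat the anchor-negative
    distance by at least `margin`; normalizing first makes squared
    Euclidean distance equivalent to an angular criterion."""
    anchor, positive, negative = (F.normalize(t, dim=-1)
                                  for t in (anchor, positive, negative))
    d_pos = (anchor - positive).pow(2).sum(dim=-1)
    d_neg = (anchor - negative).pow(2).sum(dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()
```

Mining strategies then decide which (anchor, positive, negative) triplets are worth training on, e.g. hard negatives that currently violate the margin.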

Class 15 (Friday): Final exam

Prerequisites

Master’s-level Machine Learning

Python programming experience

At least basic knowledge of Linear Algebra, Probability Theory, and Optimization

Methodology

The course is organized into three-hour sessions plus self-study practical assignments. Sessions combine theoretical and practical parts, with the ratio varying depending on the material.

Grading

The final grade will be composed of the following criteria:
40% - Homework Assignments
20% - Theoretical Tests
40% - Final Project

Faculty

Sergey Nikolenko

Chief Research Officer, Neuromation; Head of AI Lab, PDMI RAS

Sergey Nikolenko is a computer scientist with vast experience in machine learning and data analysis, algorithm design and analysis, theoretical computer science, and algebra. He graduated from St. Petersburg State University in 2005, majoring in algebra (Chevalley groups), and earned his Ph.D. at the Steklov Mathematical Institute at St. Petersburg in 2009 in theoretical computer science (circuit complexity and theoretical cryptography). Since then, Sergey has been interested in machine learning and probabilistic modeling, producing theoretical results and working on practical projects for industry.

Sergey Nikolenko currently serves as Chief Research Officer at Neuromation, leads the Artificial Intelligence Lab at the Steklov Mathematical Institute at St. Petersburg, and teaches at St. Petersburg State University and the Higher School of Economics. Dr. Nikolenko has published more than 170 research papers on machine learning (ICML, CVPR, ACL, SIGIR, WSDM...), analysis of algorithms (SIGCOMM, INFOCOM, ICNP...), and other fields; several books, including a bestselling “Deep Learning” book (in Russian); and lecture courses in ML, DL, and other areas of computer science (St. Petersburg State University, NRU Higher School of Economics...). He has extensive experience in managing research and industrial AI/ML projects.

See full profile

Faculty

Aleksei Shabanov

Applied Data Scientist / Machine Learning Engineer

Aleksei Shabanov is an Applied Data Scientist / Machine Learning Engineer with 7+ years of industry experience. His main interest is deep computer vision. Aleksei has hands-on experience in image search, person tracking and re-identification, object detection, segmentation, and many other areas. He is also the main author of the open-source project Open Metric Learning and a former active contributor to the Catalyst library. As a teacher, Aleksei usually focuses on practical Deep Learning, linking theory to industry applications.

See full profile

Apply for this course

Snap up your chance to enroll before all spaces fill up.

Neural Networks and Computer Vision

by Sergey Nikolenko, Aleksei Shabanov

Total hours: 45 hours
Dates: Jan 08 - Jan 26, 2024
Fee for single course: €1500
Fee for degree students: €750

How to secure your spot

1. Complete the form below to kickstart your application
2. Schedule your Harbour.Space interview
3. If successful, get ready to join us on campus

FAQ

Will I receive a certificate after completion?

Yes. Upon completion of the course, you will receive a certificate signed by the director of the program your course belongs to.

Do I need a visa?

This depends on your case. Please check with the Spanish or Thai consulate in your country of residence about visa requirements. We will do our part to provide you with the necessary documents, such as the Certificate of Enrollment.

Can I get a discount?

Yes. The easiest way to enroll in a course at a discounted price is to register for multiple courses. Registering for multiple courses will reduce the cost per individual course. Please ask the Admissions Office for more information about the other kinds of discounts we offer and what you can do to receive one.