CS207

Faculty
Sergey Khoroshenkikh
Senior Software Engineer at Yandex
Skills you’ll learn
The Natural Language Processing (NLP) field has gained attention in recent years because of impressive algorithmic advances in Deep Learning and significant progress in hardware.
Text Mining is a subset of NLP focused on unsupervised and semi-supervised algorithms for text analysis. The course covers the main algorithms and concepts of Text Mining, including both “classical” methods from the Information Retrieval domain (like TF-IDF and topic modelling) and modern Deep Learning architectures.
15 classes
NLP pipeline with spaCy. TF-IDF. Text analysis with scikit-learn.
Definition, algorithms, and evaluation.
Algorithms and feature engineering.
K-means clustering. Non-negative matrix factorisation (NMF).
Latent Semantic Indexing (LSI). Latent Dirichlet Allocation (LDA).
Distributional hypothesis. Word2Vec algorithm.
GloVe algorithm. Typical use cases of word vectors in NLP tasks. Word2Vec in recommendation systems. Analysis of graphs using Node2Vec.
Feedforward neural networks. Computation graph and backpropagation. Optimization methods.
Tensors, gradients, layers
Vanilla RNN. Neural Language Models.
Vanishing gradients. LSTM and GRU. Bidirectional RNN.
Contextual word embeddings. ELMo. ULMFiT.
Attention. Transformer block, encoder, decoder. BERT.
Case study: news aggregator
Final project session
Strong programming background (Python).
Understanding of machine learning concepts and algorithms.
Solid knowledge of multivariate calculus and linear algebra.
The course focuses on practical tools and applications of text mining while providing the necessary theoretical and algorithmic background.
During the course, students will choose a text mining problem, explore it and present the research results in the final session. Also, sessions 1-13 will be followed by graded assignments.
Sergey Khoroshenkikh is a senior software engineer with eight years of experience in applied machine learning and data analysis. He graduated from the Moscow Institute of Physics and Technology in 2015. At Yandex, he has been working on large-scale machine learning solutions for web advertising as well as routing algorithms for Yandex Delivery.
Research/Academic Interests: Random graphs, complex networks
Total hours
45 Hours
Dates
Feb 21 - Mar 11, 2022
Fee for single course
€1500
Fee for degree students
€750
How to secure your spot
Complete the form below to kickstart your application
Schedule your Harbour.Space interview
If successful, get ready to join us on campus
FAQ
Will I receive a certificate after completion?
Yes. Upon completion of the course, you will receive a certificate signed by the director of the program your course belonged to.
Do I need a visa?
This depends on your case. Please check with the Spanish or Thai consulate in your country of residence about visa requirements. We will do our part to provide you with the necessary documents, such as the Certificate of Enrollment.
Can I get a discount?
Yes. The easiest way to enroll in a course at a discounted price is to register for multiple courses. Registering for multiple courses will reduce the cost per individual course. Please ask the Admissions Office for more information about the other kinds of discounts we offer and what you can do to receive one.