
Big Data and Distributed Data Analysis

Online
Jul 27, 2020 - Aug 14, 2020

Faculty

Alexey Dral

Founder and CEO at BigData Team; Head of “Big Data for Data Engineers” Coursera specialisation

Course length: 3 weeks
Duration: 3 hours per day
Total hours: 45 hours
Credits: 6 ECTS
Language: English
Course type: Online
Fee for single course: €1500
Fee for degree students: €750

Skills you’ll learn

Big Data, SQL, Spark DataFrame, GraphFrame

Overview

During this course, students will build and sharpen their knowledge of the core technologies of the modern Big Data landscape: HDFS, MapReduce, Hive, Spark, and NoSQL. The course pays particular attention to efficient data warehousing with Hive and Spark.

Under the teacher’s supervision, students will study the internals of these systems and their applications, and learn what distributed file systems are for and how they are used. They will practise with the MapReduce framework, a workhorse of many modern Big Data applications. A key element of the course is putting this knowledge into practice by processing texts and solving sample business cases. Participants will work with Spark, the next-generation computational framework, from its basic concepts up to advanced techniques for squeezing out maximum performance. Finally, they will build and deploy their own service that uses SQL or NoSQL databases at scale.

Learning highlights

  • Construct their own Big Data service
  • Identify practical problems that can be solved with machine learning at scale
  • Explain how NoSQL databases work and when to use them, compared to traditional RDBMS
  • Optimise a data warehouse for storage and processing
  • Apply the acquired skills in finance, social networks, telecommunications, and many other fields

Course outline

15 classes

Dive into the details of the course and get a sense of what each class will cover.
  • Class 1 (Monday): Working with distributed file systems (HDFS)
  • Class 2 (Tuesday): Understanding and working with MapReduce
  • Class 3 (Wednesday): Understanding and working with MapReduce
  • Class 4 (Thursday): SQL over Big Data: Hive
  • Class 5 (Friday): SQL over Big Data: Hive
  • Class 6 (Monday): Spark: in-memory computational model
  • Class 7 (Tuesday): Spark: in-memory computational model
  • Class 8 (Wednesday): Spark DataFrame / SQL / GraphFrame
  • Class 9 (Thursday): Spark DataFrame / SQL / GraphFrame
  • Class 10 (Friday): Big Data application examples and Spark optimisation
  • Class 11 (Monday): Spark ML: classification / regression / clustering
  • Class 12 (Tuesday): Spark ML: classification / regression / clustering
  • Class 13 (Wednesday): NoSQL (HBase / Cassandra / …)
  • Class 14 (Thursday): NoSQL (HBase / Cassandra / …)
  • Class 15 (Friday): Building a Big Data service

Prerequisites

This course is one of three in a holistic series.

Students who have already taken MSL-111, and those with prior experience building simple web pages with HTML, CSS, and JavaScript, will be good candidates for this module.

Faculty

Alexey Dral

Founder and CEO at BigData Team; Head of “Big Data for Data Engineers” Coursera specialisation

Alexey Dral is the Founder and Chief Executive Officer at BigData Team. He has 10 years of experience at leading Russian and international companies, including Amazon AWS, Yandex, and Rambler, where he worked on large-scale problems, led R&D teams, and defined strategy for entire departments. This background makes him a strong mentor who aims to share practical insight with his students. He teaches around the globe, launches new onsite and online courses to narrow the gap between industry and academia, and supervises Data Science and Data Engineering initiatives.

  • 10 years of IT experience (Amazon AWS, Yandex, Rambler)


How to secure your spot

Complete the form below to kickstart your application

Schedule your Harbour.Space interview

If successful, get ready to join us on campus

FAQ

Will I receive a certificate after completion?

Yes. Upon completion of the course, you will receive a certificate signed by the director of the program your course belongs to.

Do I need a visa?

This depends on your case. Please check with the Spanish or Thai consulate in your country of residence about visa requirements. We will do our part to provide you with the necessary documents, such as the Certificate of Enrollment.

Can I get a discount?

Yes. The easiest way to enroll in a course at a discounted price is to register for multiple courses. Registering for multiple courses will reduce the cost per individual course. Please ask the Admissions Office for more information about the other kinds of discounts we offer and what you can do to receive one.