Apache Spark
Apache Spark has gained popularity for its ability to handle diverse data processing workloads, including batch processing, real-time data streaming, machine learning, and graph processing.
Its flexibility, performance, and rich set of libraries make it a versatile choice for building data-intensive applications and performing complex analytics on large-scale datasets.
This introductory course covers the main concepts in Apache Spark:
- the core (architecture, RDDs/DataFrames/Datasets, transformations & actions, DAG)
- the SQL engine
- the streaming engine
- machine learning libraries
It also highlights how Spark can be applied to use cases such as ETL, analytics, and machine learning.
During the course, we will build an end-to-end case with Spark, covering data input, data cleaning, data storage, and machine learning. We will work in a cloud environment and use Apache Zeppelin for all Spark coding and exercises (in Scala).
This course is taught in Romanian, with course materials available in either English or Romanian. The course can be customized on request.
Contact Us
Feel free to leave us your thoughts so we can discover the solution together!
academy@esolutions.ro
Get in touch
0753.029.187
Our address
20 Constantin Budisteanu Street, 1st District, Bucharest