Big Data Architecture & Technology Concepts

Valentina Crisan

This course is taught in Romanian; the materials are in English and/or Romanian, as applicable.

The course can be customized on request.

This course is intended for:
– Architects from other domains who want to understand what architecting a Big Data solution involves
– Solution owners/engineers familiar with individual Big Data technologies (Hadoop, Spark, NoSQL solutions, ETL components, ML) who would like to understand “the big picture” of what architecting a Big Data solution means.

The course is designed so that participants understand the usage and applicability of Big Data technologies such as Hadoop, Spark, Cassandra, HBase and Kafka, and which aspects to consider when starting to build a Big Data architecture.

Course features

  • Chapters: 28
  • Duration: 4 days
  • Knowledge level: Any level
  • Language: Romanian
  • Participants: 12
  • Day 1: Big Data Architecture overview: components and their role in an architecture

    • Chapter 1.1 Specific technologies overview and details
    • Chapter 1.2 Storage: NoSQL databases (random reads on data)
    • Chapter 1.3 Overview of different NoSQL solutions
    • Chapter 1.4 Cassandra detailed overview
    • Chapter 1.5 Main concepts: data partitioning, distribution, replication, consistency, how to write and read data, compaction
    • Chapter 1.6 How data is inserted / updated / deleted
    • Chapter 1.7 Understanding the main patterns and anti-patterns
    • Chapter 1.8 Basic data modeling rules for best Cassandra performance
    • Chapter 1.9 Use case based on Cassandra – hands-on session
  • Day 2: HBase

    • Chapter 2.1 Main concepts recap
    • Chapter 2.2 Differences from Cassandra
    • Chapter 2.3 Main data modeling aspects: how data is best read
    • Chapter 2.4 How to choose a NoSQL solution – the considerations
    • Chapter 2.5 Recap of storage options: long-term storage of immutable data (HDFS) & random writes/reads of data (HBase/Cassandra/..)
  • Day 3: Distributed data processing frameworks

    • Chapter 3.1 Overview of different solutions: Storm, Spark, …
    • Chapter 3.2 Distributed computations and stream processing with Spark
    • Chapter 3.3 Spark as an ETL tool – examples and demo
    • Chapter 3.4 Examples of using Spark and Spark Streaming + demo session
    • Chapter 3.5 Data analysis
    • Chapter 3.6 SQL-on-everything options: Hive, Impala, Spark SQL, Apache Drill
    • Chapter 3.7 Integrating Cassandra with Spark SQL for data analytics: what kinds of analytics can be performed
    • Chapter 3.8 Hands-on exercises for analytics with Spark SQL and Cassandra
    • Chapter 3.9 Search solutions – the extra capability needed besides an SQL engine
    • Chapter 3.10 Overview of the capabilities of Solr and Elastic – what extra functionality they add to an SQL engine solution
  • Day 4: Messaging bus: Kafka

    • Chapter 4.1 Why a messaging bus in a Big Data architecture?
    • Chapter 4.2 Kafka demo in combination with Spark Streaming
    • Chapter 4.3 Clustering and resource management: Mesos
    • Chapter 4.4 End-to-end example including Machine Learning (we will complete/demo a NoSQL + Spark app with ML)