Intro to Big Data Hadoop Architecture

Valentina Crisan

This course is taught in Romanian; the materials are in English and/or Romanian, as applicable.

On request, the course can be customized.

The aim of this course is to provide an understanding of the Apache Hadoop architecture and its commercial distributions, with a special emphasis on Cloudera and MapR. It is intended for people working at an architectural level. By the end of the course, participants will understand the Hadoop ecosystem, the applicability and scope of its components, and how two of its commercial distributions, Cloudera and MapR, compare. The course includes real examples of Cloudera and MapR installations and the architectural considerations that led to the respective choices.

Course features

  • Chapters 29
  • Duration 3 days
  • Knowledge level Any level
  • Language Romanian
  • Participants 12
  • Day 1: Intro to Big Data and Hadoop architecture/ecosystem

    • Chapter 1.1 Big Data: a bit of history and the V’s
    • Chapter 1.2 Some well-known and not so well-known Big Data uses
    • Chapter 1.3 Lambda architecture overview
    • Chapter 1.4 What is Hadoop, HDFS emergence and MapReduce evolution
    • Chapter 1.5 Use cases of Hadoop
    • Chapter 1.6 Hadoop architecture in overview and in detail, with applicability cases: HDFS & MapReduce essentials, YARN
    • Chapter 1.7 Exercises on MapReduce
    • Chapter 1.8 Apache Hive, Impala + exercises
    • Chapter 1.9 Understanding the role of file formats in Hadoop: Apache Avro, Parquet, ORC (hands-on exercises)
  • Day 2: Other projects that are most often part of the Hadoop ecosystem

    • Chapter 2.1 Data storage:
    • Chapter 2.2 Architecting data in Hadoop: storage option considerations
    • Chapter 2.3 Apache HBase in the Hadoop ecosystem
    • Chapter 2.4 Distributed systems and the CAP theorem
    • Chapter 2.5 NoSQL solutions overview (incl. Redis)
    • Chapter 2.6 Data computing: Apache Spark intro
    • Chapter 2.7 Data analysis: SQL on Hadoop: Hive, Impala; Search: Solr/Elastic; Spark SQL
    • Chapter 2.8 Machine learning: Spark MLlib
    • Chapter 2.9 Data ingestion: Apache Kafka
    • Chapter 2.10 Other: Oozie, ZooKeeper, Hue
    • Chapter 2.11 Building a possible architecture for streaming data and batch data (incl. ML): trainer-driven exercises
  • Day 3: Commercial distributions of Hadoop

    • Chapter 3.1 Cloudera:
    • Chapter 3.2 Architecture and components
    • Chapter 3.3 Cloudera-specific tools and functions
    • Chapter 3.4 Use cases of Cloudera
    • Chapter 3.5 MapR:
    • Chapter 3.6 Architecture and components
    • Chapter 3.7 MapR-specific features
    • Chapter 3.8 Use cases of Hadoop
    • Chapter 3.9 Comparison of Cloudera and MapR