Big Data Architecture

Syllabus

  • Introduction
    • Big Data Fundamentals and Concepts
    • Components of Big Data Applications
    • CAP Theorem and Eventual Consistency
    • Batch- vs Realtime Processes
    • Lambda Architecture
    • Exercises
  • Batch-Processing
    • MapReduce
    • Workflow Organization
    • NoSQL Key-Value Stores
    • Exercises
  • Realtime Processing
    • Message Passing
    • Stream Processing
    • NoSQL Database Cassandra
    • Exercises
  • Workflow Processing
    • Storm Workflows
    • Cascading Workflows
  • Big Data Technology Groups

Introduction

Learning Objectives

  • Knowledge:
    • Big Data Processing vs. Big Data Storage Systems
    • Hadoop & Dynamo Overview
    • CAP Theorem
    • Lambda Architecture
    • Batch vs. Realtime View
  • Skills:
    • NoSQL databases
    • SPRAIN
    • Complex event processing
    • Horizontal scalability
    • Eventual consistency
    • Big data definition
    • Precomputed views 
    • Denormalization
    • Relationship to commercial Big Data Architectures

Big Data Fundamentals and Concepts

Batch-Processing

Learning Objectives

  • Knowledge:
    • MapReduce Concept
    • Key-Value Store Interface
  • Skills:
    • MapReduce concept
    • Hadoop
    • Hadoop Distributed File System (HDFS)
    • Hadoop MapReduce example
    • HDFS replication
    • YARN
    • Distributed hash tables
    • Quorum-based systems
    • Conflict resolution with vector clocks
    • Distributed key-value storage (Voldemort)

MapReduce

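The learning objectives for this module list the MapReduce concept and a Hadoop MapReduce example. As a rough illustration of the concept only, the sketch below runs the map, shuffle, and reduce steps in a single Python process. It does not use Hadoop's API; the word-count task and the function names (map_phase, shuffle, reduce_phase, word_count) are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by key (done by the framework in practice)."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce sequentially over all input documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big batch"]))
```

In a cluster the map and reduce calls would run in parallel on different nodes, and the grouping step would be carried out by the framework rather than by user code.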

Workflow Organization

NoSQL Key-Value Stores

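The skills list for this module names conflict resolution with vector clocks. The sketch below shows one common way to represent and compare them, assuming a clock is a plain dict that maps a node name to a counter. The function names are hypothetical, and this is not Voldemort's actual implementation.

```python
def increment(clock: dict[str, int], node: str) -> dict[str, int]:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: dict[str, int], b: dict[str, int]) -> bool:
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Classify two versions as equal, ordered, or concurrent (conflict)."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a_newer"
    if b_ge_a:
        return "b_newer"
    return "conflict"


def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Element-wise maximum: the clock for a value that reconciles a and b.

    Merging clocks does not reconcile the stored values themselves;
    that part is left to the application.
    """
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


if __name__ == "__main__":
    v1 = increment({}, "A")   # first write, coordinated by node A
    v2 = increment(v1, "B")   # update of v1 handled by node B
    v3 = increment(v1, "C")   # concurrent update of v1 handled by node C
    print(compare(v2, v1), compare(v2, v3), merge(v2, v3))
```

Two versions conflict when neither clock descends from the other, which is the case that a store has to hand back to the client or resolve by some agreed rule.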

Exercises

Components of Big Data Applications

CAP Theorem and Eventual Consistency

Batch- vs Realtime Processes

Lambda Architecture

Realtime Processing

Learning Objectives

  • Knowledge:
    • Speed Layer Overview
    • Storm Overview
  • Skills:
    • Messaging systems (Kafka)
    • Publish/subscribe communication model (ZooKeeper)
    • Parallel stream processing (Storm)
    • Topologies: spouts, bolts, tasks
    • Stream grouping
    • Storm vs. Hadoop?
    • Key-value stores with columns (Cassandra)
    • High write performance
    • Speed layer


Message Passing

Stream Processing

NoSQL Database Cassandra

Exercises

Workflow Processing

Learning Objectives

  • Knowledge:
    • Cascading Workflow Engine
    • Workflows and Lambda Architecture 
  • Skills:
    • Cascading: taps, tuples, pipes and operators 
    • Join processing

Storm Workflows

Cascading Workflows

Big Data Technology Groups

Learning Objectives

  • Big Data technology groups
  • NoSQL databases
  • Distributed batch processing
  • Distributed stream processing
  • Complex event processing
  • Data extraction & transformation technologies
  • Distributed search & indexing of documents
  • Data export/import
  • Distributed data analysis techniques
  • Hadoop ecosystem
  • Open source products
