Cloud Computing

Course Description

Cloud computing systems today, whether open-source or used inside companies, are built using a common set of core techniques, algorithms, and design philosophies, all centered around distributed systems. Learn the fundamental distributed computing concepts behind cloud computing.
Some of these concepts include: clouds, MapReduce, key-value/NoSQL stores, classical distributed algorithms, widely-used distributed algorithms, scalability, trending areas, and much, much more!
Know how these systems work from the inside out. Get your hands dirty using these concepts with provided homework exercises. In the programming assignments, implement some of these concepts in template code (programs) provided in the C++ programming language. Prior experience with C++ is required.
The course also features interviews with leading researchers and managers, from both industry and academia.

Course Instructor


Mustakin Choudhury

An experienced corporate trainer specializing in cloud computing and cyber security.



Orientation, Introduction to Clouds, MapReduce

This course is oriented towards learners with backgrounds similar to those of juniors and seniors in a CS undergraduate curriculum. Since learners come from various backgrounds, it is critical that you view this lecture AND pass the prerequisite test. This will ensure you have the prerequisite knowledge the course assumes.

  • Introduction to Cloud Computing Concepts
  • Orientation Towards Cloud Computing Concepts: Some Basic Computer Science Fundamentals
  • Week 1 Introduction
  • 1.1. Why Clouds?
  • 1.2. What is a Cloud?
  • 1.3. Introduction to Clouds: History
  • 1.4. Introduction to Clouds: What's New in Today's Clouds
  • 1.5. Introduction to Clouds: New Aspects of Clouds
  • 1.6. Introduction to Clouds: Economics of Clouds
  • 2.1. A cloud IS a distributed system
  • 2.2. What is a distributed system?
  • 3.1. MapReduce Paradigm
  • 3.2. MapReduce Examples
  • 3.3. MapReduce Scheduling
  • 3.4. MapReduce Fault-Tolerance
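
As a rough illustration of the MapReduce paradigm listed above, here is a minimal single-process word-count sketch in C++. The function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are hypothetical and are not taken from the course's template code. A real MapReduce framework runs the map and reduce tasks on many machines and handles scheduling and fault-tolerance, none of which is modeled here.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, int>;

// Map phase: emit (word, 1) for every whitespace-separated word in one input split.
std::vector<KeyValue> map_words(const std::string& split) {
    std::vector<KeyValue> out;
    std::istringstream in(split);
    std::string word;
    while (in >> word) out.emplace_back(word, 1);
    return out;
}

// Shuffle phase: group all emitted values by key.
std::map<std::string, std::vector<int>> shuffle(const std::vector<KeyValue>& pairs) {
    std::map<std::string, std::vector<int>> groups;
    for (const auto& kv : pairs) groups[kv.first].push_back(kv.second);
    return groups;
}

// Reduce phase: sum the values collected for one key.
int reduce_counts(const std::vector<int>& values) {
    assert(!values.empty());
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Driver (an assumption of this sketch): runs the three phases sequentially.
std::map<std::string, int> word_count(const std::vector<std::string>& splits) {
    std::vector<KeyValue> emitted;
    for (const auto& s : splits) {
        auto part = map_words(s);
        emitted.insert(emitted.end(), part.begin(), part.end());
    }
    std::map<std::string, int> result;
    for (const auto& g : shuffle(emitted)) result[g.first] = reduce_counts(g.second);
    return result;
}
```

Calling `word_count` with a vector of text splits returns a map from each word to its total count across all splits.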


Gossip, Membership, and Grids

Lesson 1: This module teaches how the multicast problem is solved using epidemic/gossip protocols, and how to analyze such protocols.
Lesson 2: This module covers the design of failure detectors, a key component in any distributed system. Membership protocols, which use failure detectors as components, are also covered.
Lesson 3: This module covers Grid computing, an important precursor to cloud computing.

  • 1.1. Multicast Problem
  • 1.2. The Gossip Protocol
  • 1.3. Gossip Analysis
  • 1.4. Gossip Implementations
  • 2.1. What is Group Membership List?
  • 2.2. Failure Detectors
  • 2.3. Gossip-Style Membership
  • 2.4. Which is the best failure detector?
  • 2.5. Another Probabilistic Failure Detector
  • 2.6. Dissemination and suspicion
  • 3.1. Grid Applications
  • 3.2. Grid Infrastructure
  • Interview with William Gropp
  • Week 2 Overview
  • Homework 2 Instructions
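
To give a feel for how gossip solves the multicast problem, the following sketch simulates a simple push-gossip protocol. It is an illustration only, under these assumptions: rounds are synchronous, node 0 is the sender, targets are chosen uniformly at random (a node may pick itself or an already-informed node), and no messages are lost. The function name `gossip_rounds` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Simulates push gossip: in each round, every node that already holds the
// multicast message forwards it to `fanout` targets chosen uniformly at random.
// Returns the number of rounds until every node holds the message,
// or -1 if that has not happened after `max_rounds` rounds.
int gossip_rounds(std::size_t num_nodes, int fanout, unsigned seed, int max_rounds = 1000) {
    assert(num_nodes > 0 && fanout > 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, num_nodes - 1);

    std::vector<bool> infected(num_nodes, false);
    infected[0] = true;  // node 0 is the multicast sender
    std::size_t infected_count = 1;

    for (int round = 0;; ++round) {
        if (infected_count == num_nodes) return round;
        if (round == max_rounds) return -1;
        // Snapshot the senders so that nodes infected during this round
        // only start gossiping in the next round.
        const std::vector<bool> senders = infected;
        for (std::size_t node = 0; node < num_nodes; ++node) {
            if (!senders[node]) continue;
            for (int i = 0; i < fanout; ++i) {
                std::size_t target = pick(rng);
                if (!infected[target]) {
                    infected[target] = true;
                    ++infected_count;
                }
            }
        }
    }
}
```

With a fanout of 2, the number of informed nodes can at most triple per round, so 1000 nodes need at least 7 rounds. In this simulation the count typically stays close to that logarithmic bound, and logarithmic spreading time is the property that makes gossip attractive.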


P2P Systems

This module teaches the detailed design of two classes of peer-to-peer systems: (a) popular ones, including Napster, Gnutella, FastTrack, and BitTorrent; and (b) efficient ones, including distributed hash tables (Chord, Pastry, and Kelips). Besides focusing on design, the module also analyzes these systems in detail.

  • 1. P2P Systems Introduction
  • 2. Napster
  • 3. Gnutella
  • 4. FastTrack and BitTorrent
  • 5. Chord
  • 6. Failures in Chord
  • 7. Pastry
  • 8. Kelips
  • Blue Waters Supercomputer
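
As a small illustration of the distributed hash table idea, the sketch below models a Chord-style identifier ring in a single process. It assumes identifiers in the range [0, 2^m), assigns each key to the first node at or after it on the ring, and builds a finger table whose i-th entry is the successor of (n + 2^i) mod 2^m. The class name `Ring` and its methods are hypothetical. Real Chord nodes hold only their own fingers and route lookups over the network, which this sketch does not model.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// A toy Chord-style ring with identifiers in [0, 2^m).
class Ring {
public:
    explicit Ring(int m) : m_(m) { assert(m > 0 && m < 32); }

    void join(std::uint32_t node_id) {
        assert(node_id < (1u << m_));
        nodes_.insert(node_id);
    }

    // The node responsible for `key`: the first node clockwise at or after it.
    std::uint32_t successor(std::uint32_t key) const {
        assert(!nodes_.empty());
        auto it = nodes_.lower_bound(key % (1u << m_));
        return it != nodes_.end() ? *it : *nodes_.begin();  // wrap around the ring
    }

    // Finger i of node n points at successor((n + 2^i) mod 2^m).
    std::vector<std::uint32_t> fingers(std::uint32_t node_id) const {
        std::vector<std::uint32_t> table;
        for (int i = 0; i < m_; ++i)
            table.push_back(successor((node_id + (1u << i)) % (1u << m_)));
        return table;
    }

private:
    int m_;
    std::set<std::uint32_t> nodes_;  // kept sorted, so lower_bound finds successors
};
```

For example, on a ring with m = 7 and nodes 16, 32, 45, 80, 96, and 112, key 42 is stored at node 45 and key 120 wraps around to node 16.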


Key-Value Stores, Time, and Ordering

Lesson 1: This module motivates and teaches the design of key-value/NoSQL storage/database systems. We cover the design of two major industry systems: Apache Cassandra and HBase. We also cover the famous CAP theorem.
Lesson 2: Distributed systems are asynchronous, which makes clocks at different machines hard to synchronize. This module first covers various clock synchronization algorithms, and then covers ways of tagging events with causal timestamps that avoid synchronizing clocks. These classical algorithms were invented decades ago, yet are used widely in today’s cloud systems.

  • 1.1. Why Key-Value/NoSQL?
  • 1.2. Cassandra
  • 1.3. The Mystery of X: The CAP Theorem
  • 1.4. The Consistency Spectrum
  • 1.5. HBase
  • 2.1. Introduction and Basics
  • 2.2. Cristian's Algorithm
  • 2.3. NTP
  • 2.4. Lamport Timestamps
  • 2.5. Vector Clocks
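
The idea of tagging events with causal timestamps instead of synchronizing clocks can be sketched as follows. This is a minimal illustration of Lamport timestamps and vector clocks, assuming a fixed, known number of processes. The type and function names (`LamportClock`, `VectorClock`, `happened_before`, `concurrent`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Lamport logical clock: one counter per process.
struct LamportClock {
    unsigned long time = 0;

    // Local event or send: increment and use the new value as the timestamp.
    unsigned long tick() { return ++time; }

    // Receive: jump past both the local clock and the message timestamp.
    unsigned long receive(unsigned long msg_time) {
        time = std::max(time, msg_time) + 1;
        return time;
    }
};

using Timestamp = std::vector<unsigned long>;

// Vector clock: one counter per process in the group.
struct VectorClock {
    std::size_t self;
    Timestamp v;

    VectorClock(std::size_t self_index, std::size_t num_processes)
        : self(self_index), v(num_processes, 0) {
        assert(self_index < num_processes);
    }

    // Local event or send: increment only this process's own entry.
    Timestamp tick() { ++v[self]; return v; }

    // Receive: take the element-wise maximum, then count the receive event.
    Timestamp receive(const Timestamp& msg) {
        assert(msg.size() == v.size());
        for (std::size_t i = 0; i < v.size(); ++i) v[i] = std::max(v[i], msg[i]);
        ++v[self];
        return v;
    }
};

// a happened before b iff a <= b in every entry and a < b in at least one.
bool happened_before(const Timestamp& a, const Timestamp& b) {
    assert(a.size() == b.size());
    bool strictly_less = false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] > b[i]) return false;
        if (a[i] < b[i]) strictly_less = true;
    }
    return strictly_less;
}

// Two distinct timestamps are concurrent if neither happened before the other.
bool concurrent(const Timestamp& a, const Timestamp& b) {
    return a != b && !happened_before(a, b) && !happened_before(b, a);
}
```

Lamport timestamps respect causality but cannot tell concurrent events apart. Vector timestamps can, at the cost of one counter per process.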


Classical Distributed Algorithms

Lesson 1: This module covers how to calculate a distributed snapshot, leveraging causality again to circumvent the synchronization problem.
Lesson 2: This module teaches how to order multicasts in any distributed system. It covers algorithms for assigning timestamp tags to multicasts under three flavors of ordering: FIFO, causal, and total. The module also covers virtual synchrony, a paradigm that combines reliable multicasts with membership views.
Lesson 3: Consensus is one of the most important problems in a distributed system, enabling multiple machines to agree. This module covers Paxos, one of the most popular consensus solutions used in the industry today. Paxos is not perfect because consensus cannot be solved completely in an asynchronous system; an optional lecture presents the famous FLP proof of the impossibility of consensus.

  • 1.1. What is Global Snapshot?
  • 1.2. Global Snapshot Algorithm
  • 1.3. Consistent Cuts
  • 1.4. Safety and Liveness
  • 2.1. Multicast Ordering
  • 2.2. Implementing Multicast Ordering
  • 2.3. Implementing Multicast Ordering
  • 2.4. Reliable Multicast
  • 2.5. Virtual Synchrony
  • 3.1. The Consensus Problem
  • 3.2. Consensus In Synchronous Systems
  • 3.3. Paxos, Simply
  • 3.4. The FLP Proof [OPTIONAL]
  • Interview with Tushar Chandra
  • Conclusion to Cloud Computing Concepts
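
As one example of the multicast ordering algorithms covered in this module, the sketch below shows the receiver side of FIFO ordering. It assumes each sender tags its multicasts with a per-sender sequence number starting at 1, and that the receiver buffers any message that arrives ahead of its turn. The class name `FifoReceiver` and its interface are hypothetical. Causal and total ordering need additional machinery that is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Receiver-side FIFO ordering: messages from the same sender are delivered
// in the order that sender multicast them.
class FifoReceiver {
public:
    explicit FifoReceiver(std::size_t num_senders)
        : next_(num_senders, 1), held_(num_senders) {
        assert(num_senders > 0);
    }

    // Handles one incoming multicast and returns the payloads that become
    // deliverable as a result, in delivery order.
    std::vector<std::string> receive(std::size_t sender, unsigned long seq,
                                     const std::string& payload) {
        assert(sender < next_.size());
        std::vector<std::string> delivered;
        if (seq < next_[sender]) return delivered;  // already delivered: drop duplicate

        auto& buffer = held_[sender];
        buffer.emplace(seq, payload);  // hold the message until its turn comes

        // Deliver the longest run of consecutive sequence numbers now available.
        for (auto it = buffer.find(next_[sender]); it != buffer.end();
             it = buffer.find(next_[sender])) {
            delivered.push_back(it->second);
            buffer.erase(it);
            ++next_[sender];
        }
        return delivered;
    }

private:
    std::vector<unsigned long> next_;                          // next expected seq per sender
    std::vector<std::map<unsigned long, std::string>> held_;   // hold-back buffers
};
```

If message 2 from a sender arrives before message 1, `receive` returns nothing for it. Both messages are then returned, in order, once message 1 arrives.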