Reading Questions, Lecture 23 (12/6)

Papers: High Availability in Streaming Systems

This paper addresses the problem of handling node failures in a distributed stream processing system. The paper first presents three types of recovery guarantees and then describes and compares various recovery techniques.

As you read the paper, consider the following questions:

  1. What are the main differences between precise recovery, rollback recovery, and gap recovery? What are some advantages of providing only weaker recovery guarantees?
  2. For rollback recovery, which approach has the lowest runtime overhead? Which approach offers the fastest recovery?
  3. In upstream backup, how does a node determine which tuples can safely be dropped by its upstream neighbors? How do input-tuple indicators help?
  4. Which properties of a query network affect the runtime overhead of high availability? Which properties affect the recovery time?

Samuel Madden (madden at csail dot mit dot edu)
Last modified: Fri Dec 3 11:00:31 EST 2004