6.830: Lecture 2

In this lecture, we will continue our discussion of data models and database system architecture, looking in more detail at the relational model.

Please read the following papers in preparation for lecture:

Michael Stonebraker and Joseph Hellerstein. What Goes Around Comes Around. In "Readings in Database Systems" (aka the Red Book), or online here (link to PDF). Read Sections 1-4 (if you know something about XML, you may also enjoy reading Sections 10 and 11).
Pages 57-63 of Ramakrishnan and Gehrke for a brief overview of the relational model.

Although it is not required, you may be curious to look at the original specification of the relational model by Ted Codd (the Turing Award winner who originated the relational data model). It's dense and mathy, but jam-packed with ideas:

E.F. Codd. A relational model of data for large shared data banks. Communications of the ACM, 1970. [PDF]. (This links to the ACM website and is only accessible with an ACM account or from an MIT IP address. Focus on Section 1.3 and all of Section 2.)

As you read these papers, think about the following questions and be prepared to answer them in lecture:

What is the notion of data independence? Why is it important?
What are the key ideas behind the relational model? Why are they an improvement over what came before? In what ways is the relational model restrictive?
What are the most important differences between the "hierarchical" model (as exemplified by systems like IMS) and the relational model that Codd proposed?