6.5830/6.5831: Database Systems
Spring 2026

Overview

This course introduces graduate and undergraduate students to the foundations of database systems, focusing on basics such as the relational algebra and data model, query optimization, query processing, and transactions. This is not a course on database design or SQL programming (though we will discuss these issues briefly). No prior database experience is assumed, though students who have taken an undergraduate course in databases are encouraged to attend.

Classes will consist of lectures and discussions based on readings from the database literature. For 6.5830, there will be a semester-long project, as well as two quizzes and 6 assignments -- 3 Go-based programming "Labs" and 3 problem sets. For 6.5831, quizzes and assignments are the same as for 6.5830, except that students may opt to do one additional lab in place of the final project.

Enrollment may be limited.

The course web site is http://dsg.csail.mit.edu/6.5830/.

Lectures

Lectures are held twice a week, from 2:30 – 4:00 pm on Tuesdays and Thursdays in 35-225. Attendance at lectures is mandatory and you are expected to show up prepared to answer questions and participate in discussion.

Topics Covered

Topics related to the engineering and design of database systems, including: data models; database and schema design; schema normalization and integrity constraints; query processing; query optimization and cost estimation; transactions; recovery; concurrency control; isolation and consistency; distributed, parallel, and heterogeneous databases; adaptive databases; trigger systems; key-value stores; object-relational mappings; streaming databases; DB as a Service. Lecture and readings from original research papers. 6.5830 includes a semester-long project.

Prerequisites

To do well in this class, students are expected to have taken 6.1210 (6.006, Introduction to Algorithms) or equivalent. They should also be familiar with computer systems (e.g., topics covered in 6.1800 (6.033, Computer Systems Engineering)) and feel comfortable debugging, implementing, and designing software at scale. If you do not have experience in these subjects and would like to take the course, please email the instructor. Prior database experience is not required. We will be using the Go programming language (https://go.dev); most students in the class will not have used it before. We have restructured the labs relative to previous years to include an initial Go tutorial. Please refer to the FAQ for more details.

Units

3-0-9. 6.5830 is a Grad-H class. It counts as an engineering concentration (EC) subject in Systems. For Area II Ph.D. students in EECS, it satisfies the Systems TQE requirement.

6.5831 is an undergraduate class designed to satisfy the AUS requirement in the EECS curriculum. For Spring 2026 (and possibly in future semesters) you may petition to have it satisfy the systems header requirement in lieu of 6.1800 (6.033) -- see here.

Grading

Grades are assigned based on labs, problem sets, 2 quizzes, a final project (for 6.5830), and class participation. Grading is handled differently for the two versions of the class; the breakdowns are as follows:

6.5830

  • Assignments (Problem Sets and Labs): 35% total
    • PSET 1: 3.33%
    • PSET 2: 5%
    • PSET 3: 5%
    • Lab 1: 6.67%
    • Lab 2: 6.67%
    • Lab 3: 8.33%
  • Quizzes: 15% each
  • Course Project: 30%
  • Class Participation: 5%

6.5831

  • Assignments (Problem Sets and Labs): 65% total
    • PSET 1: 5%
    • PSET 2: 7.5%
    • PSET 3: 7.5%
    • Lab 1: 10%
    • Lab 2: 10%
    • Lab 3: 12.5%
    • Lab 4: 12.5%
  • Quizzes: 15% each
  • Class Participation: 5%
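
As a sanity check on the two breakdowns above, the weights can be summed and applied to a set of scores. The following is a minimal Go sketch, not the staff's grading script: the type and function names (`component`, `sumsTo100`, `finalGrade`) and the labels "Quiz 1" and "Quiz 2" are made up for illustration, and it assumes every item is scored from 0 to 100 and combined as a plain weighted sum.

```go
package main

import (
	"fmt"
	"math"
)

// component is one graded item and its weight, in percent of the final grade.
type component struct {
	name   string
	weight float64
}

// Weights copied from the 6.5830 breakdown. "Quiz 1"/"Quiz 2" are
// illustrative labels for the two quizzes worth 15% each.
var weights5830 = []component{
	{"PSET 1", 3.33}, {"PSET 2", 5}, {"PSET 3", 5},
	{"Lab 1", 6.67}, {"Lab 2", 6.67}, {"Lab 3", 8.33},
	{"Quiz 1", 15}, {"Quiz 2", 15},
	{"Course Project", 30},
	{"Class Participation", 5},
}

// Weights copied from the 6.5831 breakdown.
var weights5831 = []component{
	{"PSET 1", 5}, {"PSET 2", 7.5}, {"PSET 3", 7.5},
	{"Lab 1", 10}, {"Lab 2", 10}, {"Lab 3", 12.5}, {"Lab 4", 12.5},
	{"Quiz 1", 15}, {"Quiz 2", 15},
	{"Class Participation", 5},
}

// totalWeight adds up the weights of all components.
func totalWeight(cs []component) float64 {
	total := 0.0
	for _, c := range cs {
		total += c.weight
	}
	return total
}

// sumsTo100 reports whether the weights add up to 100%,
// allowing for floating-point rounding.
func sumsTo100(cs []component) bool {
	return math.Abs(totalWeight(cs)-100) < 1e-6
}

// finalGrade returns the weighted sum of scores (each 0-100).
// Assumption: an item missing from scores counts as 0.
func finalGrade(cs []component, scores map[string]float64) float64 {
	grade := 0.0
	for _, c := range cs {
		grade += c.weight * scores[c.name] / 100
	}
	return grade
}

func main() {
	fmt.Println("6.5830 weights sum to 100%:", sumsTo100(weights5830))
	fmt.Println("6.5831 weights sum to 100%:", sumsTo100(weights5831))
}
```

Both lists add up to 100%: in 6.5830 the assignments contribute 35%, the quizzes 30%, the project 30%, and participation 5%; in 6.5831 the assignments contribute 65%, the quizzes 30%, and participation 5%.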

Each student is allowed 5 "late days", each of which may be used to turn in one problem set or lab one day (24-hour period) later than it is due without penalty. After all five late days are used, 10% of the corresponding assignment grade will be subtracted for each day an assignment is late. Each assignment may be late for a maximum of 5 days before resulting in a penalty of 100%.

Late days may not be used for the course project report and video, Lab 4, or quizzes. If you have special circumstances and need to take quizzes or present your course project at alternate times, please let us know in advance.
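
The late-day arithmetic above can be sketched in Go as follows. This is one reading of the policy, not an official calculator: the function name `latePenalty` and the constants are hypothetical, and the sketch assumes that lateness is counted in whole days, that remaining late days are applied automatically before any penalty, and that an assignment more than 5 days late receives the 100% penalty without consuming late days. It does not model the items for which late days cannot be used (project report and video, Lab 4, quizzes) or extensions granted by the staff.

```go
package main

import "fmt"

const (
	penaltyPerDay = 10 // percent deducted per late day not covered by a late day
	maxDaysLate   = 5  // beyond this, the assignment receives a 100% penalty
)

// latePenalty returns the percent deducted from a single assignment and the
// number of late days the student has left afterwards.
// Assumptions (not stated in the policy): late days are spent automatically
// and first, and no late days are consumed when the 100% penalty applies.
func latePenalty(daysLate, lateDaysLeft int) (penalty, remaining int) {
	if daysLate <= 0 {
		return 0, lateDaysLeft // on time: nothing to deduct
	}
	if daysLate > maxDaysLate {
		return 100, lateDaysLeft // too late to receive credit
	}
	covered := daysLate
	if covered > lateDaysLeft {
		covered = lateDaysLeft
	}
	uncovered := daysLate - covered
	return uncovered * penaltyPerDay, lateDaysLeft - covered
}

func main() {
	// A student with 1 late day left submits a lab 3 days late.
	penalty, remaining := latePenalty(3, 1)
	fmt.Printf("penalty: %d%%, late days left: %d\n", penalty, remaining)
}
```

Under these assumptions, a student with one late day left who submits three days late uses that late day for the first day and loses 10% for each of the other two days.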

Please don't hesitate to reach out to the course staff if you are struggling; we are generally happy to offer extensions with a note from S3 or GradSupport.

Collaboration Policy

In line with MIT’s policy on Academic Integrity, here are our expectations regarding collaboration and sharing of work. We allow teams of 2 for labs (i.e., coding assignments). Because labs build on each other, we expect teams to remain the same throughout the semester. Reach out to the course staff if you wish to change teammates in the middle of the semester.
You are allowed to discuss your general ideas and approach with other students, but you are expected to write your own code and solutions. Here are some examples of things you should not do with anyone except your project teammate:

  • Let another student compare their solution with your solution to find a bug or problem
  • Email or share your code with another student
  • Share your GitHub repo with another student
We will use software to detect copying of lab and homework assignments.
Problem sets are individual only.



[Office Hours and Piazza]
Instructors and TAs will hold office hours regularly according to the posted schedule. If you are unable to attend, please contact the instructor; we are happy to make accommodations for special circumstances. Students are encouraged to check Piazza and ask questions there instead of waiting for office hours. The course staff will check Piazza at regular intervals, but you should not expect real-time responses (please go to office hours for that!).

[Public Work]
Please do not make solutions to any of your 6.5830/1 labs or problem sets public. Sharing your solutions with others or posting them publicly is a violation of class rules on collaboration. Keep in mind that when work on a problem set or project is copied, both the provider and the consumer of copied materials are violating academic honesty standards.

[NEW: Generative AI]
GenAI-based coding tools have made huge progress since the inception of this course, and they can enhance productivity and facilitate learning when used responsibly. But it's easy to misuse them, and you are responsible for making sure you are using them in a way consistent with class rules.

In general, in settings where class rules allow you to use AI tools, you should not use any paid/premium AI services, with the exception of premium services that are available to everyone in this class (e.g., Google Gemini's free pro plan for students). For problem sets, you are allowed to ask AI tools general questions, but you are not permitted to ask them to solve specific problems from any of the problem sets. Examples of questions that are allowed:
  • How does the GROUP BY clause work in SQL?
  • How do I eliminate duplicates in a SQL query?
Examples of questions that are not allowed:
  • Solve this problem set for me: INSERT PROBLEM SET HERE
For labs, you are allowed to use AI tools as you see fit, but you are expected to understand the code you write and be able to explain it to the course staff. If you do a lot of vibecoding, you will likely not be able to do well on the labs. Why? In this class, you are expected to write low-level systems code incrementally in a large, complex codebase. Be warned that AI tools are known to be confident liars in this setting -- they will confidently suggest non-thread-safe code or create subtle inconsistencies across various files, which may lead to difficult bugs that the AI will not be able to resolve without human intervention. Heavy reliance on generated code is also likely to increase technical debt, making your code more brittle and difficult to change as the semester goes on. You are solely responsible for understanding your implementation and maintaining it. Our TAs are here to help you learn, not to debug code generated by an AI. When you come to office hours or post a question on Piazza, you must be able to explain the logic of your code and demonstrate sufficient understanding of your codebase. The course staff will NOT help you resolve issues resulting from bulk-generated AI code.

Finally, your grade is based on your understanding of the topics and the system, not just the software artifact as evaluated by the autograder. You should expect quiz questions that require you to reason about low-level implementation issues and resolve them in a pen-and-paper setting. You are unlikely to succeed on these questions if you did not develop a deep understanding of the system through the labs. The course staff also reserves the right to schedule "AI audits" with any student, where they will ask the student to describe their solution, explain certain logic, or make small modifications live. Inability to demonstrate sufficient understanding of your own work may result in a grade penalty, or in severe cases may be treated as an academic integrity violation.

Text

The course readings will primarily be drawn from the 5th Edition of "Readings in Database Systems", edited by Stonebraker and Hellerstein. It is available online at this website. Note that PDFs of all the papers in the book are not necessarily linked from the website; we will include PDFs in reading assignments.

Supplemental Readings

There will be several other readings that will be posted on the course web site.

Last change: February 02 2026.

Questions or comments regarding 6.5830/6.5831? Send e-mail to the 6.5830/6.5831 staff at .