Overview
|
The database group at MIT conducts
research on all areas of database
systems and information management.
Projects range from the design of new
user interfaces and query languages to
low-level query execution issues, and
from new systems for database analytics and
main-memory databases to query processing in
next-generation pervasive and ubiquitous
environments, such as sensor networks,
wide-area information systems, personal
databases, and the Web.
Professor Madden offers a class in Database Systems (6.830).
|
|
| Projects |
|
Intel Science and Technology Center in Big Data
In the Big Data ISTC, our mission is to produce new data management systems and compute architectures for Big Data. Together, these systems will help people process data that exceeds the scale, rate, or sophistication of current data processing systems. We are working to demonstrate the effectiveness of these solutions on real applications in science, engineering, and medicine, making our results broadly available through open sourcing.
|
|
|
CarTel
In CarTel, we are building a system for managing data in the face of intermittent and variable connectivity. We are focusing, in particular, on automotive applications that involve high-rate sensing of road, traffic, and infrastructure conditions. The two key technologies we are developing are CafNet, a carry-and-forward network stack, and a distributed, signal-oriented, priority-driven query processor.
|
|
|
RelationalCloud
In RelationalCloud, we are investigating research challenges to enable Database-as-a-Service (DaaS) within the Cloud Computing paradigm. In particular, we are focusing on the problems of (i) characterizing workloads and assigning them to different data management solutions (ranging from multi-tenant databases to high-profile clustered main-memory solutions) and (ii) highly dynamic allocation of resources to accommodate evolving and bursty workloads in a transparent manner. Our long-term vision aims at combining multiple dedicated data management solutions behind a unifying DaaS interface: "One Data Service to manage them all".
|
|
|
H-Store
The goal of the H-Store project is to investigate how recent
architectural and application trends affect the performance of online
transaction processing databases (such as those that back many
e-commerce sites, banks and reservation systems), and to study what
performance benefits would be possible with a complete redesign of
OLTP systems in light of these trends. Our idea is to build a main
memory system with a dramatically simplified concurrency control and
recovery model, with the goal of executing many times as many
transactions per second as existing databases that rely on logging,
expensive locking-based concurrency control, and disk-based recovery.
Our early results show that a simple prototype built from scratch
using modern assumptions can outperform current commercial DBMS
offerings by around a factor of 80 on OLTP workloads. We are currently
working to build a full-featured system that demonstrates these
performance wins in a more robust prototype.
|
|
|
Qurk
Qurk is a database that answers queries using people. Crowdsourcing platforms
such as Amazon's Mechanical Turk make it possible to organize crowd
workers to perform tasks like translation or image labelling on
demand. Building these workflows is challenging: how much should you
pay crowd workers? Can you trust the output of each worker? How can
you coordinate workers to perform complicated high-level tasks? Qurk
helps you build crowd-powered data processing workflows using a
PIG-like language while tackling these challenges on your behalf.
|
|
|
StatusQuo
StatusQuo is a new programming system for developing database
applications. Programmers often go to great lengths to make their applications perform well, such as using
stored procedures or rewriting their applications into map/reduce tasks or custom query languages.
StatusQuo frees programmers from doing any of that.
By leveraging program analysis techniques, the system optimizes applications and
makes them perform well. You can now write code as inefficient as you like and
StatusQuo will automatically handle the rest for you.
|
|
| |
| Past Projects |
|
C-Store
C-Store is a
read-optimized relational DBMS that contrasts sharply with most
current systems, which are write-optimized. Among the many differences
in its design are: storage of data by column rather than by row,
careful coding and packing of objects into storage including main
memory during query processing, storing an overlapping collection of
column-oriented projections, rather than the current fare of tables
and indexes, a non-traditional implementation of transactions which
includes high availability and snapshot isolation for read-only
transactions, and the extensive use of bitmap indexes to complement
B-tree structures.
|
|
|
WaveScope
WaveScope is a software
platform to make it easy to develop, deploy, and operate wireless
sensor networks that exhibit high data rates. In contrast to the
"first generation" of wireless sensor networks that are characterized
by relatively low sensor sampling rates, there are several important
emerging applications in which high rates of hundreds to tens of
thousands of sensor samples per second are common. These include civil
and structural engineering applications, including continuous
monitoring of physical structures, industrial equipment, and fluid
pipelines; "smart space" applications that continuously monitor
sensors in a space to support ubiquitous computing or security
applications; and scientific data gathering applications, such as
outdoor acoustic monitoring systems for continuous habitat monitoring.
|
|
|
MACAQUE
This is an NSF-funded project to
investigate the management of
uncertainty in database systems. We
are looking at probabilistic models
and approximate query processing
techniques in a variety of real world
settings.
|
|
|
Haystack: The universal information client
Haystack is a tool designed to let
every individual manage all of their
information in the way that makes the
most sense to them. By removing the
arbitrary barriers created by
applications only handling certain
information "types", and recording
only a fixed set of relationships
defined by the developer, it aims to
let users define whichever
arrangements of, connections between,
and views of information they find
most effective.
|
|
|
People
|
Faculty
Administrative Assistant
Ph.D.
- Leilani Battle
- Anant Bhardwaj
- Alvin Cheung
- Yuan Mei
- Alex Pagan
- Rebecca Taft
- Stephen Tu
- Manasi Vartak
- Eugene Wu
Research Staff
- Lewis Girod
- Todd Mostak
- Nesime Tatbul (ISTC Researcher and Visiting Researcher)
Postdoc
M.Eng
|
Alumni
|
|
Recent and Selected Publications
- Barzan Mozafari, Carlo Curino, Samuel Madden.
Resource and Performance Prediction for Building a Next Generation Database Cloud.
In Proceedings of CIDR, 2013.
[PDF]
- Eugene Wu, Samuel Madden.
Explanatory Lineage.
In Proceedings of CIDR, 2013.
[PDF]
- Alvin Cheung, Owen Arden, Samuel Madden, Andrew Myers, Armando Solar-Lezama.
StatusQuo: Making Familiar Abstractions Perform Using Program Analysis.
In Proceedings of CIDR, 2013.
[PDF]
- Stephen Tu, Frans Kaashoek, Nickolai Zeldovich, Samuel Madden.
Processing Analytical Queries over Encrypted Data.
In Proceedings of VLDB, 2013.
[PDF]
- Sameer Agarwal, Barzan Mozafari, Aurojit Panda, Henry Milner, Samuel Madden, Ion Stoica.
BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data.
In Proceedings of EuroSys, 2013.
[PDF]
- Eugene Wu, Samuel Madden, Michael Stonebraker.
SubZero: a Fine-Grained Lineage System for Scientific Databases.
In Proceedings of ICDE, 2013.
[PDF]
- Alvin Cheung, Armando Solar-Lezama, Samuel Madden.
Using Program Synthesis for Social Recommendations.
In Proceedings of CIKM, 2012.
[PDF]
- Haystack Publications.
- Medusa Publications.
- Piotr's Publications.
|
|
|