Main Page

The Relational Cloud project is an MIT-based effort to investigate the technologies and challenges of offering Database-as-a-Service in cloud computing. The project is supported by NSF Grant 1065219 ("A Scalable and Secure Database Service") and by Quanta Computer as part of the T-Party Project.

Currently Involved Researchers

  • Carlo A. Curino (Post-Doc, now at Microsoft Research) [1]
  • Barzan Mozafari (Post-Doc) [2]
  • Evan Jones (Ph.D. Student, now at a startup) [3]
  • Raluca Popa (Ph.D. Student)
  • Eugene Wu (Ph.D. Student)
  • Sam Madden (Prof. and PI) [4]
  • Hari Balakrishnan (Prof. and PI) [5]
  • Nickolai Zeldovich (Prof. and PI)

Vision

The advent of cloud computing and hosted software as a service is creating a novel market for data management. Cloud-based DB services are starting to appear, and have the potential to attract customers from very diverse sectors of the market: from small businesses aiming to reduce their total cost of ownership (a perfect fit for multi-tenancy solutions) to very large enterprises seeking high-profile solutions that span potentially thousands of machines and can absorb unexpected bursts of traffic. At the same time, we are witnessing traditional DBMS giants being outshone in most common tasks by novel dedicated data management engines tailored to specific workload classes. This increase in the number of available data management products further complicates the process of choosing and deploying in-house solutions, which favors database-as-a-service (DaaS) approaches that hide most of the complexity behind simple interfaces and clear service agreements.

Cloud-based providers can thus leverage several strong selling points, including lower total cost of ownership, zero configuration, quality-of-service guarantees, and transparent scalability and elasticity, reviving the lost data management dream of "one size fits all". To achieve this, DaaS providers must harness the many technological advances in data management by efficiently exploiting multiple DBMS engines (targeting different types of users) in a self-balancing solution that optimizes the assignment of the resources of large data centers to potentially thousands of users with very diverse needs.

From the user's perspective, DaaS means: (i) predictable costs, proportional to the quality of service and the actual workload, (ii) lower technical complexity, thanks to a unified and simplified service access interface, and (iii) virtually infinite resources ready at hand.

On the other side, the provider has the twofold goal of (i) guaranteeing the "illusion of infinite resources" by continuously meeting user expectations (i.e., the SLA requirements) under evolving workloads, and (ii) minimizing the operational costs associated with each user. To this end, what used to be a provisioning problem becomes mainly an optimization problem, in which a large user base, multiple DBMS engines, and a large data center provide an unprecedented opportunity to exploit economies of scale, smart load balancing, and principled overselling. Furthermore, by combining complementary workloads from a very large pool of tenants, the service provider can make fuller use of the available computing power, reducing operational costs and increasing the power efficiency of data centers.

Publications

  • 2013: "Processing Analytical Queries over Encrypted Data." Stephen Tu, M. Frans Kaashoek, Samuel Madden, Nickolai Zeldovich. (VLDB)
  • 2013: "Resource and Performance Prediction for Building a Next Generation Database Cloud." Barzan Mozafari, Carlo Curino, Sam Madden. (CIDR)
  • 2013: "An Ideal-Security Protocol for Order-Preserving Encoding." Raluca Ada Popa, Frank H. Li, Nickolai Zeldovich. (Proceedings of the 34th IEEE Symposium on Security and Privacy, San Francisco, CA.)
  • 2012: "Lookup Tables: Fine-Grained Partitioning for Distributed Databases." Aubrey Tatarowicz, Carlo Curino, Evan Jones, Sam Madden. (ICDE)
  • 2012: "Language Support for Efficient Computation over Encrypted Data." Meelap Shah, Emily Stark, Raluca Ada Popa, Nickolai Zeldovich. (Off the Beaten Track Workshop: Underrepresented Problems for Programming Language Researchers.)
  • 2012: "CryptDB: Processing Queries on an Encrypted Database." Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, Hari Balakrishnan. (Communications of the ACM, 55(9):103–111.)
  • 2011: "CryptDB: Protecting Confidentiality with Encrypted Query Processing." Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, Hari Balakrishnan. (SOSP)
  • 2011: "Workload-aware Database Monitoring and Consolidation." Carlo Curino, Evan P. C. Jones, Sam Madden, Hari Balakrishnan. (SIGMOD)
  • 2011: "Relational Cloud: a Database Service for the cloud." Carlo Curino, Evan Jones, Raluca Popa, Nirmesh Malviya, Eugene Wu, Sam Madden, Hari Balakrishnan, Nickolai Zeldovich. (CIDR)
  • 2010: "Schism: a Workload-Driven Approach to Database Replication and Partitioning." Carlo Curino, Yang Zhang, Evan Jones, Sam Madden. (Proceedings of the VLDB Endowment, PVLDB)
  • 2010: Talk at NEDBSummit 2010 (http://db.csail.mit.edu/nedbday10/)

Resources
