Main Page
The Relational Cloud project is an MIT-based effort to investigate the technologies and challenges of Database-as-a-Service in cloud computing.
Involved Researchers
- Carlo A. Curino (Post-Doc) [1]
- Evan Jones (Ph.D. Student) [2]
- Sam Madden (Prof. and PI) [3]
- Hari Balakrishnan (Prof. and PI) [4]
- Yang Zhang (Ph.D. Student) [5]
Vision
The advent of cloud computing and hosted software as a service is creating a novel market for data management. Cloud-based DB services are starting to appear and have the potential to attract customers from very diverse sectors of the market, from small businesses aiming to reduce their total cost of ownership (perfectly suited to multi-tenancy solutions) to very large enterprises seeking high-profile solutions that span potentially thousands of machines and can absorb unexpected bursts of traffic. At the same time, we are witnessing traditional DBMS giants being outshone in most common tasks by novel, dedicated data management engines tailored to specific workload classes. This increase in the number of available data management products further complicates the process of choosing and deploying in-house solutions, which favors database-as-a-service (DaaS) approaches that hide most of the complexity behind simple interfaces and clear service agreements.
Cloud-based providers can thus leverage several strong selling points, including lower total cost of ownership, zero configuration, quality-of-service guarantees, and transparent scalability and elasticity, reviving the lost data management dream of "one size fits all". To achieve this, DaaS providers must harness the many technological advances in data management by efficiently exploiting multiple DBMS engines (targeting different types of users) in a self-balancing solution that optimizes the assignment of the resources of large data centers to potentially thousands of users with very diverse needs.
From the user's perspective, DaaS means: (i) predictable costs, proportional to the quality of service and actual workloads, (ii) lower technical complexity, thanks to a unified and simplified service access interface, and (iii) virtually infinite resources ready at hand.
On the other side, the provider has the twofold goal of (i) guaranteeing the "illusion of infinite resources" by continuously meeting user expectations (i.e., the SLA requirements) under evolving workloads, and (ii) minimizing the operational costs associated with each user. To this end, what used to be a problem of provisioning becomes mainly an optimization problem, in which a large user base, multiple DBMS engines, and a large data center provide an unprecedented opportunity to exploit economies of scale, smart load balancing, and principled overselling. Furthermore, by combining complementary workloads from a very large pool of tenants, the service provider can fully exploit the available computing power, reducing operational costs and increasing the power efficiency of data centers.
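As a rough illustration of why combining complementary workloads can reduce the number of machines needed, the sketch below packs tenants onto servers by checking the combined load in each time slot rather than each tenant's individual peak. It is not the project's actual algorithm: the greedy first-fit strategy, the function names `fits` and `consolidate`, and the tenant load profiles are all assumptions made for this example.

```python
from typing import List

# Load per time slot, expressed as a fraction of one server's capacity.
Profile = List[float]


def fits(server_load: Profile, tenant: Profile, capacity: float) -> bool:
    """Return True if adding the tenant keeps every time slot within capacity."""
    return all(s + t <= capacity for s, t in zip(server_load, tenant))


def consolidate(tenants: List[Profile], capacity: float = 1.0) -> List[List[int]]:
    """Greedy first-fit placement (an illustrative heuristic, not an optimal one).

    Each tenant goes on the first server whose combined per-slot load stays
    within capacity; if none qualifies, a new server is opened.
    Returns, for each server, the indices of the tenants placed on it.
    """
    loads: List[Profile] = []
    placement: List[List[int]] = []
    for idx, tenant in enumerate(tenants):
        for load, members in zip(loads, placement):
            if fits(load, tenant, capacity):
                for slot, t in enumerate(tenant):
                    load[slot] += t
                members.append(idx)
                break
        else:
            # No existing server can host this tenant: open a new one.
            loads.append(list(tenant))
            placement.append([idx])
    return placement


# Hypothetical tenants with two time slots (day, night).
tenants = [
    [0.75, 0.25],  # day-heavy
    [0.75, 0.25],  # day-heavy
    [0.25, 0.75],  # night-heavy
    [0.25, 0.75],  # night-heavy
]
placement = consolidate(tenants)
print(placement)  # [[0, 2], [1, 3]]
```

In this made-up scenario, provisioning by individual peak (0.75 per tenant) would allow only one tenant per server, or four servers in total, whereas pairing each day-heavy tenant with a night-heavy one needs only two.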
Publications
- 2010: "Schism: a Workload-Driven Approach to Database Replication and Partitioning", Carlo Curino, Yang Zhang, Evan Jones, Sam Madden, accepted for publication in the Proceedings of the VLDB Endowment (PVLDB)
- 2010: Talk at NEDBSummit 2010 (http://db.csail.mit.edu/nedbday10/)
