Comparison of Architectures and Performance of Database Replication Systems

  1. Dhamane, Rohit
Supervised by:
  1. Marta Patiño Martínez (Director)

Defended at: Universidad Politécnica de Madrid

Defense date: 4 February 2016

Examination committee:
  1. Ernesto Jimenez Merino (Chair)
  2. Tonghong Li (Secretary)
  3. Luis Rodero Merino (Member)
  4. Mikel Larrea Álava (Member)
  5. Gorka Guardiola Múzquiz (Member)

Type: Thesis

Abstract

One of the most demanding needs in cloud computing and big data is scalable and highly available databases. One way to address these needs is to leverage the scalable replication techniques developed in the last decade, which increase both the availability and the scalability of databases. Many replication protocols have been proposed during this period. The main research challenge has been how to scale under the eager replication model, the one that provides consistency across replicas. This thesis provides an in-depth study of three eager database replication systems based on relational systems (Middle-R, C-JDBC and MySQL Cluster) and three systems based on In-Memory Data Grids (JBoss Data Grid, Oracle Coherence and Terracotta Ehcache). The thesis explores these systems in terms of their architecture, replication protocols, fault tolerance and various other functionalities. It also provides an experimental analysis of these systems using state-of-the-art benchmarks: TPC-C and TPC-W for the relational systems, and the Yahoo! Cloud Serving Benchmark for the In-Memory Data Grids. The thesis also discusses three graph databases, Neo4j, Titan and Sparksee, in terms of their architecture and transactional capabilities, and highlights the weaker transactional consistency that these systems provide. Finally, it discusses an implementation of snapshot isolation in the Neo4j graph database that provides stronger isolation guarantees for transactions.