Chapter 5

Scalability, Workloads, and Performance: Replication, Popularity, Modeling, and Geo-Distributed File Stores

Roy H. Campbell

Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA

Shadi A. Noghabi

Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA

Cristina L. Abad

Escuela Superior Politecnica del Litoral, ESPOL, Guayaquil, Ecuador

First published: 18 July 2018

Abstract

This chapter explores scalability problems in cloud computing systems. Scalability allows a cloud application to change in size, volume, or geographical distribution while still meeting the needs of the cloud customer. A practical approach to scaling cloud applications is to improve the availability of the application by replicating the resources and files it uses; this includes creating multiple copies of the application across many nodes in the cloud. Replication improves availability through the use of redundant resources, services, networks, file systems, and nodes, but it also creates problems with respect to clients' ability to observe consistency as they are served from the multiple copies. Variability in data sizes and volumes, and in the homogeneity and performance of the cloud components (disks, memory, networks, and processors), can affect scalability. Evaluating scalability is difficult, especially when there is a large degree of variability. This leads to the need to estimate how applications will scale on clouds based on probabilistic estimates of job load and performance. Scaling can have many different dimensions and properties. The emergence of low-latency worldwide services and the desire for higher fault tolerance and reliability have led to the design of geo-distributed storage with replicas in multiple locations. At the end of this chapter, we consider the scalability issues raised by geo-distributed cloud services and study scalable geo-distributed storage as a case example.
