Volume 28, Issue 2 e1991
RESEARCH ARTICLE

A service-agnostic method for predicting service metrics in real time

Rerngvit Yanggratoke

Corresponding Author

ACCESS Linnaeus Center, KTH Royal Institute of Technology, Stockholm, Sweden

Correspondence

Rerngvit Yanggratoke, ACCESS Linnaeus Center, KTH Royal Institute of Technology, Stockholm, Sweden.

Jawwad Ahmed

Ericsson Research, Kista, Sweden

John Ardelius

Swedish Institute of Computer Science (SICS), Stockholm, Sweden

Christofer Flinta

Ericsson Research, Kista, Sweden

Andreas Johnsson

Ericsson Research, Kista, Sweden

Daniel Gillblad

Swedish Institute of Computer Science (SICS), Stockholm, Sweden

Rolf Stadler

ACCESS Linnaeus Center, KTH Royal Institute of Technology, Stockholm, Sweden

Swedish Institute of Computer Science (SICS), Stockholm, Sweden

First published: 13 September 2017
Citations: 11

Summary

We predict performance metrics of cloud services using statistical learning, whereby the behaviour of a system is learned from observations. Specifically, we collect device and network statistics from a cloud testbed and apply regression methods to predict, in real time, client-side service metrics for video streaming and key-value store services. Results from intensive evaluation on our testbed indicate that our method accurately predicts service metrics in real time (for instance, mean absolute error below 16% for video frame rate and read latency). Further, our method is service agnostic in the sense that it takes as input operating-system and network statistics instead of service-specific metrics. We show that feature set reduction significantly improves the prediction accuracy in our case, while simultaneously reducing model computation time. We find that the prediction accuracy decreases when both services run on the same testbed simultaneously instead of a single service, or when the network quality on the path between the server cluster and the client deteriorates. Finally, we discuss the design and implementation of a real-time analytics engine, which processes streams of device statistics and service metrics from testbed sensors and produces model predictions through online learning.
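
To make the idea of predicting a service metric from device statistics through online learning more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a synthetic stream of three hypothetical device statistics (`cpu`, `mem`, `net`), a made-up linear relationship to a video frame rate, and a normalised least-mean-squares update as a stand-in for the unspecified regression method. The error is normalised by the mean of the observed metric so that it can be read as a percentage, which is also an assumption of this sketch.

```python
import random


def nmae(y_true, y_pred):
    """Mean absolute error divided by the mean of the observed values."""
    mean_y = sum(y_true) / len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae / mean_y


class OnlineLinearRegressor:
    """Linear model updated one sample at a time (normalised LMS)."""

    def __init__(self, n_features, step=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.step = step

    def predict(self, x):
        return self.b + sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, y):
        err = y - self.predict(x)
        # Normalise by the squared norm of the input (plus 1 for the bias).
        norm = 1.0 + sum(xi * xi for xi in x)
        gain = self.step * err / norm
        self.w = [wi + gain * xi for wi, xi in zip(self.w, x)]
        self.b += gain


def synthetic_stream(n, seed=0):
    """Hypothetical device statistics and a made-up frame-rate metric."""
    rng = random.Random(seed)
    for _ in range(n):
        cpu, mem, net = rng.random(), rng.random(), rng.random()
        # Assumed relationship: load and network use lower the frame rate.
        frame_rate = 24.0 - 10.0 * cpu - 4.0 * net + rng.gauss(0.0, 0.5)
        yield [cpu, mem, net], frame_rate


def run(n=5000, warmup=1000):
    """Predict each sample before learning from it; score after warm-up."""
    model = OnlineLinearRegressor(n_features=3)
    y_true, y_pred = [], []
    for i, (x, y) in enumerate(synthetic_stream(n)):
        if i >= warmup:
            y_pred.append(model.predict(x))
            y_true.append(y)
        model.update(x, y)
    return nmae(y_true, y_pred)


if __name__ == "__main__":
    print(f"NMAE after warm-up: {run():.3f}")
```

Each sample is predicted before the model is updated with it, so the reported error reflects predictions on unseen observations, as a real-time setting would require. The irrelevant `mem` feature is included only to hint at why feature set reduction can help; the sketch does not perform any reduction itself.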
