A survey of methods for encrypted traffic classification and analysis
Corresponding Author
Petr Velan
Institute of Computer Science, Masaryk University, Brno, Czech Republic
Correspondence to: Petr Velan, Institute of Computer Science, Masaryk University, Botanická 68a, Brno, Czech Republic.
E-mail: [email protected]
Milan Čermák
Institute of Computer Science, Masaryk University, Brno, Czech Republic
Pavel Čeleda
Institute of Computer Science, Masaryk University, Brno, Czech Republic
Martin Drašar
Institute of Computer Science, Masaryk University, Brno, Czech Republic
Summary
Encrypted data transport is now in widespread use, and network traffic encryption is becoming standard. This presents a challenge for traffic measurement, especially for analysis and anomaly detection methods, which depend on the type of network traffic. In this paper, we survey existing approaches for classification and analysis of encrypted traffic. First, we describe the most widespread encryption protocols used throughout the Internet. We show that the initiation of an encrypted connection and the protocol structure give away much information that can be used for encrypted traffic classification and analysis. Then, we survey payload-based and feature-based classification methods for encrypted traffic and categorize them using an established taxonomy. An advantage of some of the described classification methods is the ability to recognize the encrypted application protocol in addition to the encryption protocol. Finally, we make a comprehensive comparison of the surveyed feature-based classification methods and present their weaknesses and strengths. Copyright © 2015 John Wiley & Sons, Ltd.
Special Issue: Measure, Detect and Mitigate - Challenges and Trends in Network Security
September/October 2015
Pages 355-374