Domain problem-solving expert identification in community question answering
Weizhao Tang
School of Computer Science, Fudan University, Shanghai, China
Shanghai Key Laboratory of Data Science, Shanghai, China
Shanghai Institute of Intelligent Electronics & Systems, Shanghai, China
Corresponding Author
Tun Lu
School of Computer Science, Fudan University, Shanghai, China
Shanghai Key Laboratory of Data Science, Shanghai, China
Shanghai Institute of Intelligent Electronics & Systems, Shanghai, China
Correspondence
Tun Lu, School of Computer Science, Fudan University, Shanghai, 200433, China.
Email: [email protected]
Hansu Gu, Microsoft Inc., Seattle, WA, 98052.
Email: [email protected]
Corresponding Author
Hansu Gu
Microsoft Inc., Seattle, Washington, USA
Peng Zhang
School of Computer Science, Fudan University, Shanghai, China
Shanghai Key Laboratory of Data Science, Shanghai, China
Shanghai Institute of Intelligent Electronics & Systems, Shanghai, China
Ning Gu
School of Computer Science, Fudan University, Shanghai, China
Shanghai Key Laboratory of Data Science, Shanghai, China
Shanghai Institute of Intelligent Electronics & Systems, Shanghai, China
Funding information: National Natural Science Foundation of China, Grant/Award Numbers: 61902075, 61932007
Abstract
Question-answering (Q&A) services give internet users platforms for exchanging knowledge and ideas. The growth of Q&A sites, also known as community question answering (CQA), depends mainly on high-quality content contributed continuously by users with a high level of expertise, who can be recognized as experts. Expert finding is therefore an important task for the authorities of Q&A communities seeking to encourage commitment. In a highly competitive market, CQA managers must take measures to retain and nurture users, especially superior contributors. However, the expertise scoring techniques currently adopted in CQA often give too much credit to very active users and fail to identify real experts. This study aims to develop a robust and practical expert identification framework for Q&A communities by combining a well-designed expertise scoring technique with a probabilistic clustering model. For expert identification, a numerical metric of users' expertise is developed as the optimal expert finding strategy, and a clustering algorithm based on a Gaussian-Gamma mixture model (GGMM) is proposed to efficiently distinguish experts from nonexperts. In the experiments, the proposed method is applied to real-world datasets collected from subcommunities of the Stack Exchange Q&A network. Results from comparative experiments show that our method outperforms state-of-the-art methods and demonstrate the effectiveness of the proposed framework. The analysis shows that the framework, which combines the proposed expertise scoring technique with the Gaussian-Gamma mixture clustering model, is capable of detecting excellent domain problem-solving experts who exhibit both domain interest and expertise.
CONFLICTS OF INTEREST
None.