Educational big data : predictions, applications and challenges
- Bai, Xiaomei, Zhang, Fuli, Li, Jinzhou, Guo, Teng, Xia, Feng
- Authors: Bai, Xiaomei , Zhang, Fuli , Li, Jinzhou , Guo, Teng , Xia, Feng
- Date: 2021
- Type: Text , Journal article , Review
- Relation: Big Data Research Vol. 26, no. (2021), p.
- Description: Educational big data is becoming a strategic educational asset, exceptionally significant in advancing educational reform. The term educational big data stems from the rapidly growing educational data development, including students' inherent attributes, learning behavior, and psychological state. Educational big data has many applications that can be used for educational administration, teaching innovation, and research management. The representative examples of such applications are student academic performance prediction, employment recommendation, and financial support for low-income students. Different empirical studies have shown that it is possible to predict student performance in the courses during the next term. Predictive research for the higher education stage has become an attractive area of study since it allowed us to predict student behavior. In this survey, we will review predictive research, its applications, and its challenges. We first introduce the significance and background of educational big data. Second, we review the students' academic performance prediction research, such as factors influencing students' academic performance, predicting models, evaluating indices. Third, we introduce the applications of educational big data such as prediction, recommendation, and evaluation. Finally, we investigate challenging research issues in this area. This discussion aims to provide a comprehensive overview of educational big data. © 2021 Elsevier Inc. **Please note that there are multiple authors for this article therefore only the name of the first 5 including Federation University Australia affiliate “Feng Xia” is provided in this record**
Local contrast as an effective means to robust clustering against varying densities
- Chen, Bo, Ting, Kaiming, Washio, Takashi, Zhu, Ye
- Authors: Chen, Bo , Ting, Kaiming , Washio, Takashi , Zhu, Ye
- Date: 2018
- Type: Text , Journal article
- Relation: Machine Learning Vol. 107, no. 8-10 (2018), p. 1621-1645
- Description: Most density-based clustering methods have difficulties detecting clusters of hugely different densities in a dataset. A recent density-based clustering CFSFDP appears to have mitigated the issue. However, through formalising the condition under which it fails, we reveal that CFSFDP still has the same issue. To address this issue, we propose a new measure called Local Contrast, as an alternative to density, to find cluster centers and detect clusters. We then apply Local Contrast to CFSFDP, and create a new clustering method called LC-CFSFDP which is robust in the presence of varying densities. Our empirical evaluation shows that LC-CFSFDP outperforms CFSFDP and three other state-of-the-art variants of CFSFDP. © 2018, The Author(s).
A survey on context awareness in big data analytics for business applications
- Dinh, Loan, Karmakar, Gour, Kamruzzaman, Joarder
- Authors: Dinh, Loan , Karmakar, Gour , Kamruzzaman, Joarder
- Date: 2020
- Type: Text , Journal article
- Relation: Knowledge and Information Systems Vol. 62, no. 9 (2020), p. 3387-3415
- Description: The concept of context awareness has been in existence since the 1990s. Though initially applied exclusively in computer science, over time it has increasingly been adopted by many different application domains such as business, health and military. Contexts change continuously because of objective reasons, such as economic situation, political matter and social issues. The adoption of big data analytics by businesses is facilitating such change at an even faster rate in much complicated ways. The potential benefits of embedding contextual information into an application are already evidenced by the improved outcomes of the existing context-aware methods in those applications. Since big data is growing very rapidly, context awareness in big data analytics has become more important and timely because of its proven efficiency in big data understanding and preparation, contributing to extracting the more and accurate value of big data. Many surveys have been published on context-based methods such as context modelling and reasoning, workflow adaptations, computational intelligence techniques and mobile ubiquitous systems. However, to our knowledge, no survey of context-aware methods on big data analytics for business applications supported by enterprise level software has been published to date. To bridge this research gap, in this paper first, we present a definition of context, its modelling and evaluation techniques, and highlight the importance of contextual information for big data analytics. Second, the works in three key business application areas that are context-aware and/or exploit big data analytics have been thoroughly reviewed. Finally, the paper concludes by highlighting a number of contemporary research challenges, including issues concerning modelling, managing and applying business contexts to big data analytics. © 2020, Springer-Verlag London Ltd., part of Springer Nature.
Intelligent energy prediction techniques for fog computing networks
- Farooq, Umar, Shabir, Muhammad, Javed, Muhammad, Imran, Muhammad
- Authors: Farooq, Umar , Shabir, Muhammad , Javed, Muhammad , Imran, Muhammad
- Date: 2021
- Type: Text , Journal article
- Relation: Applied Soft Computing Vol. 111, no. (2021), p.
- Description: Energy Efficiency is a key concern for future fog-enabled Internet of Things (IoT). Since Fog Nodes (FNs) are energy-constrained devices, task offloading techniques must consider the energy consumption of the FNs to maximize the performance of IoT applications. In this context, accurate energy prediction can enable the development of intelligent energy-aware task offloading techniques. In this paper, we present two energy prediction techniques, the first one is based on the Recursive Least Square (RLS) filter and the second one uses the Artificial Neural Network (ANN). Both techniques use inputs such as the number of tasks and size of the tasks to predict the energy consumption at different fog nodes. Simulation results show that both techniques have a root mean square error of less than 3%. However, the ANN-based technique shows up to 20% less root mean square error as compared to the RLS-based technique. © 2021 Elsevier B.V.
The gene of scientific success
- Kong, Xiangjie, Zhang, Jun, Zhang, Da, Bu, Yi, Ding, Ying, Xia, Feng
- Authors: Kong, Xiangjie , Zhang, Jun , Zhang, Da , Bu, Yi , Ding, Ying , Xia, Feng
- Date: 2020
- Type: Text , Journal article
- Relation: ACM Transactions on Knowledge Discovery from Data Vol. 14, no. 4 (2020), p.
- Description: This article elaborates how to identify and evaluate causal factors to improve scientific impact. Currently, analyzing scientific impact can be beneficial to various academic activities including funding application, mentor recommendation, discovering potential cooperators, and the like. It is universally acknowledged that high-impact scholars often have more opportunities to receive awards as an encouragement for their hard work. Therefore, scholars spend great efforts in making scientific achievements and improving scientific impact during their academic life. However, what are the determinate factors that control scholars' academic success? The answer to this question can help scholars conduct their research more efficiently. Under this consideration, our article presents and analyzes the causal factors that are crucial for scholars' academic success. We first propose five major factors including article-centered factors, author-centered factors, venue-centered factors, institution-centered factors, and temporal factors. Then, we apply recent advanced machine learning algorithms and jackknife method to assess the importance of each causal factor. Our empirical results show that author-centered and article-centered factors have the highest relevancy to scholars' future success in the computer science area. Additionally, we discover an interesting phenomenon that the h-index of scholars within the same institution or university are actually very close to each other. © 2020 ACM.
Tracing the Pace of COVID-19 research : topic modeling and evolution
- Liu, Jiaying, Nie, Hansong, Li, Shihao, Ren, Jing, Xia, Feng
- Authors: Liu, Jiaying , Nie, Hansong , Li, Shihao , Ren, Jing , Xia, Feng
- Date: 2021
- Type: Text , Journal article
- Relation: Big Data Research Vol. 25, no. (2021), p.
- Description: COVID-19 has been spreading rapidly around the world. With the growing attention on the deadly pandemic, discussions and research on COVID-19 are rapidly increasing to exchange latest findings with the hope to accelerate the pace of finding a cure. As a branch of information technology, artificial intelligence (AI) has greatly expedited the development of human society. In this paper, we investigate and visualize the on-going advancements of early scientific research on COVID-19 from the perspective of AI. By adopting the Latent Dirichlet Allocation (LDA) model, this paper allocates the research articles into 50 key research topics pertinent to COVID-19 according to their abstracts. We present an overview of early studies of the COVID-19 crisis at different scales including referencing/citation behavior, topic variation and their inner interactions. We also identify innovative papers that are regarded as the cornerstones in the development of COVID-19 research. The results unveil the focus of scientific research, thereby giving deep insights into how the academic society contributes to combating the COVID-19 pandemic. © 2021 Elsevier Inc. **Please note that there are multiple authors for this article therefore only the name of the first 5 including Federation University Australia affiliate “Jing Ren and Feng Xia” is provided in this record**
Deep Reinforcement Learning for Vehicular Edge Computing: An Intelligent Offloading System
- Ning, Zhaolong, Dong, Peiran, Wang, Xiaojie, Rodrigues, Joel, Xia, Feng
- Authors: Ning, Zhaolong , Dong, Peiran , Wang, Xiaojie , Rodrigues, Joel , Xia, Feng
- Date: 2019
- Type: Text , Journal article
- Relation: ACM Transactions on Intelligent Systems and Technology Vol. 10, no. 6 (Dec 2019), p. 24
- Description: The development of smart vehicles brings drivers and passengers a comfortable and safe environment. Various emerging applications are promising to enrich users' traveling experiences and daily life. However, how to execute computing-intensive applications on resource-constrained vehicles still faces huge challenges. In this article, we construct an intelligent offloading system for vehicular edge computing by leveraging deep reinforcement learning. First, both the communication and computation states are modelled by finite Markov chains. Moreover, the task scheduling and resource allocation strategy is formulated as a joint optimization problem to maximize users' Quality of Experience (QoE). Due to its complexity, the original problem is further divided into two sub-optimization problems. A two-sided matching scheme and a deep reinforcement learning approach are developed to schedule offloading requests and allocate network resources, respectively. Performance evaluations illustrate the effectiveness and superiority of our constructed system.
Integrated generalized zero-shot learning for fine-grained classification
- Shermin, Tasfia, Teng, Shyh, Sohel, Ferdous, Murshed, Manzur, Lu, Guojun
- Authors: Shermin, Tasfia , Teng, Shyh , Sohel, Ferdous , Murshed, Manzur , Lu, Guojun
- Date: 2022
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 122, no. (2022), p.
- Description: Embedding learning (EL) and feature synthesizing (FS) are two of the popular categories of fine-grained GZSL methods. EL or FS using global features cannot discriminate fine details in the absence of local features. On the other hand, EL or FS methods exploiting local features either neglect direct attribute guidance or global information. Consequently, neither method performs well. In this paper, we propose to explore global and direct attribute-supervised local visual features for both EL and FS categories in an integrated manner for fine-grained GZSL. The proposed integrated network has an EL sub-network and a FS sub-network. Consequently, the proposed integrated network can be tested in two ways. We propose a novel two-step dense attention mechanism to discover attribute-guided local visual features. We introduce new mutual learning between the sub-networks to exploit mutually beneficial information for optimization. Moreover, we propose to compute source-target class similarity based on mutual information and transfer-learn the target classes to reduce bias towards the source domain during testing. We demonstrate that our proposed method outperforms contemporary methods on benchmark datasets. © 2021 Elsevier Ltd
Scholar2vec : vector representation of scholars for lifetime collaborator prediction
- Wang, Wei, Xia, Feng, Wu, Jian, Gong, Zhiguo, Tong, Hanghang, Davison, Brian
- Authors: Wang, Wei , Xia, Feng , Wu, Jian , Gong, Zhiguo , Tong, Hanghang , Davison, Brian
- Date: 2021
- Type: Text , Journal article
- Relation: ACM Transactions on Knowledge Discovery from Data Vol. 15, no. 3 (2021), p.
- Description: While scientific collaboration is critical for a scholar, some collaborators can be more significant than others, e.g., lifetime collaborators. It has been shown that lifetime collaborators are more influential on a scholar's academic performance. However, little research has been done on investigating predicting such special relationships in academic networks. To this end, we propose Scholar2vec, a novel neural network embedding for representing scholar profiles. First, our approach creates scholars' research interest vector from textual information, such as demographics, research, and influence. After bridging research interests with a collaboration network, vector representations of scholars can be gained with graph learning. Meanwhile, since scholars are occupied with various attributes, we propose to incorporate four types of scholar attributes for learning scholar vectors. Finally, the early-stage similarity sequence based on Scholar2vec is used to predict lifetime collaborators with machine learning methods. Extensive experiments on two real-world datasets show that Scholar2vec outperforms state-of-the-art methods in lifetime collaborator prediction. Our work presents a new way to measure the similarity between two scholars by vector representation, which tackles the knowledge between network embedding and academic relationship mining. © 2021 Association for Computing Machinery.
How to optimize an academic team when the outlier member is leaving?
- Yu, Shuo, Liu, Jiaying, Wei, Haoran, Xia, Feng, Tong, Hanghang
- Authors: Yu, Shuo , Liu, Jiaying , Wei, Haoran , Xia, Feng , Tong, Hanghang
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Intelligent Systems Vol. 36, no. 3 (May-Jun 2021), p. 23-30
- Description: An academic team is a highly cohesive collaboration group of scholars, which has been recognized as an effective way to improve scientific output in terms of both quality and quantity. However, the high staff turnover brings about a series of problems that may have negative influences on team performance. To address this challenge, we first detect the tendency of the member who may potentially leave. Here, the outlierness is defined with respect to familiarity, which is quantified by using collaboration intensity. It is assumed that if a team member has a higher familiarity with scholars outside the team, then this member might probably leave the team. To minimize the influence caused by the leaving of such an outlier member, we propose an optimization solution to find a proper candidate who can replace the outlier member. Based on random walk with graph kernel, our solution involves familiarity matching, skill matching, as well as structure matching. The proposed approach proves to be effective and outperforms existing methods when applied to computer science academic teams.
Data-driven computational social science : A survey
- Zhang, Jun, Wang, Wei, Xia, Feng, Lin, Yu-Ru, Tong, Hanghang
- Authors: Zhang, Jun , Wang, Wei , Xia, Feng , Lin, Yu-Ru , Tong, Hanghang
- Date: 2020
- Type: Text , Journal article
- Relation: Big Data Research Vol. 21, no. (2020), p. 1-22
- Full Text:
- Reviewed:
- Description: Social science concerns issues on individuals, relationships, and the whole society. The complexity of research topics in social science makes it an amalgamation of multiple disciplines, such as economics, political science, and sociology. For centuries, scientists have conducted many studies to understand the mechanisms of society. However, due to the limitations of traditional research methods, many critical social issues remain to be explored. To address those issues, computational social science has emerged from the rapid advancement of computation technologies and profound studies on social science. With the aid of advanced research techniques, various kinds of data from diverse areas can now be acquired, and they can help us look at social problems with fresh eyes. As a result, utilizing various data to reveal issues in the computational social science area has attracted increasing attention. In this paper, we present what is, to the best of our knowledge, the first survey on data-driven computational social science, which primarily focuses on reviewing application domains involving human dynamics. The state-of-the-art research on human dynamics is reviewed from three aspects: individuals, relationships, and collectives. Specifically, the research methodologies used to address research challenges in the aforementioned application domains are summarized. In addition, some important open challenges with respect to both emerging research topics and research methods are discussed.