CenGCN : centralized convolutional networks with vertex imbalance for scale-free graphs
- Authors: Xia, Feng; Wang, Lei; Tang, Tao; Chen, Xin; Kong, Xiangjie; Oatley, Giles; King, Irwin
- Date: 2023
- Type: Text , Journal article
- Relation: IEEE Transactions on Knowledge and Data Engineering Vol. 35, no. 5 (2023), p. 4555-4569
- Full Text:
- Reviewed:
- Description: Graph Convolutional Networks (GCNs) have achieved impressive performance in a wide variety of areas, attracting considerable attention. The core step of GCNs is the information-passing framework that treats all information from neighbors to the central vertex as equally important. Such equal importance, however, is inadequate for scale-free networks, where hub vertices propagate more dominant information due to vertex imbalance. In this paper, we propose a novel centrality-based framework named CenGCN to address the inequality of information. This framework first quantifies the similarity between hub vertices and their neighbors by label propagation with hub vertices. Based on this similarity and centrality indices, the framework transforms the graph by increasing or decreasing the weights of edges connecting hub vertices and adding self-connections to vertices. In each non-output layer of the GCN, this framework uses a hub attention mechanism to assign new weights to connected non-hub vertices based on their common information with hub vertices. We present two variants, CenGCN_D and CenGCN_E, based on degree centrality and eigenvector centrality, respectively. We also conduct comprehensive experiments, including vertex classification, link prediction, vertex clustering, and network visualization. The results demonstrate that the two variants significantly outperform state-of-the-art baselines. © 1989-2012 IEEE.
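The CenGCN_E variant described above relies on eigenvector centrality to identify hub vertices. As a rough, self-contained illustration (not the paper's implementation), eigenvector centrality can be estimated by power iteration on the adjacency structure; the star graph below is a made-up example in which vertex 0 is the hub:

```python
import math

def eigenvector_centrality(adj, iterations=100, tol=1e-8):
    """Power-iteration estimate of eigenvector centrality.

    adj: dict mapping each vertex to the set of its neighbors.
    Iterating with A + I (the self-term x[v] below) avoids the
    oscillation plain power iteration exhibits on bipartite graphs
    and preserves the ranking of the principal eigenvector.
    """
    vertices = list(adj)
    x = {v: 1.0 for v in vertices}
    for _ in range(iterations):
        # Each vertex keeps its own score and accumulates its neighbors'.
        x_new = {v: x[v] + sum(x[u] for u in adj[v]) for v in vertices}
        norm = math.sqrt(sum(s * s for s in x_new.values())) or 1.0
        x_new = {v: s / norm for v, s in x_new.items()}
        if sum(abs(x_new[v] - x[v]) for v in vertices) < tol:
            return x_new
        x = x_new
    return x

# A star graph: vertex 0 is connected to everyone else, so it is the hub.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
scores = eigenvector_centrality(star)
hub = max(scores, key=scores.get)  # vertex 0
```

CenGCN would then treat vertices whose centrality exceeds some threshold as hubs and reweight their incident edges; that thresholding and reweighting logic is specific to the paper and not reproduced here.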
Deep learning : survey of environmental and camera impacts on internet of things images
- Authors: Kaur, Roopdeep; Karmakar, Gour; Xia, Feng; Imran, Muhammad
- Date: 2023
- Type: Text , Journal article
- Relation: Artificial Intelligence Review Vol. 56, no. 9 (2023), p. 9605-9638
- Full Text:
- Reviewed:
- Description: Internet of Things (IoT) images are attracting growing attention because of their wide range of applications, which require visual analysis to drive automation. However, IoT images are predominantly captured from outdoor environments and thus are inherently impacted by camera and environmental parameters, which can adversely affect the corresponding applications. Deep Learning (DL) has been widely adopted in the field of image processing and computer vision and can reduce the impact of these parameters on IoT images. Although many DL-based techniques are available in the current literature for analyzing and reducing the environmental and camera impacts on IoT images, to the best of our knowledge, no survey paper presents state-of-the-art DL-based approaches for this purpose. Motivated by this, for the first time, we present a Systematic Literature Review (SLR) of existing DL techniques available for analyzing and reducing environmental and camera lens impacts on IoT images. As part of this SLR, firstly, we reiterate and highlight the significance of IoT images in their respective applications. Secondly, we describe the DL techniques employed for assessing the environmental and camera lens distortion impacts on IoT images. Thirdly, we illustrate how DL can be effective in reducing the impact of environmental and camera lens distortion in IoT images. Finally, along with a critical reflection on the advantages and limitations of the techniques, we also present ways to address the research challenges of existing techniques and identify further research directions to advance the relevant research areas. © 2023, The Author(s).
Graph learning for anomaly analytics : algorithms, applications, and challenges
- Authors: Ren, Jing; Xia, Feng; Lee, Ivan; Noori Hoshyar, Azadeh; Aggarwal, Charu
- Date: 2023
- Type: Text , Journal article
- Relation: ACM Transactions on Intelligent Systems and Technology Vol. 14, no. 2 (2023), p.
- Full Text:
- Reviewed:
- Description: Anomaly analytics is a popular and vital task in various research contexts that has been studied for several decades. At the same time, deep learning has shown its capacity in solving many graph-based tasks, like node classification, link prediction, and graph classification. Recently, many studies have extended graph learning models to anomaly analytics problems, resulting in beneficial advances in graph-based anomaly analytics techniques. In this survey, we provide a comprehensive overview of graph learning methods for anomaly analytics tasks. We classify them into four categories based on their model architectures, namely graph convolutional network, graph attention network, graph autoencoder, and other graph learning models. The differences between these methods are also compared in a systematic manner. Furthermore, we outline several graph-based anomaly analytics applications across various domains in the real world. Finally, we discuss five potential future research directions in this rapidly growing field. © 2023 Association for Computing Machinery.
Knowledge graphs : opportunities and challenges
- Authors: Peng, Ciyuan; Xia, Feng; Naseriparsa, Mehdi; Osborne, Francesco
- Date: 2023
- Type: Text , Journal article
- Relation: Artificial Intelligence Review Vol. 56, no. 11 (2023), p. 13071-13102
- Full Text:
- Reviewed:
- Description: With the explosive growth of artificial intelligence (AI) and big data, it has become vitally important to organize and represent the enormous volume of knowledge appropriately. As graph data, knowledge graphs accumulate and convey knowledge of the real world. It has been well recognized that knowledge graphs effectively represent complex information; hence, they have rapidly gained the attention of academia and industry in recent years. Thus, to develop a deeper understanding of knowledge graphs, this paper presents a systematic overview of this field. Specifically, we focus on the opportunities and challenges of knowledge graphs. We first review the opportunities of knowledge graphs in terms of two aspects: (1) AI systems built upon knowledge graphs; (2) potential application fields of knowledge graphs. Then, we thoroughly discuss severe technical challenges in this field, such as knowledge graph embeddings, knowledge acquisition, knowledge graph completion, knowledge fusion, and knowledge reasoning. We expect that this survey will shed new light on future research and the development of knowledge graphs. © 2023, The Author(s).
Lost at starting line : predicting maladaptation of university freshmen based on educational big data
- Authors: Guo, Teng; Bai, Xiaomei; Zhen, Shihao; Abid, Shagufta; Xia, Feng
- Date: 2023
- Type: Text , Journal article
- Relation: Journal of the Association for Information Science and Technology Vol. 74, no. 1 (2023), p. 17-32
- Full Text:
- Reviewed:
- Description: The transition from secondary education to higher education can be challenging for most freshmen. For students who fail to adjust to university life smoothly, their status may worsen if the university cannot offer timely and proper guidance. Helping students adapt to university life is a long-term goal for any academic institution. Therefore, understanding the nature of the maladaptation phenomenon and the early prediction of "at-risk" students are crucial tasks that urgently need to be tackled effectively. This article aims to analyze the relevant factors that affect the maladaptation phenomenon and predict this phenomenon in advance. We develop a prediction framework (MAladaptive STudEnt pRediction, MASTER) for the early prediction of students with maladaptation. First, our framework uses the SMOTE (Synthetic Minority Oversampling Technique) algorithm to solve the data label imbalance issue. Moreover, a novel ensemble algorithm, priority forest, is proposed for outputting ranks instead of binary results, which enables us to perform proactive interventions in a prioritized manner where limited education resources are available. Experimental results on real-world education datasets demonstrate that the MASTER framework outperforms other state-of-the-art methods. © 2022 The Authors. Journal of the Association for Information Science and Technology published by Wiley Periodicals LLC on behalf of Association for Information Science and Technology.
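MASTER's first step applies SMOTE to rebalance the labels. A minimal sketch of the core SMOTE idea follows: synthesize new minority-class samples by interpolating each sample toward one of its nearest minority-class neighbors. The data and parameters here are made up for illustration, and a production pipeline would use a full implementation (e.g. imbalanced-learn) rather than this toy:

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Minimal SMOTE sketch.

    minority: list of feature vectors (lists of floats) from the
    minority class. For each of n_new synthetic points, pick a random
    minority sample, find its k nearest minority neighbors (Euclidean
    distance), and interpolate a random fraction of the way toward one.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbors = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nn = rng.choice(neighbors)
        gap = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nn)])
    return synthetic

# Hypothetical minority samples in the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=4)
```

Because each synthetic point is a convex combination of two minority samples, it always lies inside the region spanned by the minority class, which is what lets the downstream classifier see a balanced training set without fabricating outliers.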
MSCET : a multi-scenario offloading schedule for biomedical data processing and analysis in cloud-edge-terminal collaborative vehicular networks
- Authors: Ni, Zhichen; Chen, Honglong; Li, Zhe; Wang, Xiaomeng; Yan, Na; Liu, Weifeng; Xia, Feng
- Date: 2023
- Type: Text , Journal article
- Relation: IEEE/ACM Transactions on Computational Biology and Bioinformatics Vol. 20, no. 4 (2023), p. 2376-2386
- Full Text:
- Reviewed:
- Description: With the rapid development of Artificial Intelligence (AI) and the Internet of Things (IoT), an increasing number of computation-intensive or delay-sensitive biomedical data processing and analysis tasks are produced in vehicles, bringing more and more challenges to the biometric monitoring of drivers. Edge computing is a new paradigm to solve these challenges by offloading tasks from the resource-limited vehicles to Edge Servers (ESs) in Road Side Units (RSUs). However, most of the traditional offloading schedules for vehicular networks concentrate on the edge, while some tasks may be too complex for ESs to process. To this end, we consider a collaborative vehicular network in which the cloud, edge, and terminal can cooperate with each other to accomplish the tasks. The vehicles can offload the computation-intensive tasks to the cloud to save the resources of the edge. We further construct a virtual resource pool that can integrate the resources of multiple ESs, since some regions may be covered by multiple RSUs. In this paper, we propose a Multi-Scenario offloading schedule for biomedical data processing and analysis in Cloud-Edge-Terminal collaborative vehicular networks called MSCET. The parameters of the proposed MSCET are optimized to maximize the system utility. We also conduct extensive simulations to evaluate the proposed MSCET, and the results illustrate that MSCET outperforms other existing schedules. © 2004-2012 IEEE.
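The terminal/edge/cloud trade-off described above can be made concrete with a deliberately simplified cost model (this is not the paper's utility function; the function name, cost model, and all numbers below are hypothetical): each candidate location is scored by estimated transfer time plus estimated compute time, and the task goes to the cheapest one:

```python
def choose_offload_target(task_cycles, data_bits, options):
    """Pick the location with the lowest estimated completion time.

    options: dict name -> (cpu_hz, uplink_bps); uplink_bps=None means
    local execution with no data transfer. Simplified model: completion
    time = transfer time + compute time, ignoring queueing and results
    download.
    """
    def completion_time(cpu_hz, uplink_bps):
        transfer = 0.0 if uplink_bps is None else data_bits / uplink_bps
        return transfer + task_cycles / cpu_hz
    return min(options, key=lambda name: completion_time(*options[name]))

# Hypothetical capacities: the cloud is fastest but farthest away.
options = {
    "terminal": (1e9, None),   # 1 GHz on-board CPU, no transfer
    "edge":     (1e10, 1e8),   # 10 GHz ES, 100 Mbps RSU uplink
    "cloud":    (1e11, 1e7),   # 100 GHz cloud, 10 Mbps backhaul
}
# A compute-heavy task with a small payload favours offloading
# despite the transfer delay (5 s local vs 0.51 s edge vs 0.15 s cloud).
target = choose_offload_target(task_cycles=5e9, data_bits=1e6, options=options)
```

Changing `data_bits` upward shifts the optimum back toward the edge or terminal, which is exactly the multi-scenario tension the MSCET schedule is designed to balance.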
A reliable image quality assessment metric : evaluation using camera impacts
- Authors: Kaur, Roopdeep; Karmakar, Gour; Xia, Feng
- Date: 2022
- Type: Text , Journal article
- Relation: Pattern Recognition and Image Analysis Vol. 32, no. 3 (2022), p. 551-560
- Full Text:
- Reviewed:
- Description: Image analysis is being applied in many applications, including industrial automation with the Industrial Internet of Things and machine vision. The images captured by cameras, especially in outdoor environments, are impacted by various parameters such as lens blur, dirty lens, and lens distortion (barrel distortion). Many approaches exist that assess the impact of camera parameters on the quality of images. However, most of these techniques do not use important quality assessment metrics such as Oriented FAST and Rotated BRIEF (ORB) and structural content. None of these techniques objectively evaluates the impact of barrel distortion on image quality using quality assessment metrics such as mean square error, peak signal-to-noise ratio, structural content, ORB, and the structural similarity index. In this paper, besides lens dirtiness and blurring, we also examine the impact of barrel distortion using various datasets having different levels of barrel distortion. Analysis shows that none of the existing metrics produces quality values consistent with intuitively defined impact levels for lens blur, dirtiness, and barrel distortion. To address the shortcomings of existing metrics and make quality assessment more reliable, we propose a new image quality assessment metric that fuses the quality values obtained from different metrics using a decision fusion technique known as the Dempster–Shafer theory. Our proposed metric produces quality values that are more consistent and conform with the perceptually defined camera parameter impact levels. For all the above-mentioned camera impacts, our proposed metric exhibits 100% assessment reliability, an enormous improvement over other metrics. © 2022, Pleiades Publishing, Ltd.
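The fusion step named in the abstract, Dempster–Shafer theory, combines evidence from several sources via Dempster's rule of combination. The sketch below shows the rule itself on two hypothetical mass functions (the "PSNR says good" / "SSIM says good" beliefs are invented for illustration and do not come from the paper):

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1, m2: dicts mapping frozenset hypotheses to masses summing to 1.
    Pairs whose hypothesis sets have an empty intersection contribute
    to the conflict K; the surviving mass is renormalized by 1 - K.
    """
    combined = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

GOOD, BAD = frozenset({"good"}), frozenset({"bad"})
EITHER = GOOD | BAD  # "don't know": mass on the whole frame

# Hypothetical beliefs about one image from two quality metrics.
m_psnr = {GOOD: 0.6, BAD: 0.1, EITHER: 0.3}
m_ssim = {GOOD: 0.5, BAD: 0.2, EITHER: 0.3}
fused = dempster_combine(m_psnr, m_ssim)  # belief in "good" is reinforced
```

The appeal of this rule for metric fusion is that agreement between sources sharpens the fused belief, while the `EITHER` mass lets an unreliable metric abstain instead of casting a misleading vote.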
CHIEF : clustering with higher-order motifs in big networks
- Authors: Xia, Feng; Yu, Shuo; Liu, Chengfei; Li, Jianxin; Lee, Ivan
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Network Science and Engineering Vol. 9, no. 3 (2022), p. 990-1005
- Full Text:
- Reviewed:
- Description: Clustering network vertices is an enabler of various applications such as social computing and the Internet of Things. However, challenges arise for clustering when networks increase in scale. This paper proposes CHIEF (Clustering with HIgher-ordEr motiFs), a solution which consists of two motif clustering techniques: standard acceleration CHIEF-ST and approximate acceleration CHIEF-AP. Both algorithms first find the maximal $k$-edge-connected subgraphs within the target networks to lower the network scale, and then apply heterogeneous four-node motif clustering in the resulting higher-order dense networks. For CHIEF-ST, we show that all target motifs are kept after this procedure when the minimum node degree of the target motif is equal to or greater than $k$. For CHIEF-AP, we prove that the eigenvalues of the adjacency matrix and the Laplacian matrix are relatively stable after this step. CHIEF offers improved efficiency of motif clustering for big networks, and it verifies higher-order motif significance. Experiments on real and synthetic networks demonstrate that the proposed solutions outperform baseline approaches in large network analysis, and that higher-order motifs outperform traditional triangle motifs in clustering. © 2022 IEEE Computer Society. All rights reserved.
Collaborative filtering with network representation learning for citation recommendation
- Authors: Wang, Wei; Tang, Tao; Xia, Feng; Gong, Zhiguo; Chen, Zhikui; Liu, Huan
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Big Data Vol. 8, no. 5 (2022), p. 1233-1246
- Full Text:
- Reviewed:
- Description: Citation recommendation plays an important role in the context of scholarly big data, where finding relevant papers has become more difficult because of information overload. Applying traditional collaborative filtering (CF) to citation recommendation is challenging due to the cold start problem and the lack of paper ratings. To address these challenges, in this article, we propose a collaborative filtering with network representation learning framework for citation recommendation, namely CNCRec, which is a hybrid user-based CF considering both paper content and network topology. It aims at recommending citations in heterogeneous academic information networks. CNCRec creates the paper rating matrix based on attributed citation network representation learning, where the attributes are topics extracted from the paper text information. Meanwhile, the learned representations of the attributed collaboration network are utilized to improve the selection of nearest neighbors. By harnessing the power of network representation learning, CNCRec is able to make full use of the whole citation network topology compared with previous context-aware network-based models. Extensive experiments on both DBLP and APS datasets show that the proposed method outperforms state-of-the-art methods in terms of precision, recall, and MRR (Mean Reciprocal Rank). Moreover, CNCRec can better solve the data sparsity problem compared with other CF-based baselines. © 2015 IEEE.
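The user-based CF backbone that CNCRec builds on can be sketched in a few lines: predict a missing rating from the ratings of the most similar users. Here plain cosine similarity on the rating rows stands in for CNCRec's representation-based neighbor selection, and the rating matrix is an invented toy, not data from the paper:

```python
import math

def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def user_based_cf_score(ratings, user, item, k=2):
    """Predict ratings[user][item] from the k most similar users who
    rated the item; 0 means 'not rated'. The prediction is the
    similarity-weighted average of the neighbors' ratings.
    """
    neighbors = sorted(
        (u for u in range(len(ratings)) if u != user and ratings[u][item] > 0),
        key=lambda u: cosine(ratings[user], ratings[u]),
        reverse=True,
    )[:k]
    if not neighbors:
        return 0.0
    sims = [cosine(ratings[user], ratings[u]) for u in neighbors]
    return sum(s * ratings[u][item] for s, u in zip(sims, neighbors)) / sum(sims)

# Toy matrix: rows = users, columns = papers.
ratings = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 5, 4],
    [0, 1, 5, 4],
]
# User 1 resembles user 0, so the prediction for paper 1 leans toward 3.
score = user_based_cf_score(ratings, user=1, item=1)
```

CNCRec's contribution is precisely in replacing the two weak links of this baseline: the rating matrix is derived from attributed citation network embeddings instead of explicit ratings, and neighbors are chosen in the learned representation space rather than on sparse raw rows.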
COVID-19 datasets : a brief overview
- Authors: Sun, Ke; Li, Wuyang; Saikrishna, Vidya; Chadhar, Mehmood; Xia, Feng
- Date: 2022
- Type: Text , Journal article
- Relation: Computer Science and Information Systems Vol. 19, no. 3 (2022), p. 1115-1132
- Full Text:
- Reviewed:
- Description: The outbreak of the COVID-19 pandemic affects lives and socio-economic development around the world. The impact of the pandemic has motivated researchers from different domains to find effective solutions to diagnose, prevent, and forecast the pandemic and relieve its adverse effects. Numerous COVID-19 datasets are built from these studies and are available to the public. These datasets can be used for disease diagnosis and case prediction, speeding up the resolution of problems caused by the pandemic. To meet the needs of researchers to understand various COVID-19 datasets, we examine and provide an overview of them. We organise the majority of these datasets into three categories based on the category of applications, i.e., time-series, knowledge base, and media-based datasets. Organising COVID-19 datasets into appropriate categories can help researchers keep their focus on methodology rather than the datasets. In addition, applications and COVID-19 datasets suffer from a series of problems, such as privacy and quality. We discuss these issues as well as the potential of COVID-19 datasets. © 2022, ComSIS Consortium. All rights reserved.
Deep graph learning for anomalous citation detection
- Liu, Jiaying, Xia, Feng, Feng, Xu, Ren, Jing, Liu, Huan
- Authors: Liu, Jiaying , Xia, Feng , Feng, Xu , Ren, Jing , Liu, Huan
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Neural Networks and Learning Systems Vol. 33, no. 6 (2022), p. 2543-2557
- Full Text:
- Reviewed:
- Description: Anomaly detection is one of the most active research areas in various critical domains, such as healthcare, fintech, and public security. However, little attention has been paid to scholarly data, that is, anomaly detection in a citation network. Citation is considered one of the most crucial metrics for evaluating the impact of scientific research, which may be gamed in multiple ways. Therefore, anomaly detection in citation networks is of significant importance to identify manipulation and inflation of citations. To address this open issue, we propose a novel deep graph learning model, namely graph learning for anomaly detection (GLAD), to identify anomalies in citation networks. GLAD incorporates text semantic mining into network representation learning by adding both node attributes and link attributes via graph neural networks (GNNs). It exploits not only the relevance of citation contents, but also hidden relationships between papers. Within the GLAD framework, we propose an algorithm called Citation PUrpose (CPU) to discover the purpose of citation based on citation context. The performance of GLAD is validated through a simulated anomalous citation dataset. Experimental results demonstrate the effectiveness of GLAD on the anomalous citation detection task. © 2012 IEEE.
Edge computing for Internet of Everything : a survey
- Kong, Xiangjie, Wu, Yuhan, Wang, Hui, Xia, Feng
- Authors: Kong, Xiangjie , Wu, Yuhan , Wang, Hui , Xia, Feng
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Internet of Things Journal Vol. 9, no. 23 (2022), p. 23472-23485
- Full Text:
- Reviewed:
- Description: In this era of the Internet of Everything (IoE), edge computing has emerged as the critical enabling technology to solve a series of issues caused by an increasing number of interconnected devices and large-scale data transmission. However, the deficiencies of the edge computing paradigm are gradually being magnified in the context of IoE, especially in terms of service migration, security and privacy preservation, and deployment issues of edge nodes. These issues cannot be well addressed by conventional approaches. Thanks to the rapid development of upcoming technologies, such as artificial intelligence (AI), blockchain, and microservices, novel and more effective solutions have emerged and been applied to solve existing challenges. In addition, edge computing can be deeply integrated with technologies in other domains (e.g., AI, blockchain, 6G, and digital twin) through interdisciplinary intersection and practice, releasing the potential for mutual benefit. These promising integrations need to be further explored and researched. In addition, edge computing provides strong support in application scenarios, such as remote working, new physical retail industries, and digital advertising, which has greatly changed the way we live, work, and study. In this article, we present an up-to-date survey of edge computing research. In addition to introducing the definition, model, and characteristics of edge computing, we discuss a set of key issues in edge computing and novel solutions supported by emerging technologies in the IoE era. Furthermore, we explore the potential and promising trends from the perspective of technology integration. Finally, new application scenarios and the final form of edge computing are discussed. © 2014 IEEE.
Edge data based trailer inception probabilistic matrix factorization for context-aware movie recommendation
- Chen, Honglong, Li, Zhe, Wang, Zhu, Ni, Zhichen, Li, Junjian, Xu, Ge, Aziz, Abdul, Xia, Feng
- Authors: Chen, Honglong , Li, Zhe , Wang, Zhu , Ni, Zhichen , Li, Junjian , Xu, Ge , Aziz, Abdul , Xia, Feng
- Date: 2022
- Type: Text , Journal article
- Relation: World Wide Web Vol. 25, no. 5 (2022), p. 1863-1882
- Full Text:
- Reviewed:
- Description: The rapid growth of edge data generated by mobile devices and applications deployed at the edge of the network has exacerbated the problem of information overload. As an effective way to alleviate information overload, recommender systems can improve the quality of various services by adding application data generated by users on edge devices, such as visual and textual information, on the basis of sparse rating data. The visual information in the movie trailer is a significant part of the movie recommender system. However, due to the complexity of visual information extraction, data sparsity cannot be remarkably alleviated by merely using the rough visual features to improve the rating prediction accuracy. Fortunately, the convolutional neural network can be used to extract the visual features precisely. Therefore, the end-to-end neural image caption (NIC) model can be utilized to obtain the textual information describing the visual features of movie trailers. This paper proposes a trailer inception probabilistic matrix factorization model called Ti-PMF, which combines NIC, recurrent convolutional neural network, and probabilistic matrix factorization models as the rating prediction model. We implement the proposed Ti-PMF model with extensive experiments on three real-world datasets to validate its effectiveness. The experimental results illustrate that the proposed Ti-PMF outperforms the existing ones. © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
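The probabilistic matrix factorization component mentioned above reduces, in its simplest form, to fitting observed ratings as a product of low-rank latent factors with L2 (Gaussian-prior) regularization. A minimal sketch under those assumptions (illustrative only, not the Ti-PMF implementation):

```python
import numpy as np

def pmf(R, mask, d=2, lr=0.02, reg=0.01, iters=3000, seed=0):
    """Probabilistic matrix factorization by gradient descent:
    approximate observed ratings R (where mask == 1) as U @ V.T, with
    L2 regularization corresponding to Gaussian priors on the factors."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.standard_normal((n, d))
    V = 0.1 * rng.standard_normal((m, d))
    for _ in range(iters):
        E = mask * (R - U @ V.T)          # error on observed entries only
        U = U + lr * (E @ V - reg * U)    # gradient step on user factors
        E = mask * (R - U @ V.T)
        V = V + lr * (E.T @ U - reg * V)  # gradient step on item factors
    return U, V
```

Missing entries are then predicted as the corresponding entries of `U @ V.T`.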
Educational anomaly analytics : features, methods, and challenges
- Guo, Teng, Bai, Xiaomei, Tian, Xue, Firmin, Sally, Xia, Feng
- Authors: Guo, Teng , Bai, Xiaomei , Tian, Xue , Firmin, Sally , Xia, Feng
- Date: 2022
- Type: Text , Journal article , Review
- Relation: Frontiers in Big Data Vol. 4, no. (2022), p.
- Full Text:
- Reviewed:
- Description: Anomalies in education affect the personal careers of students and universities' retention rates. Understanding the laws behind educational anomalies promotes the development of individual students and improves the overall quality of education. However, the inaccessibility of educational data hinders the development of the field. Previous research in this field used questionnaires, which are time-consuming, costly, and hardly applicable to large-scale student cohorts. With the popularity of educational management systems and the rise of online education during the prevalence of COVID-19, a large amount of educational data is available online and offline, providing an unprecedented opportunity to explore educational anomalies from a data-driven perspective. As an emerging field, educational anomaly analytics rapidly attracts scholars from a variety of fields, including education, psychology, sociology, and computer science. This paper intends to provide a comprehensive review of data-driven analytics of educational anomalies from a methodological standpoint. We focus on the following five types of research that received the most attention: course failure prediction, dropout prediction, mental health problem detection, prediction of difficulty in graduation, and prediction of difficulty in employment. Then, we discuss the challenges of current related research. This study aims to provide references for educational policymaking while promoting the development of educational anomaly analytics as a growing field. Copyright © 2022 Guo, Bai, Tian, Firmin and Xia.
Efficient anomaly recognition using surveillance videos
- Saleem, Gulshan, Bajwa, Usama, Raza, Rana, Alqahtani, Fayez, Tolba, Amr, Xia, Feng
- Authors: Saleem, Gulshan , Bajwa, Usama , Raza, Rana , Alqahtani, Fayez , Tolba, Amr , Xia, Feng
- Date: 2022
- Type: Text , Journal article
- Relation: PeerJ Computer Science Vol. 8, no. (2022), p.
- Full Text:
- Reviewed:
- Description: Smart surveillance is a difficult task that is gaining popularity due to its direct link to human safety. Today, many indoor and outdoor surveillance systems are in use at public places and in smart cities. Because these systems are expensive to deploy, they are out of reach for the vast majority of the public and private sectors. Due to the lack of a precise definition of an anomaly, automated surveillance is a challenging task, especially when large amounts of data, such as 24/7 CCTV footage, must be processed. When implementing such systems in real-time environments, the high computational resource requirements of automated surveillance become a major bottleneck. Another challenge is recognizing anomalies accurately while keeping computational cost low. To address these challenges, this research develops a system that is both efficient and cost-effective. Although 3D convolutional neural networks have proven to be accurate, they are prohibitively expensive for practical use, particularly in real-time surveillance. In this article, we present two contributions: a resource-efficient framework for anomaly recognition problems, and two-class and multi-class anomaly recognition on spatially augmented surveillance videos. This research aims to address the problem of computation overhead while maintaining recognition accuracy. The proposed Temporal based Anomaly Recognizer (TAR) framework combines a partial shift strategy with a 2D convolutional architecture-based model, namely MobileNetV2. Extensive experiments were carried out to evaluate the model's performance on the UCF Crime dataset, with MobileNetV2 as the baseline architecture; it achieved an accuracy of 88%, a 2.47% improvement over the available state-of-the-art. The proposed framework achieves 52.7% accuracy for multiclass anomaly recognition on the UCF Crime2Local dataset. The proposed model has been tested in real-time camera stream settings and can handle six streams simultaneously without the need for additional resources. © Copyright 2022 Saleem et al.
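The partial shift strategy behind TAR can be sketched generically (reconstructed from the description above, not the authors' code): a fraction of the feature channels is shifted one frame backward and another fraction one frame forward along the time axis, so a plain 2D CNN can mix temporal information at essentially zero extra compute.

```python
import numpy as np

def temporal_shift(x, fold_div=4):
    """Shift 1/fold_div of the channels back by one frame and 1/fold_div
    forward by one frame, zero-padding at the clip boundaries.
    x: (T, C, H, W) video feature tensor."""
    c = x.shape[1]
    fold = c // fold_div
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]              # future frame -> current slot
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]  # past frame -> current slot
    out[:, 2 * fold:] = x[:, 2 * fold:]         # remaining channels unchanged
    return out
```

In a full model this shift would be interleaved with the 2D convolution blocks (e.g., of MobileNetV2), which is what keeps the temporal modeling cost near zero.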
Exploring human mobility for multi-pattern passenger prediction : a graph learning framework
- Kong, Xiangjie, Wang, Kailai, Hou, Mingliang, Xia, Feng, Karmakar, Gour, Li, Jianxin
- Authors: Kong, Xiangjie , Wang, Kailai , Hou, Mingliang , Xia, Feng , Karmakar, Gour , Li, Jianxin
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Intelligent Transportation Systems Vol. 23, no. 9 (2022), p. 16148-16160
- Full Text:
- Reviewed:
- Description: Traffic flow prediction is an integral part of an intelligent transportation system and thus fundamental for various traffic-related applications. Buses are an indispensable way of moving for urban residents with fixed routes and schedules, which leads to latent travel regularity. However, human mobility patterns, specifically the complex relationships between bus passengers, are deeply hidden in this fixed mobility mode. Although many models exist to predict traffic flow, human mobility patterns have not been well explored in this regard. To address this research gap and learn human mobility knowledge from these fixed travel behaviors, we propose a multi-pattern passenger flow prediction framework, MPGCN, based on Graph Convolutional Network (GCN). Firstly, we construct a novel sharing-stop network to model relationships between passengers based on bus record data. Then, we employ GCN to extract features from the graph by learning useful topology information and introduce a deep clustering method to recognize mobility patterns hidden in bus passengers. Furthermore, to fully utilize spatio-temporal information, we propose GCN2Flow to predict passenger flow based on various mobility patterns. To the best of our knowledge, this paper is the first work to adopt a multi-pattern approach to predict bus passenger flow by taking advantage of graph learning. We design a case study for optimizing routes. Extensive experiments on a real-world bus dataset demonstrate that MPGCN has potential efficacy in passenger flow prediction and route optimization. © 2000-2011 IEEE.
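The GCN building block that frameworks like MPGCN rely on is the standard symmetric-normalized propagation rule H' = ReLU(D̂^(-1/2)(A+I)D̂^(-1/2) H W). A minimal NumPy rendering of one such layer (illustrative only, not the MPGCN code):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W).
    A: (n, n) adjacency matrix; H: (n, f_in) node features;
    W: (f_in, f_out) trainable weights."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric normalization
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)
```

Stacking a few such layers over the sharing-stop network would give each passenger node features smoothed over its graph neighborhood, which is what the subsequent deep clustering step operates on.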
Familiarity-based collaborative team recognition in academic social networks
- Yu, Shuo, Xia, Feng, Zhang, Chen, Wei, Haoran, Keogh, Kathleen, Chen, Honglong
- Authors: Yu, Shuo , Xia, Feng , Zhang, Chen , Wei, Haoran , Keogh, Kathleen , Chen, Honglong
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Computational Social Systems Vol. 9, no. 5 (2022), p. 1432-1445
- Full Text:
- Reviewed:
- Description: Collaborative teamwork is key to major scientific discoveries. However, the prevalence of collaboration among researchers makes team recognition increasingly challenging. Previous studies have demonstrated that people are more likely to collaborate with individuals they are familiar with. In this work, we employ the definition of familiarity and propose the faMiliarity-based cOllaborative Team recOgnition (MOTO) algorithm to recognize collaborative teams. MOTO calculates the shortest distance matrix within the global collaboration network and the local density of each node. Central team members are initially recognized based on local density. Then, MOTO recognizes the remaining team members by using the familiarity metric and the shortest distance matrix. Extensive experiments have been conducted on a large-scale dataset. The experimental results show that compared with baseline methods, MOTO can recognize the largest number of teams. The teams recognized by MOTO possess more cohesive team structures and lower team communication costs compared with other methods. MOTO utilizes familiarity in team recognition to identify cohesive academic teams. The recognized teams are in line with real-world collaborative teamwork patterns. Based on team recognition using MOTO, the research team structure and performance are further analyzed for given time periods. The number of teams that consist of members from different institutions increases gradually. Such teams are found to perform better in comparison with those whose members are from the same institution. © 2014 IEEE.
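The density-then-expansion idea in the description can be shown in a toy form (an illustration only, not the published MOTO algorithm, which additionally uses a familiarity metric): seed a team at the locally densest node, then admit collaborators within a shortest-path radius of the seed.

```python
from collections import deque

def shortest_dists(adj, src):
    # BFS shortest-path distances from src in an unweighted graph
    # given as an adjacency dict {node: [neighbors]}.
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def recognize_team(adj, radius=1):
    """Toy team recognition: seed at the locally densest (highest-degree)
    node, then admit every node within `radius` hops of the seed."""
    seed = max(adj, key=lambda n: len(adj[n]))
    dist = shortest_dists(adj, seed)
    return {n for n, d in dist.items() if d <= radius}
```

A real system would replace the hop-count cutoff with a familiarity score over the shortest-distance matrix, as the abstract describes.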
Graph augmentation learning
- Yu, Shuo, Huang, Huafei, Dao, Minh, Xia, Feng
- Authors: Yu, Shuo , Huang, Huafei , Dao, Minh , Xia, Feng
- Date: 2022
- Type: Text , Conference paper
- Relation: 31st ACM Web Conference, WWW 2022, Virtual, online, 25 April 2022, WWW 2022 - Companion Proceedings of the Web Conference 2022 p. 1063-1072
- Full Text:
- Reviewed:
- Description: Graph Augmentation Learning (GAL) provides outstanding solutions for graph learning in handling incomplete data, noisy data, etc. Numerous GAL methods have been proposed for graph-based applications such as social network analysis and traffic flow forecasting. However, the underlying reasons for the effectiveness of these GAL methods are still unclear. As a consequence, how to choose the optimal graph augmentation strategy for a given application scenario remains a black box. There is a lack of systematic, comprehensive, and experimentally validated guidelines on GAL for scholars. Therefore, in this survey, we review GAL techniques in depth from the macro (graph), meso (subgraph), and micro (node/edge) levels. We further illustrate in detail how GAL enhances data quality and model performance. The aggregation mechanisms of augmentation strategies and graph learning models are also discussed across different application scenarios, i.e., data-specific, model-specific, and hybrid scenarios. To better demonstrate the advantages of GAL, we experimentally validate the effectiveness and adaptability of different GAL strategies in different downstream tasks. Finally, we share our insights on several open issues of GAL, including heterogeneity, spatio-temporal dynamics, scalability, and generalization. © 2022 ACM.
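At the micro (edge) level, one of the simplest augmentation strategies such surveys cover is random edge dropping; a sketch (function and parameter names are mine, not from the paper):

```python
import random

def drop_edges(edges, p=0.2, seed=0):
    """Micro-level graph augmentation: independently remove each edge
    with probability p, returning a perturbed copy of the edge list."""
    rng = random.Random(seed)  # seeded for reproducible augmentations
    return [e for e in edges if rng.random() >= p]
```

Macro- and meso-level strategies work analogously but perturb whole graphs or subgraphs (e.g., sampling subgraphs or diffusing the adjacency matrix) rather than individual edges.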
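Two of the micro-level (node/edge) augmentations the survey above covers are edge dropping and node-feature masking. A minimal sketch, with the drop/mask probabilities as illustrative parameters rather than anything prescribed by the survey:

```python
import random

def drop_edges(edges, p, seed=0):
    """Micro-level augmentation: remove each edge with probability p."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() >= p]

def mask_features(x, p, seed=0):
    """Node-level augmentation: zero each feature with probability p."""
    rng = random.Random(seed)
    return [[0.0 if rng.random() < p else v for v in row] for row in x]
```

Applied with small p during training, such perturbations yield corrupted graph views that encourage the downstream model to be robust to missing edges and features.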
Graph self-supervised learning : a survey
- Liu, Yixin, Jin, Ming, Pan, Shirui, Zhou, Chuan, Zheng, Yu, Xia, Feng, Yu, Philip
- Authors: Liu, Yixin , Jin, Ming , Pan, Shirui , Zhou, Chuan , Zheng, Yu , Xia, Feng , Yu, Philip
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Knowledge and Data Engineering Vol. 35, no. 6 (2022), p. 5879-5900
- Full Text:
- Reviewed:
- Description: Deep learning on graphs has attracted significant interest recently. However, most of the work has focused on (semi-)supervised learning, resulting in shortcomings including heavy label reliance, poor generalization, and weak robustness. To address these issues, self-supervised learning (SSL), which extracts informative knowledge through well-designed pretext tasks without relying on manual labels, has become a promising and trending learning paradigm for graph data. Different from SSL in other domains such as computer vision and natural language processing, SSL on graphs has an exclusive background, design ideas, and taxonomies. Under the umbrella of graph self-supervised learning, we present a timely and comprehensive review of the existing approaches that employ SSL techniques for graph data. We construct a unified framework that mathematically formalizes the paradigm of graph SSL. According to the objectives of the pretext tasks, we divide these approaches into four categories: generation-based, auxiliary property-based, contrast-based, and hybrid approaches. We further describe the applications of graph SSL across various research fields and summarize the commonly used datasets, evaluation benchmarks, performance comparisons, and open-source code of graph SSL. Finally, we discuss the remaining challenges and potential future directions in this research field. © IEEE.
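Of the four pretext-task categories in the survey above, contrast-based approaches are often built on an InfoNCE-style objective: each node's embedding in one augmented view should be most similar to the same node's embedding in a second view. A self-contained sketch of that loss (not any specific surveyed method; the temperature value is an assumption):

```python
import math

def infonce(z1, z2, tau=0.5):
    """Contrast-based pretext loss: node i's view-1 embedding should be
    closer to its own view-2 embedding than to other nodes' (InfoNCE)."""
    def cos(a, b):
        num = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return num / (na * nb)
    n = len(z1)
    loss = 0.0
    for i in range(n):
        sims = [math.exp(cos(z1[i], z2[j]) / tau) for j in range(n)]
        loss += -math.log(sims[i] / sum(sims))  # positive pair vs. all pairs
    return loss / n
```

When the two views agree perfectly on orthogonal embeddings, the loss falls below the uniform baseline log(n), reflecting that positives are already separated from negatives.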
Multimodal educational data fusion for students' mental health detection
- Guo, Teng, Zhao, Wenhong, Alrashoud, Mubarak, Tolba, Amr, Firmin, Sally, Xia, Feng
- Authors: Guo, Teng , Zhao, Wenhong , Alrashoud, Mubarak , Tolba, Amr , Firmin, Sally , Xia, Feng
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Access Vol. 10, no. (2022), p. 70370-70382
- Full Text:
- Reviewed:
- Description: Mental health issues can lead to serious consequences such as depression, self-mutilation, and worse, especially for university students who are not yet physically and mentally mature. Not all students with poor mental health are aware of their situation and actively seek help. Proactive detection of mental health problems is a critical step in addressing this issue. However, accurate detection is hard to achieve due to the inherent complexity and heterogeneity of the unstructured multi-modal data generated by campus life. Against this background, we propose a framework for detecting students' mental health, named CASTLE (educational data fusion for mental health detection). The framework involves three parts. First, we utilize representation learning to fuse data on social life, academic performance, and physical appearance. An algorithm named MOON (multi-view social network embedding) is proposed to represent students' social life comprehensively by effectively fusing students' heterogeneous social relations. Second, the synthetic minority oversampling technique (SMOTE) is applied to address the label imbalance issue. Finally, a DNN (deep neural network) model is utilized for the final detection. The extensive results demonstrate the promising performance of the proposed methods in comparison to an extensive range of state-of-the-art baselines. © 2013 IEEE.
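The SMOTE step in the CASTLE pipeline above oversamples the minority class by interpolating between existing minority samples. A rough sketch of that idea; classic SMOTE interpolates toward one of a sample's k nearest neighbours, whereas this simplified version interpolates between random minority pairs:

```python
import random

def smote_oversample(minority, n_new, seed=0):
    """Generate n_new synthetic minority samples by linear interpolation
    between random pairs of existing minority samples (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)   # pick a random minority pair
        u = rng.random()                 # interpolation coefficient in [0, 1)
        synthetic.append([ai + u * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic
```

Because each synthetic point is a convex combination of two real samples, it always lies on the segment between them, keeping the oversampled class inside the original minority region.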