Attributed collaboration network embedding for academic relationship mining
- Wang, Wei, Liu, Jiaying, Tang, Tao, Tuarob, Suppawong, Xia, Feng, Gong, Zhiguo, King, Irwin
- Authors: Wang, Wei , Liu, Jiaying , Tang, Tao , Tuarob, Suppawong , Xia, Feng , Gong, Zhiguo , King, Irwin
- Date: 2021
- Type: Text , Journal article
- Relation: ACM Transactions on the Web Vol. 15, no. 1 (2021), p.
- Full Text:
- Reviewed:
- Description: Finding both efficient and effective quantitative representations for scholars in scientific digital libraries has been a focal point of research. The unprecedented amounts of scholarly datasets, combined with contemporary machine learning and big data techniques, have enabled intelligent and automatic profiling of scholars from this vast and ever-increasing pool of scholarly data. Meanwhile, recent advances in network embedding techniques enable us to mitigate the challenges of large scale and sparsity of academic collaboration networks. In real-world academic social networks, scholars are accompanied by various attributes or features, such as co-authorship and publication records, which result in attributed collaboration networks. It has been observed that both network topology and scholar attributes are important in academic relationship mining. However, previous studies mainly focus on network topology, whereas scholar attributes are overlooked. Moreover, the influence of different scholar attributes is unclear. To bridge this gap, in this work, we present a novel framework of Attributed Collaboration Network Embedding (ACNE) for academic relationship mining. ACNE extracts four types of scholar attributes based on the proposed scholar profiling model, including demographics, research, influence, and sociability. ACNE can learn a low-dimensional representation of scholars considering both scholar attributes and network topology simultaneously. We demonstrate the effectiveness and potential of ACNE in academic relationship mining by performing collaborator recommendation on two real-world datasets, and we investigate the contribution and importance of each scholar attribute to scientific collaborator recommendation. Our work may shed light on academic relationship mining by taking advantage of attributed collaboration network embedding. © 2020 ACM.
CenGCN : centralized convolutional networks with vertex imbalance for scale-free graphs
- Xia, Feng, Wang, Lei, Tang, Tao, Chen, Xin, Kong, Xiangjie, Oatley, Giles, King, Irwin
- Authors: Xia, Feng , Wang, Lei , Tang, Tao , Chen, Xin , Kong, Xiangjie , Oatley, Giles , King, Irwin
- Date: 2023
- Type: Text , Journal article
- Relation: IEEE Transactions on Knowledge and Data Engineering Vol. 35, no. 5 (2023), p. 4555-4569
- Full Text:
- Reviewed:
- Description: Graph Convolutional Networks (GCNs) have achieved impressive performance in a wide variety of areas, attracting considerable attention. The core step of GCNs is the information-passing framework that considers all information from neighbors to the central vertex to be equally important. Such equal importance, however, is inadequate for scale-free networks, where hub vertices propagate more dominant information due to vertex imbalance. In this paper, we propose a novel centrality-based framework named CenGCN to address the inequality of information. This framework first quantifies the similarity between hub vertices and their neighbors by label propagation with hub vertices. Based on this similarity and centrality indices, the framework transforms the graph by increasing or decreasing the weights of edges connecting hub vertices and adding self-connections to vertices. In each non-output layer of the GCN, this framework uses a hub attention mechanism to assign new weights to connected non-hub vertices based on their common information with hub vertices. We present two variants CenGCN_D and CenGCN_E, based on degree centrality and eigenvector centrality, respectively. We also conduct comprehensive experiments, including vertex classification, link prediction, vertex clustering, and network visualization. The results demonstrate that the two variants significantly outperform state-of-the-art baselines. © 1989-2012 IEEE.
Collaborative filtering with network representation learning for citation recommendation
- Wang, Wei, Tang, Tao, Xia, Feng, Gong, Zhiguo, Chen, Zhikui, Liu, Huan
- Authors: Wang, Wei , Tang, Tao , Xia, Feng , Gong, Zhiguo , Chen, Zhikui , Liu, Huan
- Date: 2022
- Type: Text , Journal article
- Relation: IEEE Transactions on Big Data Vol. 8, no. 5 (2022), p. 1233-1246
- Full Text:
- Reviewed:
- Description: Citation recommendation plays an important role in the context of scholarly big data, where finding relevant papers has become more difficult because of information overload. Applying traditional collaborative filtering (CF) to citation recommendation is challenging due to the cold start problem and the lack of paper ratings. To address these challenges, in this article, we propose a collaborative filtering with network representation learning framework for citation recommendation, namely CNCRec, which is a hybrid user-based CF considering both paper content and network topology. It aims at recommending citations in heterogeneous academic information networks. CNCRec creates the paper rating matrix based on attributed citation network representation learning, where the attributes are topics extracted from the paper text information. Meanwhile, the learned representations of the attributed collaboration network are utilized to improve the selection of nearest neighbors. By harnessing the power of network representation learning, CNCRec is able to make full use of the whole citation network topology compared with previous context-aware network-based models. Extensive experiments on both DBLP and APS datasets show that the proposed method outperforms state-of-the-art methods in terms of precision, recall, and MRR (Mean Reciprocal Rank). Moreover, CNCRec can better solve the data sparsity problem compared with other CF-based baselines. © 2015 IEEE.
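The CNCRec description above reports results in terms of MRR (Mean Reciprocal Rank). As a reader aid, the following is a minimal sketch of how MRR is commonly computed for ranked recommendation lists. It is not taken from the paper; the function name `mean_reciprocal_rank` and the toy paper identifiers are assumptions made for illustration only.

```python
from typing import Hashable, Sequence, Set


def mean_reciprocal_rank(
    ranked_lists: Sequence[Sequence[Hashable]],
    relevant_sets: Sequence[Set[Hashable]],
) -> float:
    """Average of 1/rank of the first relevant item in each ranked list.

    A query whose list contains no relevant item contributes 0.
    (Hypothetical helper, not the authors' evaluation code.)
    """
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


# Toy example with made-up paper ids: first hits at ranks 2, 1, and none.
recommendations = [["p1", "p2", "p3"], ["p4", "p5"], ["p6"]]
ground_truth = [{"p2"}, {"p4"}, {"p9"}]
mrr = mean_reciprocal_rank(recommendations, ground_truth)
```

Only the first relevant item per query affects the score, so MRR rewards placing one correct citation near the top rather than retrieving many.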
Data-efficient graph learning for responsible prediction and recommendation
- Authors: Tang, Tao
- Date: 2024
- Type: Text , Thesis , PhD
- Full Text:
- Description: Graph learning offers a promising approach to uncover latent complex relationships within single or multiple graphs, thereby enhancing the performance of prediction and recommendation models. However, current graph learning methods often require significant computational resources and detailed training data for optimal performance. In real-world scenarios, graph-structured data are frequently sparse, with missing attributes and errors, particularly in distributed systems. Data heterogeneity can lead to Non-IID issues (e.g., imbalanced distribution) and limited computational resources. Additionally, ethical challenges in AI systems necessitate designing user-centered algorithms that consider privacy, transparency, and responsibility. These issues can degrade model performance, underscoring the need for user-centered, data-efficient graph learning models that enhance efficiency in both centralized and decentralized systems. Considering these challenges, this research investigates data-efficient graph learning for responsible prediction and recommendation in real-world applications. In this thesis, we propose effective and efficient graph learning algorithms for three sub-tasks: (1) Federated Graph Learning on Non-IID EHRs, (2) Multi-view Graph Learning on Sparse EHRs, and (3) Federated Graph Learning for Spatiotemporal Recommendation. Extracting latent disease patterns from Electronic Health Records (EHRs) is crucial for disease analysis and significantly facilitates healthcare decision-making. The first task, federated graph learning on Non-IID EHRs, aims to obtain complex disease graph representations with temporal dynamics from global imbalanced and locally insufficient Non-IID EHRs for downstream disease prediction tasks. We propose a personalized federated graph learning framework named PEARL, designed to avoid performance decreases in the global model on individual clients while enhancing the personalized capabilities of the learned global model. 
To further improve its effectiveness, we introduce a fine-tuning scheme to personalize the global model using local EHRs. Extensive experiments conducted on the real-world MIMIC-III dataset validate PEARL’s effectiveness, demonstrating significant improvement (10.25% on F1 scores) compared to baseline approaches. The second task, multi-view learning, offers a comprehensive exploration of both structured and unstructured EHRs. However, the intrinsic uncertainty among disease features presents a significant challenge for multi-view feature alignment. The sparsity of real-world EHRs further exacerbates this difficulty. To address these challenges, we introduce a novel fuzzy multi-view graph learning framework named FuzzyMVG, designed to mitigate the impacts of uncertainty in disease features derived from sparse EHRs. Extensive experiments on the real-world MIMIC-III dataset validate FuzzyMVG’s effectiveness. Results in the diagnosis prediction task show that FuzzyMVG outperforms other state-of-the-art baselines, with higher Precision (0.2991). Finally, the third task addresses the challenges of limited computational resources, privacy leakage, and data silos in spatiotemporal Point-of-Interest (PoI) recommendation within distributed systems. We propose an efficient federated graph learning-based model to mine complex spatiotemporal features for generating recommendations. Experiments on the PoI recommendation task based on real-life check-in data validate the effectiveness of our proposed model. The results indicate that our recommendation model achieves competitive results (higher accuracy, with an RMSE of 0.1096) with lower computational costs than the baselines.
Data-efficient graph learning meets ethical challenges
- Authors: Tang, Tao
- Date: 2023
- Type: Text , Conference paper
- Relation: 16th ACM International Conference on Web Search and Data Mining, WSDM 2023, Singapore, 27 February to 3 March 2023, WSDM 2023 - Proceedings of the 16th ACM International Conference on Web Search and Data Mining p. 1218-1219
- Full Text: false
- Reviewed:
- Description: Recommender systems have achieved great success in our daily life. In recent years, the ethical concerns of AI systems have gained a lot of attention. At the same time, graph learning techniques are powerful in modelling the complex relations among users and items in recommender system applications. These graph learning-based methods are data hungry, which brings a significant data-efficiency challenge. In this proposal, I introduce my PhD research from three aspects: 1) Efficient privacy-preserving recommendation for imbalanced data. 2) Efficient recommendation model training for insufficient samples. 3) Explainability in social recommendation. Challenges and solutions for the above research problems are presented in this proposal. © 2023 Owner/Author.
Detecting outlier patterns with query-based artificially generated searching conditions
- Yu, Shuo, Xia, Feng, Sun, Yuchen, Tang, Tao, Yan, Xiaoran, Lee, Ivan
- Authors: Yu, Shuo , Xia, Feng , Sun, Yuchen , Tang, Tao , Yan, Xiaoran , Lee, Ivan
- Date: 2021
- Type: Text , Journal article
- Relation: IEEE Transactions on Computational Social Systems Vol. 8, no. 1 (2021), p. 134-147
- Full Text:
- Reviewed:
- Description: In the age of social computing, finding interesting network patterns or motifs is significant and critical for various areas, such as decision intelligence, intrusion detection, medical diagnosis, social network analysis, fake news identification, and national security. However, subgraph matching remains a computationally challenging problem, let alone identifying special motifs among them. This is especially the case in large heterogeneous real-world networks. In this article, we propose an efficient solution for discovering and ranking human behavior patterns based on network motifs by exploring a user's query in an intelligent way. Our method takes advantage of the semantics provided by a user's query, which in turn provides the mathematical constraint that is crucial for faster detection. We propose an approach to generate query conditions based on the user's query. In particular, we use meta paths between the nodes to define target patterns as well as their similarities, leading to efficient motif discovery and ranking at the same time. The proposed method is examined in a real-world academic network using different similarity measures between the nodes. The experimental results demonstrate that our method can identify interesting motifs and is robust to the choice of similarity measures. © 2014 IEEE.
Digital twin mobility profiling : a spatio-temporal graph learning approach
- Chen, Xin, Hou, Mingliang, Tang, Tao, Kaur, Achhardeep, Xia, Feng
- Authors: Chen, Xin , Hou, Mingliang , Tang, Tao , Kaur, Achhardeep , Xia, Feng
- Date: 2022
- Type: Text , Conference paper
- Relation: 23rd IEEE International Conference on High Performance Computing and Communications, 7th IEEE International Conference on Data Science and Systems, 19th IEEE International Conference on Smart City and 7th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC-DSS-SmartCity-DependSys 2021, Hainan, China, 20-22 December 2021, Proceedings 2021 IEEE 23rd International Conference on High Performance Computing & Communications, 7th International Conference on Data Science & Systems 19th International Conference on Smart City 7th International Conference on Dependability in Sensor, Cloud & Big Data Systems & Applications p. 1178-1187
- Full Text:
- Reviewed:
- Description: With the arrival of the big data era, mobility profiling has become a viable method of utilizing enormous amounts of mobility data to create an intelligent transportation system. Mobility profiling can extract potential patterns in urban traffic from mobility data and is critical for a variety of traffic-related applications. However, due to the high level of complexity and the huge amount of data, mobility profiling faces huge challenges. Digital Twin (DT) technology paves the way for cost-effective and performance-optimised management by digitally creating a virtual representation of the network to simulate its behaviour. In order to capture the complex spatio-temporal features in traffic scenarios, we construct alignment diagrams to assist in completing the spatio-temporal correlation representation and design a dilated alignment convolution network (DACN) to learn the fine-grained correlations, i.e., spatio-temporal interactions. We propose a digital twin mobility profiling (DTMP) framework to learn node profiles on a mobility network DT model. Extensive experiments have been conducted upon three real-world datasets. Experimental results demonstrate the effectiveness of DTMP. © 2021 IEEE.
Fuzzy multiview graph learning on sparse electronic health records
- Tang, Tao, Han, Zhuoyang, Yu, Shuo, Bagirov, Adil, Zhang, Qiang
- Authors: Tang, Tao , Han, Zhuoyang , Yu, Shuo , Bagirov, Adil , Zhang, Qiang
- Date: 2024
- Type: Text , Journal article
- Relation: IEEE Transactions on Fuzzy Systems Vol. 32, no. 10 (2024), p. 5520-5532
- Full Text: false
- Reviewed:
- Description: Extracting latent disease patterns from electronic health records (EHRs) is a crucial solution for disease analysis, significantly facilitating healthcare decision-making. Multiview learning presents itself as a promising approach that offers a comprehensive exploration of both structured and unstructured EHRs. However, the intrinsic uncertainty among disease features presents a significant challenge for multiview feature alignment. Besides, the sparsity of real-world EHRs also exacerbates the difficulty of feature alignment. To address these challenges, we introduce a novel fuzzy multiview graph learning framework named FuzzyMVG, which is designed for mitigating the impacts of uncertainty in disease features derived from sparse EHRs. First, we utilize auxiliary information from sparse EHRs to construct a multiview EHR graph using the structured and unstructured records. Then, for efficient feature alignment, we specially design the fuzzy logic-enhanced graph convolutional networks to obtain the fuzzy representations of time-invariant node features. Thereby, we implement a random walk strategy and long short-term memory networks to capture the distinct features of static and dynamic nodes, respectively. Extensive experiments have been conducted on the real-world MIMIC III dataset to validate the effectiveness of FuzzyMVG. Results in the diagnosis prediction task demonstrate that FuzzyMVG outperforms other state-of-the-art baselines. © 1993-2012 IEEE.
Heterogeneous graph learning for explainable recommendation over academic networks
- Chen, Xiangtai, Tang, Tao, Ren, Jing, Lee, Ivan, Chen, Honglong, Xia, Feng
- Authors: Chen, Xiangtai , Tang, Tao , Ren, Jing , Lee, Ivan , Chen, Honglong , Xia, Feng
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2021, Virtual, Online, 14-17 December 2021, ACM International Conference Proceeding Series p. 29-36
- Full Text:
- Reviewed:
- Description: With the explosive growth of new graduates with research degrees every year, unprecedented challenges arise for early-career researchers to find a job at a suitable institution. This study aims to understand the behavior of academic job transition and hence recommend suitable institutions for PhD graduates. Specifically, we design a deep learning model to predict the career move of early-career researchers and provide suggestions. The design is built on top of scholarly/academic networks, which contain abundant information about scientific collaboration among scholars and institutions. We construct a heterogeneous scholarly network to facilitate the exploration of career-move behavior and the recommendation of institutions for scholars. We devise an unsupervised learning model called HAI (Heterogeneous graph Attention InfoMax), which aggregates attention mechanisms and mutual information for institution recommendation. Moreover, we propose scholar attention and meta-path attention to discover the hidden relationships between several meta-paths. With these mechanisms, HAI provides ordered recommendations with explainability. We evaluate HAI upon a real-world dataset against baseline methods. Experimental results verify the effectiveness and efficiency of our approach. © 2021 ACM.
KIDNet : a knowledge-aware neural network model for academic performance prediction
- Tang, Tao, Hou, Jie, Guo, Teng, Bai, Xiaomei, Tian, Xue, Noori Hoshyar, Azadeh
- Authors: Tang, Tao , Hou, Jie , Guo, Teng , Bai, Xiaomei , Tian, Xue , Noori Hoshyar, Azadeh
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2021, Virtual, Online, 14-17 December 2021, ACM International Conference Proceeding Series p. 37-44
- Full Text: false
- Reviewed:
- Description: Academic performance prediction and analysis in educational data mining is meaningful for instructors to know the student's ongoing learning status, and also provide appropriate help to students as early as possible if academic difficulties appear. In this paper, we first collect the basic information of students and courses as features. Then, we propose a novel knowledge extraction framework to obtain course knowledge features to reinforce feature groups. The comparative analyses of the knowledge similarity and average grades of the courses in all terms demonstrate a strong correlation between them. Furthermore, we build the Knowledge Interaction Discovery Network (KIDNet) model, based on factorization machine (FM) and deep neural network (DNN) algorithms. This model uses FM to model lower-order interactions of sparse features and employs DNN to model higher-order interactions of both dense and sparse features. The effectiveness of KIDNet has been validated by conducting experiments based on a real-world dataset. © 2021 ACM.
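The factorization machine component mentioned in the description above can be illustrated with a minimal sketch. This is the generic second-order FM score, not KIDNet's actual implementation; the function names `fm_score` and `fm_score_naive`, and the toy weights, are assumptions made for illustration only. It shows how pairwise feature interactions are computed through latent factor vectors in linear time, and checks the result against the direct pairwise sum.

```python
from itertools import combinations


def fm_score(x, w0, w, V):
    """Second-order factorization machine score for one feature vector.

    x  : list of feature values (mostly zeros when features are sparse)
    w0 : global bias
    w  : per-feature linear weights
    V  : per-feature latent vectors, V[i] has k entries
    """
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        # Identity: sum_{i<j} v_if v_jf x_i x_j
        #   = 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


def fm_score_naive(x, w0, w, V):
    """Same score via the explicit sum over all feature pairs (for checking)."""
    n = len(x)
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = sum(
        sum(a * b for a, b in zip(V[i], V[j])) * x[i] * x[j]
        for i, j in combinations(range(n), 2)
    )
    return linear + pairwise


# Toy (hypothetical) parameters: 3 features, 2 latent dimensions.
x = [1.0, 0.0, 2.0]
w0 = 0.5
w = [0.1, 0.2, 0.3]
V = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
print(fm_score(x, w0, w, V), fm_score_naive(x, w0, w, V))
```

In a combined FM and DNN model, a score of this kind would typically be added to the output of a feed-forward network over the same features; how KIDNet combines the two is not specified in this record.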
Marking the pace : a blockchain-enhanced privacy-traceable strategy for federated recommender systems
- Cai, Zhen, Tang, Tao, Yu, Shuo, Xiao, Yunpeng, Xia, Feng
- Authors: Cai, Zhen , Tang, Tao , Yu, Shuo , Xiao, Yunpeng , Xia, Feng
- Date: 2024
- Type: Text , Journal article
- Relation: IEEE Internet of Things Journal Vol. 11, no. 6 (2024), p. 10384-10397
- Full Text:
- Reviewed:
- Description: Federated recommender systems have been crucially enhanced through data sharing and continuous model updates, attributed to the pervasive connectivity and distributed computing capabilities of Internet of Things (IoT) devices. Given the sensitivity of IoT data, transparent data processing in data sharing and model updates is paramount. However, existing methods fall short in tracing the flow of shared data and the evolution of model updates. Consequently, data sharing is vulnerable to exploitation by malicious entities, raising significant data privacy concerns, while excluding data sharing will result in suboptimal recommendations. To mitigate these concerns, we present LIBERATE, a privacy-traceable federated recommender system. We design a blockchain-based traceability mechanism, ensuring data privacy during data sharing and model updates. We further enhance privacy protection by incorporating local differential privacy in user-server communication. Extensive evaluations with the real-world data set corroborate LIBERATE's capabilities in ensuring data privacy during data sharing and model update while maintaining efficiency and performance. Results underscore blockchain-based traceability mechanism as a promising solution for privacy preserving in federated recommender systems. © 2014 IEEE.
MissII: Missing Information Imputation for Traffic Data
- Hou, Mingliang, Tang, Tao, Xia, Feng, Sultan, Ibrahim, Kaur, Roopdeep, Kong, Xiangjie
- Authors: Hou, Mingliang , Tang, Tao , Xia, Feng , Sultan, Ibrahim , Kaur, Roopdeep , Kong, Xiangjie
- Date: 2024
- Type: Text , Journal article
- Relation: IEEE transactions on emerging topics in computing Vol. 12, no. 3 (2024), p. 752-765
- Full Text: false
- Reviewed:
- Description: Cyber-Physical-Social Systems (CPSS) offer a new perspective for applying advanced information technology to improve urban transportation. However, real-world traffic datasets collected from sensing devices like loop sensors often contain corrupted or missing values. The incompleteness of traffic data poses great challenges to downstream data analysis tasks and applications. Most existing data-driven methods only impute missing values based on observed data or hypothetical models, thus ignoring the incorporation of social world information into traffic data imputation. The connection between real-world social activities and CPSS is crucial. In this paper, a novel theory-guided traffic data imputation framework, namely MissII, is proposed. In MissII, we first estimate the traffic flow between two PoIs (Points of Interest) according to spatial interaction theory by considering the physical environment information (e.g., population distributions) and human social interactions (e.g., destination choice game). Moreover, we further refine the estimated traffic flow by considering the effects of road interactions and PoIs. Then, the estimated traffic flow is input into the non-parametric GAN model as real samples to guide the training process. Extensive experiments are conducted on real-world traffic dataset to demonstrate the effectiveness of the proposed framework.
Personalized federated graph learning on non-IID electronic health records
- Tang, Tao, Han, Zhuoyang, Cai, Zhen, Yu, Shuo, Zhou, Xiaokang, Oseni, Taiwo, Das, Sajal
- Authors: Tang, Tao , Han, Zhuoyang , Cai, Zhen , Yu, Shuo , Zhou, Xiaokang , Oseni, Taiwo , Das, Sajal
- Date: 2024
- Type: Text , Journal article
- Relation: IEEE Transactions on Neural Networks and Learning Systems Vol. 35, no. 9 (2024), p. 11843-11856
- Full Text: false
- Reviewed:
- Description: Understanding the latent disease patterns embedded in electronic health records (EHRs) is crucial for making precise and proactive healthcare decisions. Federated graph learning-based methods are commonly employed to extract complex disease patterns from the distributed EHRs without sharing the client-side raw data. However, the intrinsic characteristics of the distributed EHRs are typically non-independent and identically distributed (Non-IID), significantly bringing challenges related to data imbalance and leading to a notable decrease in the effectiveness of making healthcare decisions derived from the global model. To address these challenges, we introduce a novel personalized federated learning framework named PEARL, which is designed for disease prediction on Non-IID EHRs. Specifically, PEARL incorporates disease diagnostic code attention and admission record attention to extract patient embeddings from all EHRs. Then, PEARL integrates self-supervised learning into a federated learning framework to train a global model for hierarchical disease prediction. To improve the performance of the client model, we further introduce a fine-tuning scheme to personalize the global model using local EHRs. During the global model updating process, a differential privacy (DP) scheme is implemented, providing a high-level privacy guarantee. Extensive experiments conducted on the real-world MIMIC-III dataset validate the effectiveness of PEARL, demonstrating competitive results when compared with baselines. © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.