Themes in data mining, big data, and crime analytics
- Authors: Oatley, Giles
- Date: 2022
- Type: Text , Journal article , Review
- Relation: Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery Vol. 12, no. 2 (2022), p.
- Full Text:
- Reviewed:
- Description: This article examines the impact of new AI-related technologies in data mining and big data on important research questions in crime analytics. Because the field is so broad, the review focuses on a selection of the most important topics. Challenges for information management, and in turn law and society, include: AI-powered predictive policing; big data for legal and adversarial decisions; bias using big data and analytics in profiling and predicting criminality; forecasting crime risk and crime rates; and regulating AI systems. This article is categorized under: Algorithmic Development > Spatial and Temporal Data Mining; Fundamental Concepts of Data and Knowledge > Big Data Mining; Technologies > Artificial Intelligence; Application Areas > Data Mining Software Tools. © 2021 Wiley Periodicals LLC.
Establishing effective communications in disaster affected areas and artificial intelligence based detection using social media platform
- Authors: Raza, Mohsin , Awais, Muhammad , Ali, Kamran , Aslam, Nauman , Paranthaman, Vishnu , Imran, Muhammad , Ali, Farman
- Date: 2020
- Type: Text , Journal article
- Relation: Future Generation Computer Systems Vol. 112, no. (2020), p. 1057-1069
- Full Text:
- Reviewed:
- Description: Floods, earthquakes, storm surges and other natural disasters severely affect the communication infrastructure and thus compromise the effectiveness of communication-dependent rescue and warning services. In this paper, a user-centric approach is proposed to establish communications in disaster-affected and communication-outage areas. The proposed scheme forms ad hoc clusters to facilitate emergency communications and connect end-users/User Equipment (UE) to the core network. A novel cluster formation with a single and multi-hop communication framework is proposed. The overall throughput in the formed clusters is maximized using convex optimization. In addition, an intelligent system is designed to label different clusters and their localities as affected or non-affected areas. As a proof of concept, the labeling is performed on a flooding dataset, where region-specific social media information is used in the proposed machine learning techniques to classify the disaster-prone areas as flooded or unflooded. The suitable results of the proposed machine learning schemes suggest their use along with the proposed clustering techniques to revive communications in disaster-affected areas and to classify the impact of disaster for different locations in disaster-prone areas. © 2020 Elsevier B.V.
Attributed collaboration network embedding for academic relationship mining
- Authors: Wang, Wei , Liu, Jiaying , Tang, Tao , Tuarob, Suppawong , Xia, Feng , Gong, Zhiguo , King, Irwin
- Date: 2021
- Type: Text , Journal article
- Relation: ACM Transactions on the Web Vol. 15, no. 1 (2021), p.
- Full Text:
- Reviewed:
- Description: Finding both efficient and effective quantitative representations for scholars in scientific digital libraries has been a focal point of research. The unprecedented amounts of scholarly datasets, combined with contemporary machine learning and big data techniques, have enabled intelligent and automatic profiling of scholars from this vast and ever-increasing pool of scholarly data. Meanwhile, recent advances in network embedding techniques enable us to mitigate the challenges of large scale and sparsity of academic collaboration networks. In real-world academic social networks, scholars are accompanied by various attributes or features, such as co-authorship and publication records, which result in attributed collaboration networks. It has been observed that both network topology and scholar attributes are important in academic relationship mining. However, previous studies mainly focus on network topology, whereas scholar attributes are overlooked. Moreover, the influence of different scholar attributes is unclear. To bridge this gap, in this work, we present a novel framework of Attributed Collaboration Network Embedding (ACNE) for academic relationship mining. ACNE extracts four types of scholar attributes based on the proposed scholar profiling model, including demographics, research, influence, and sociability. ACNE can learn a low-dimensional representation of scholars considering both scholar attributes and network topology simultaneously. We demonstrate the effectiveness and potential of ACNE in academic relationship mining by performing collaborator recommendation on two real-world datasets, and we investigate the contribution and importance of each scholar attribute to scientific collaborator recommendation. Our work may shed light on academic relationship mining by taking advantage of attributed collaboration network embedding. © 2020 ACM.
A new dimensionality-unbiased score for efficient and effective outlying aspect mining
- Authors: Samariya, Durgesh , Ma, Jiangang
- Date: 2022
- Type: Text , Journal article
- Relation: Data Science and Engineering Vol. 7, no. 2 (2022), p. 120-135
- Full Text:
- Reviewed:
- Description: The main aim of the outlying aspect mining algorithm is to automatically detect the subspace(s) (a.k.a. aspect(s)) where a given data point is dramatically different from the rest of the data in each of those subspace(s) (aspect(s)). To rank the subspaces for a given data point, a scoring measure is required to compute the outlying degree of the given data in each subspace. In this paper, we introduce a new measure to compute outlying degree, called Simple Isolation score using Nearest Neighbor Ensemble (SiNNE), which not only detects the outliers but also provides an explanation of why the selected point is an outlier. SiNNE is a dimensionally unbiased measure in its raw form, which means the scores produced by SiNNE can be compared directly across subspaces having different dimensions. Thus, it does not require any normalization to make the score unbiased. Our experimental results on synthetic and publicly available real-world datasets revealed that (i) SiNNE produces better or at least the same results as existing scores, and (ii) it improves the run time of the existing outlying aspect mining algorithm based on beam search by at least two orders of magnitude. SiNNE allows the existing outlying aspect mining algorithm to run on datasets with hundreds of thousands of instances and thousands of dimensions, which was not possible before. © 2022, The Author(s).
Memetic moments : the speed of twitter memes
- Authors: Smith, Naomi , Copland, Simon
- Date: 2021
- Type: Text , Journal article
- Relation: Journal of Digital Social Research Vol. , no. (2021), p. 23-48
- Full Text:
- Reviewed:
- Description: This paper examines how speed shapes internet culture. To do so, it analyses ‘memetic moments’ on Twitter, short-lived and rapidly circulated memes that quickly reach saturation. The paper examines two ‘memetic moments’ on Twitter in 2018 and 2019 to assess how they develop over time. Each case study comprises a week’s worth of relevant tweets that were analysed for temporal patterns. We analyse these ‘memetic moments’ through Lefebvre’s (2004) work on rhythmanalysis, arguing that the temporal patterns of memes on Twitter can be understood through his concepts of repetition, presence and dialogue. While seemingly trivial, memetic moments underscore the didactic relationship between social media and news media while also providing a way to approach complex social issues.
Educational anomaly analytics : features, methods, and challenges
- Authors: Guo, Teng , Bai, Xiaomei , Tian, Xue , Firmin, Sally , Xia, Feng
- Date: 2022
- Type: Text , Journal article , Review
- Relation: Frontiers in Big Data Vol. 4, no. (2022), p.
- Full Text:
- Reviewed:
- Description: Anomalies in education affect the personal careers of students and universities' retention rates. Understanding the laws behind educational anomalies promotes the development of individual students and improves the overall quality of education. However, the inaccessibility of educational data hinders the development of the field. Previous research in this field used questionnaires, which are time- and cost-consuming and hardly applicable to large-scale student cohorts. With the popularity of educational management systems and the rise of online education during the prevalence of COVID-19, a large amount of educational data is available online and offline, providing an unprecedented opportunity to explore educational anomalies from a data-driven perspective. As an emerging field, educational anomaly analytics rapidly attracts scholars from a variety of fields, including education, psychology, sociology, and computer science. This paper intends to provide a comprehensive review of data-driven analytics of educational anomalies from a methodological standpoint. We focus on the following five types of research that received the most attention: course failure prediction, dropout prediction, mental health problems detection, prediction of difficulty in graduation, and prediction of difficulty in employment. Then, we discuss the challenges of current related research. This study aims to provide references for educational policymaking while promoting the development of educational anomaly analytics as a growing field. Copyright © 2022 Guo, Bai, Tian, Firmin and Xia.
Development and governance of FAIR thresholds for a data federation
- Authors: Wong, Megan , Levett, Kerry , Lee, Ashlin , Box, Paul , Simons, Bruce , David, Rakesh , MacLeod, Andrew , Taylor, Nicolas , Schneider, Derek , Thompson, Helen
- Date: 2022
- Type: Text , Journal article
- Relation: Data Science Journal Vol. 21, no. (2022), p.
- Full Text:
- Reviewed:
- Description: The FAIR (findable, accessible, interoperable, and re-usable) principles and practice recommendations provide high level guidance and recommendations that are not research-domain specific in nature. There remains a gap in practice at the data provider and domain scientist level demonstrating how the FAIR principles can be applied beyond a set of generalist guidelines to meet the needs of a specific domain community. We present our insights developing FAIR thresholds in a domain specific context for self-governance by a community (agricultural research). ‘Minimum thresholds’ for FAIR data are required to align expectations for data delivered from providers’ distributed data stores through a community-governed federation (the Agricultural Research Federation, AgReFed). Data providers were supported to make data holdings more FAIR. There was a range of different FAIR starting points, organisational goals, and end user needs, solutions, and capabilities. This informed the distilling of a set of FAIR criteria ranging from ‘Minimum thresholds’ to ‘Stretch targets’. These were operationalised through consensus into a framework for governance and implementation by the agricultural research domain community. Improving the FAIR maturity of data took resourcing and incentive to do so, highlighting the challenge for data federations to generate value whilst reducing costs of participation. Our experience showed a role for supporting collective advocacy, relationship brokering, tailored support, and low-bar tooling access particularly across the areas of data structure, access and semantics that were challenging to domain researchers. Active democratic participation supported by a governance framework like AgReFed’s will ensure participants have a say in how federations can deliver individual and collective benefits for members. © 2022 The Author(s).