The spectrum of big data analytics
- Authors: Sun, Zhaohao , Huo, Yanxia
- Date: 2021
- Type: Text , Journal article
- Relation: Journal of Computer Information Systems Vol. 61, no. 2 (2021), p. 154-162
- Description: Big data analytics is playing a pivotal role in big data, artificial intelligence, management, governance, and society with the dramatic development of big data, analytics, and artificial intelligence. However, what the spectrum of big data analytics is and how to develop it remain fundamental issues in the academic community. This article addresses these issues by presenting a big data-derived small data approach. It then uses the proposed approach to analyze the top 150 Google Scholar profiles that list big data analytics as a research field, and proposes a spectrum of big data analytics. The spectrum of big data analytics mainly includes data mining, machine learning, data science and systems, artificial intelligence, distributed computing and systems, and cloud computing, taking into account their degree of importance. The proposed approach and findings will generalize to other researchers and practitioners of big data analytics, machine learning, artificial intelligence, and data science. © 2019 International Association for Computer Information Systems.
Data-driven computational social science : A survey
- Authors: Zhang, Jun , Wang, Wei , Xia, Feng , Lin, Yu-Ru , Tong, Hanghang
- Date: 2020
- Type: Text , Journal article
- Relation: Big Data Research Vol. 21, no. (2020), p. 1-22
- Description: Social science concerns issues of individuals, relationships, and society as a whole. The complexity of research topics in social science makes it an amalgamation of multiple disciplines, such as economics, political science, and sociology. For centuries, scientists have conducted many studies to understand the mechanisms of society. However, due to the limitations of traditional research methods, many critical social issues remain to be explored. To address those issues, computational social science has emerged from the rapid advancement of computation technologies and profound studies in social science. With the aid of advanced research techniques, various kinds of data from diverse areas can now be acquired, and they can help us look at social problems with a new eye. As a result, using such data to reveal issues in computational social science has attracted more and more attention. In this paper, we present what is, to the best of our knowledge, the first survey on data-driven computational social science, which primarily focuses on reviewing application domains involving human dynamics. The state-of-the-art research on human dynamics is reviewed from three aspects: individuals, relationships, and collectives. Specifically, the research methodologies used to address research challenges in the aforementioned application domains are summarized. In addition, some important open challenges with respect to both emerging research topics and research methods are discussed.
Rapid health data repository allocation using predictive machine learning
- Authors: Uddin, Ashraf , Stranieri, Andrew , Gondal, Iqbal , Balasubramanian, Venki
- Date: 2020
- Type: Text , Journal article
- Relation: Health Informatics Journal Vol. 26, no. 4 (2020), p. 3009-3036
- Description: Health-related data is stored in a number of repositories that are managed and controlled by different entities. For instance, Electronic Health Records are usually administered by governments. Electronic Medical Records are typically controlled by health care providers, whereas Personal Health Records are managed directly by patients. Recently, Blockchain-based health record systems largely regulated by technology have emerged as another type of repository. Repositories for storing health data differ from one another in cost, level of security, and quality of performance. Not only has the number of repository types increased in recent years, but the quantum of health data to be stored has also increased. For instance, the advent of wearable sensors that capture physiological signs has resulted in exponential growth in digital health data. The increase in the types of repository and the amount of data has driven a need for intelligent processes to select appropriate repositories as data is collected. However, the storage allocation decision is complex and nuanced. The challenges are exacerbated when health data are continuously streamed, as is the case with wearable sensors. Although patients are not always solely responsible for determining which repository should be used, they typically have some input into this decision. Patients can be expected to have idiosyncratic preferences regarding storage decisions depending on their unique contexts. In this paper, we propose a predictive model for the storage of health data that can meet patient needs and make storage decisions rapidly, in real time, even with data streaming from wearable sensors. The model is built with a machine learning classifier that learns the mapping between characteristics of health data and features of storage repositories from a training set generated synthetically from correlations evident from small samples of experts. Results from the evaluation demonstrate the viability of the machine learning technique used. © The Author(s) 2020.
The gene of scientific success
- Authors: Kong, Xiangjie , Zhang, Jun , Zhang, Da , Bu, Yi , Ding, Ying , Xia, Feng
- Date: 2020
- Type: Text , Journal article
- Relation: ACM Transactions on Knowledge Discovery from Data Vol. 14, no. 4 (2020), p.
- Description: This article elaborates on how to identify and evaluate causal factors to improve scientific impact. Currently, analyzing scientific impact can be beneficial to various academic activities, including funding applications, mentor recommendation, and the discovery of potential collaborators. It is widely acknowledged that high-impact scholars often have more opportunities to receive awards as encouragement for their hard work. Therefore, scholars devote great effort to making scientific achievements and improving their scientific impact during their academic life. However, what are the determining factors that control scholars' academic success? The answer to this question can help scholars conduct their research more efficiently. With this in mind, our article presents and analyzes the causal factors that are crucial for scholars' academic success. We first propose five major groups of factors: article-centered, author-centered, venue-centered, institution-centered, and temporal factors. Then, we apply recent advanced machine learning algorithms and the jackknife method to assess the importance of each causal factor. Our empirical results show that author-centered and article-centered factors have the highest relevance to scholars' future success in the computer science area. Additionally, we discover an interesting phenomenon: the h-indices of scholars within the same institution or university are actually very close to each other. © 2020 ACM.