A learning-based approach for fault tolerance on grid resources scheduling
- Authors: Karimi, Mohammadbager , Bouyer, Asgarali , Mohebi, Ehsan , Rajabalipour, Hossein
- Date: 2009
- Type: Text , Conference proceedings
- Relation: 2009 5th IEEE GCC Conference and Exhibition, GCC 2009; Kuwait City; Kuwait; 17th-19th March published in 2009
- Full Text: false
- Reviewed:
- Description: Although Grid environments have developed rapidly, fault tolerance has received little attention in Grid resource management. At the same time, the cost of Grid computing matters because the Grid is an economy-based system, and most organizations intend to spend little on their own Grid computations. A better approach to resource scheduling that avoids faults is therefore necessary. This paper presents a new approach to fault tolerance mechanisms for resource scheduling on the Grid, using the Case-Based Reasoning technique in a local fashion. The approach applies a specific structure to provide fault tolerance between executor nodes, keeping the system in a safe state with minimal data transfer. The algorithm increases confidence in fault tolerance, and the performance of the Grid is therefore expected to be high.
Fault-tolerant data aggregation scheme for monitoring of critical events in grid based healthcare sensor networks
- Authors: Saeed, Ather , Stranieri, Andrew , Dazeley, Richard
- Date: 2011
- Type: Text , Conference paper
- Relation: Paper presented at the 19th High Performance Computing Symposium (HPC 2011), part of the SCS Spring Simulation Multiconference (SpringSim'11)
- Full Text:
- Reviewed:
- Description: Wireless sensor devices are used for monitoring patients with serious medical conditions. Communication of content-sensitive and context-sensitive datasets is crucial for the survival of patients, so that informed decisions can be made. The main limitation of sensor devices is that they rely on a fixed threshold to notify the relevant Healthcare Professional (HP) about the seriousness of a patient’s current state. Further, these sensor devices have limited processing power, memory, and battery capacity. A new grid-based information monitoring architecture is proposed to address the issues of data loss and timely dissemination of critical information to the relevant HP. The proposed approach provides an opportunity to efficiently aggregate datasets of interest by reducing network overhead and minimizing data latency. To narrow down the problem domain, in-network processing of datasets with Grid monitoring capabilities is proposed for the efficient execution of computation-, resource- and data-intensive tasks. Interactive wireless sensor networks do not guarantee that data gathered from heterogeneous sources will always arrive at the sink (base) node, but the proposed aggregation technique provides a fault-tolerant solution for the timely notification of a patient’s critical state. The experimental results are encouraging and clearly show a reduction in network latency.
SMOaaS: a Scalable Matrix Operation as a Service model in Cloud
- Authors: Ujjwal, K. C. , Battula, Sudheer , Garg, Saurabh , Naha, Ranesh , Patwary, Md Anwarul , Brown, Alexander
- Date: 2021
- Type: Text , Journal article
- Relation: The Journal of supercomputing Vol. 77, no. 4 (2021), p. 3381-3401
- Full Text: false
- Reviewed:
- Description: Matrix operations are fundamental to a wide range of scientific applications such as Graph Theory, Linear Equation Systems, Image Processing, Geometric Optics, and Probability Analysis. As the workload in these applications has increased, the sizes of the matrices involved have also increased significantly. Parallel execution of matrix operations in existing cluster-based systems performs effectively for relatively small matrices but suffers significantly as matrices become larger, due to limited resources. Cloud Computing offers scalable resources to handle this limitation; however, the benefits of having access to almost-infinite scalable resources in the Cloud also come with the challenge of ensuring time- and resource-efficient matrix operations. To the best of our knowledge, there is no specific Cloud service that optimizes the efficiency of matrix operations on Cloud infrastructure. To address this gap and offer a convenient service for matrix operations, the paper proposes a novel scalable service framework called Scalable Matrix Operation as a Service. Our framework uses Dynamic Matrix Partition techniques, based on the matrix operation and matrix sizes, to achieve efficient work distribution, and scales on demand to achieve time- and resource-efficient operations. The framework also embraces the basic features of security, fault tolerance, and reliability. Experimental results show that the adopted dynamic partitioning technique delivers faster and better performance than the existing static partitioning technique.