QMET : A new quality assessment metric for no-reference video coding by using human eye traversal
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2016
- Type: Text , Conference proceedings
- Relation: 2016 International Conference on Image and Vision Computing New Zealand, IVCNZ 2016; Palmerston North, New Zealand; 21st-22nd November 2016 p. 1-6
- Full Text:
- Reviewed:
- Description: Subjective quality assessment (SQA) is an ever more demanded approach due to its close interaction with human cognition. Adding a no-reference scheme could equip SQA techniques to tackle further challenges. The widely used objective metrics, peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM), and the subjective estimator, mean opinion score (MOS), require the original image for quality evaluation, which limits their use when no reference is available. In this work, we present a no-reference SQA technique that could be an impressive substitute for reference-based quality evaluation approaches. The High Efficiency Video Coding (HEVC) reference test model (HM15.0) is first exploited to generate five different qualities of the eight HEVC-recommended classes of sequences. To assess different aspects of coded video quality, a group of ten participants is employed, and their eye-tracker (ET) recorded data demonstrate closer correlation among gaze plots for relatively better quality video contents. Therefore, we calculate the amount of approximation of smooth eye traversal (ASET) using distance, angle, and pupil-size features from the recorded gaze trajectory data and develop a new quality metric based on eye traversal (QMET). Experimental results show that the quality evaluation carried out by QMET is highly correlated with the HM-recommended coding quality. The performance of QMET is also compared with the PSNR and SSIM metrics to assess their relative effectiveness.
- Description: International Conference Image and Vision Computing New Zealand
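The abstract above names distance, angle, and pupil-size as the gaze-trajectory features behind ASET but does not give a formula. The following is a minimal Python sketch, under that reading, of how per-step features could be extracted and folded into a smoothness score; all function names and the weighting scheme are hypothetical, not the paper's actual method:

```python
import math

def aset_features(gaze, pupil):
    """Per-step distance, turn angle, and pupil-size change along a gaze trajectory.

    gaze: list of (x, y) fixation points; pupil: matching list of pupil sizes.
    All feature choices here are illustrative assumptions, not the published QMET.
    """
    feats = []
    for i in range(1, len(gaze) - 1):
        (x0, y0), (x1, y1), (x2, y2) = gaze[i - 1], gaze[i], gaze[i + 1]
        d = math.hypot(x1 - x0, y1 - y0)                # saccade length
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        # wrapped absolute direction change in [0, pi]
        turn = abs((a2 - a1 + math.pi) % (2 * math.pi) - math.pi)
        dp = abs(pupil[i] - pupil[i - 1])               # pupil-size variation
        feats.append((d, turn, dp))
    return feats

def smoothness_score(feats, wd=1.0, wa=1.0, wp=1.0):
    """Lower average variation implies smoother traversal, hence a higher score."""
    if not feats:
        return 0.0
    cost = sum(wd * d + wa * t + wp * p for d, t, p in feats) / len(feats)
    return 1.0 / (1.0 + cost)
```

Under this sketch, a straight, steady scanpath scores higher than a zigzag one, matching the abstract's observation that better quality content yields smoother traversal.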
An algorithm for network and data-aware placement of multi-tier applications in cloud data centers
- Authors: Ferdaus, Md Hasanul , Murshed, Manzur , Calheiros, Rodrigo , Buyya, Rajkumar
- Date: 2017
- Type: Text , Journal article
- Relation: Journal of Network and Computer Applications Vol. 98, no. (2017), p. 65-83
- Full Text: false
- Reviewed:
- Description: Today's Cloud applications are dominated by composite applications comprising multiple computing and data components with strong communication correlations among them. Although Cloud providers are deploying large numbers of computing and storage devices to address the ever increasing demand for computing and storage resources, network resource demands are emerging as one of the key areas of performance bottleneck. This paper addresses network-aware placement of virtual components (computing and data) of multi-tier applications in data centers and formally defines the placement as an optimization problem. The simultaneous placement of Virtual Machines and data blocks aims at reducing the network overhead of the data center network infrastructure. A greedy heuristic is proposed for on-demand application component placement that localizes network traffic in the data center interconnect. Such optimization helps reduce communication overhead in upper-layer network switches, which will eventually reduce the overall traffic volume across the data center. This, in turn, will help reduce packet transmission delay, increase network performance, and minimize the energy consumption of network components. Experimental results demonstrate the performance superiority of the proposed algorithm, which outperforms the state-of-the-art network-aware application placement algorithm across all performance metrics by reducing the average network cost by up to 67% and network usage at core switches by up to 84%, as well as increasing the average number of application deployments by up to 18%. © 2017 Elsevier Ltd
An analysis of human engagement behaviour using descriptors from human feedback, eye tracking, and saliency modelling
- Authors: Podder, Pallab , Paul, Manoranjan , Debnath, Tanmoy , Murshed, Manzur
- Date: 2015
- Type: Text , Conference proceedings
- Relation: 2015 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2015, Adelaide, 23-25th Nov 2015 in Digital Image Computing: Techniques and Applications (DICTA), 2015 International Conference
- Relation: http://purl.org/au-research/grants/arc/DP130103670
- Full Text: false
- Reviewed:
- Description: In this paper, an analysis of human engagement behaviour with video is presented, based on real-life experiments. An engagement model could be employed in classroom education, in enhancing programming skills, in reading, etc. Two groups of people, independent of one another, watched eighteen video clips separately at different times. The first group's participants' eye gaze locations, right and left pupil sizes, and eye blinking patterns were recorded by a state-of-the-art Tobii eye tracker. The second group, consisting of video experts, gave opinions about the most significant attention points of the videos. A well-known bottom-up visual saliency model, Graph-Based Visual Saliency (GBVS), is also utilized to create salient points for the videos. Taking all the above-mentioned descriptors into consideration, the introduced behaviour analysis demonstrates the level of participants' concentration on the videos.
Foreground motion and spatial saliency-based efficient HEVC Video Coding
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2015
- Type: Text , Conference paper
- Relation: 2015 International Conference on Image and Vision Computing New Zealand (IVCNZ)
- Full Text: false
- Reviewed:
- Description: High Efficiency Video Coding (HEVC) cannot provide real-time facilities to electronic devices with limited processing power and battery life, as its encoding time complexity has increased several-fold compared to its predecessor. Numerous researchers have addressed this limitation by reducing the number of motion estimation (ME) modes, analysing homogeneity, residuals, and statistical correlation among different modes. Although these approaches save some encoding time, they cannot reach similar rate-distortion (RD) performance to the HEVC encoder, as they merely depend on the existing Lagrangian cost function (LCF) within the HEVC framework. To overcome this limitation, in this paper we capture visually attentive foreground motion and salient regions (FMSR), which are sensitive to the human visual system for quality assessment. The FMSR features, captured by visual attention and dynamic background modeling, are adaptively synthesized to determine a subset of candidate modes. This preprocessing phase is independent of the LCF. Since the proposed technique avoids exhaustive exploration of all modes with simple criteria, it reduces encoding time by 27% on average. With efficient selection of FMSR-based appropriate block-partitioning modes, it can also improve peak signal-to-noise ratio (PSNR) by up to 1.0 dB.
Abnormal event detection in unseen scenarios
- Authors: Haque, Mahfuzul , Murshed, Manzur
- Date: 2012
- Type: Text , Conference proceedings
- Relation: 2012 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), Melbourne, 9-13th July, 2012. pg 1-6
- Full Text: false
- Reviewed:
- Description: Event detection in unseen scenarios is a challenging problem due to the high variability of scene type, viewing direction, nature of scene entities, and environmental conditions. Existing event detection approaches mostly rely on context-specific tuning and training. Consequently, these techniques fail to achieve high scalability in a large surveillance network with hundreds of video feeds, where scenario-specific tuning/training is impossible. In this paper, we present a generic event detection approach where the extracted low-level features represent the global characteristics of the target scene instead of any context-specific information. From the temporal evolution of these context-invariant features over a timeframe, a fixed number of temporal features are extracted based on the periodicity of significant transition points and associated temporal orders. Finally, top-ranked temporal features are used to train binary classifier-based event models. In this approach, supervised training and exhaustive feature extraction are required only once while building the target event models. During real-time operation in unseen scenarios, event detection is performed based on the trained event models by extracting only the required features. The proposed approach has been demonstrated for abnormal event detection in completely unseen public-place scenarios from benchmark datasets without additional training and tuning, and has also outperformed a recent optical-flow-based event detection technique.
A novel depth edge prioritization based coding technique to boost-UP HEVC performance
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2016
- Type: Text , Conference paper
- Relation: 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)
- Full Text: false
- Reviewed:
- Description: In addition to texture, multiview video employs depth coding for the reconstruction of 3D video and free-viewpoint video. Exploiting texture-depth correlations, a number of methods in the literature reuse texture motion vectors for the corresponding depth coding to reduce encoding time by avoiding the costly motion estimation process. However, the texture similarity metric is not always equivalent to the corresponding depth similarity metric, especially at edges. Since these approaches cannot explicitly detect and encode acute edge motions of depth objects, they eventually cannot reach similar or improved rate-distortion (RD) performance against the High Efficiency Video Coding (HEVC) reference test model (HM). With a view to more accurate motion detection and modeling, the proposed technique exploits an extra Pattern Mode comprising a group of pattern templates (GPTs) with different rectangular and non-rectangular object shapes and edges compared to the existing HEVC block-partitioning modes. Moreover, the proposed Pattern Mode only encodes the motion areas and skips the background areas. The experimental results show that the proposed technique saves 30% encoding time and improves Bjøntegaard delta peak signal-to-noise ratio (BD-PSNR) by 0.1 dB on average compared to the HM.
Impact on vertical handoff decision algorithm by the network call admission control policy in heterogeneous wireless networks
- Authors: Sharna, Shusmita , Murshed, Manzur
- Date: 2012
- Type: Text , Conference proceedings
- Relation: 2012 IEEE 23rd International Symposium on Personal, Indoor and Mobile Radio Communications, Sydney, Sept. 9th-12th 2012, pp.893-898
- Full Text: false
- Reviewed:
- Description: Vertical handoff plays an important role in providing seamless connectivity for a mobile user in an overlapped multi-network environment. On the other hand, in order to maintain network stability, efficient management of the available radio resources becomes crucial, as network operators want high network utilization and maximum profit generation. For vertical handoff management, existing research has considered the user-centric vertical handoff decision algorithm and the network-centric call admission control as two isolated decision mechanisms in the heterogeneous wireless environment. In this paper, however, we propose a correlation between vertical handoff decisions and call admission control policies. We have developed a novel vertical handoff decision model using the Markov decision process based vertical handoff decision algorithm, refining the optimality criterion to factor in the probabilistic consequence of the call dropping rates, so that the mobile-centric vertical handoff decision algorithm and the network-centric call admission control can work through a feedback mechanism to maximize their respective objectives in synergy.
Workload-aware incremental repartitioning of shared-nothing distributed databases for scalable OLTP applications
- Authors: Kamal, Joarder , Murshed, Manzur , Buyya, Rajkumar
- Date: 2016
- Type: Text , Journal article
- Relation: Future Generation Computer Systems Vol. 56, no. March (2016), p. 421-436
- Full Text: false
- Reviewed:
- Description: On-line Transaction Processing (OLTP) applications often rely on shared-nothing distributed databases that can sustain rapid growth in data volume. Distributed transactions (DTs) that involve data tuples from multiple geo-distributed servers can adversely impact the performance of such databases, especially when the transactions are short-lived and require immediate responses. The k-way min-cut graph clustering based database repartitioning algorithms can be used to reduce the number of DTs with an acceptable level of load balancing. For Web applications, where the DT profile changes over time due to dynamically varying workload patterns, frequent database repartitioning is needed to keep up with the change. This paper addresses this emerging challenge by introducing incremental repartitioning. In each repartitioning cycle, the DT profile is learnt online and the k-way min-cut clustering algorithm is applied on a special sub-graph representing all DTs as well as those non-DTs that have at least one tuple in a DT. The latter ensures that the min-cut algorithm minimally reintroduces new DTs from the non-DTs while maximally transforming existing DTs into non-DTs in the new partitioning. The potential load imbalance risk is mitigated by applying the graph clustering algorithm on the finer logical partitions instead of the servers and relying on random one-to-one cluster-to-partition mapping that naturally balances out loads. Inter-server data migration due to repartitioning is kept in check with two special mappings favouring the current partition of the majority of tuples in a cluster: the many-to-one version minimises data migration alone, and the one-to-one version reduces data migration without affecting load balancing. A distributed data lookup process, inspired by the roaming protocol in mobile networks, is introduced to efficiently handle data migration without affecting scalability.
The effectiveness of the proposed framework is evaluated on realistic TPC-C workloads comprehensively using graph, hypergraph, and compressed hypergraph representations used in the literature. To compare the performance of any incremental repartitioning framework without any bias of the external min-cut algorithm due to graph size variations, a transaction generation model is developed that can maintain a target number of unique transactions in any arbitrary observation window, irrespective of new transaction arrival rate. The overall impact of DTs at any instance is estimated from the exponential moving average of the recurrence period of unique transactions to avoid transient fluctuations. The effectiveness and adaptability of the proposed incremental repartitioning framework for transactional workloads have been established with extensive simulations on both range partitioned and consistent hash partitioned databases. © 2015 Elsevier B.V.
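The abstract above estimates a transaction's impact from the exponential moving average (EMA) of its recurrence period. A minimal Python sketch of that bookkeeping, with hypothetical names and an assumed EMA smoothing factor (the paper's exact formulation is not reproduced here):

```python
class RecurrenceTracker:
    """Track per-transaction recurrence periods with an exponential moving average.

    A smaller smoothed period means the transaction recurs frequently, so its
    current impact on partitioning is higher. Names and alpha are illustrative.
    """

    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.last_seen = {}    # txn_id -> time of last observation
        self.ema_period = {}   # txn_id -> smoothed recurrence period

    def observe(self, txn_id, t):
        if txn_id in self.last_seen:
            period = t - self.last_seen[txn_id]
            prev = self.ema_period.get(txn_id, period)
            # EMA damps transient fluctuations in the recurrence period
            self.ema_period[txn_id] = self.alpha * period + (1 - self.alpha) * prev
        self.last_seen[txn_id] = t

    def impact(self, txn_id):
        """Inverse of the smoothed period; 0.0 for unseen or once-seen transactions."""
        p = self.ema_period.get(txn_id)
        return 0.0 if p is None or p <= 0 else 1.0 / p
```

A transaction observed every 10 time units settles at a smoothed period of 10 and an impact of 0.1 under this sketch.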
An enhanced-MDP based vertical handoff algorithm for QoS support over heterogeneous wireless networks
- Authors: Sharna, Shusmita , Amin, Mohammad , Murshed, Manzur
- Date: 2011
- Type: Text , Conference proceedings
- Relation: Proceedings of 2011 IEEE International Symposium on Network Computing and Applications (NCA 2011),Cambridge, MA, 25-27th Aug, 2011
- Full Text: false
- Reviewed:
- Description: Vertical handoff plays an important role in guaranteeing that users are always connected in an overlapped multi-network environment. During the vertical handoff procedure, the handoff decision is the most important step affecting the normal working of communication. An incorrect handoff decision or the selection of a non-optimal network may result in undesirable effects such as higher costs, poor quality of service (QoS) experience, and even dropped communication. Among the existing vertical handoff decision algorithms, the Markov Decision Process (MDP) based algorithm by Stevens-Navarro et al. is promising due to its ability to achieve the optimal expected reward. However, the reward function used by this algorithm is flawed, as it favors reducing the expected number of vertical handoffs at the expense of diminished expected values of other QoS parameters. This paper presents an extended MDP based algorithm (EMDP) with a novel reward function formulation. Analysis shows that EMDP outperforms the MDP based algorithm in terms of improved expected values of all QoS parameters considered, while keeping the number of vertical handoffs reasonably low.
Cuboid segmentation for effective image retrieval
- Authors: Murshed, Manzur , Teng, Shyh , Lu, Guojun
- Date: 2017
- Type: Text , Conference proceedings
- Relation: 2017 International Conference on Digital Image Computing : Techniques and Applications (DICTA); Sydney, Australia; 29th November-1st December 2017 p. 884-891
- Full Text: false
- Reviewed:
- Description: Region-based image retrieval has been proven to be effective in finding relevant images. In this paper, we propose a cuboid image segmentation method which produces rectangular image partitions. Rectangular partitions are more suitable for image compression, retrieval, and other image operations. We apply the partitions to image retrieval in this paper. Our experimental results have shown that (1) the proposed partitioning method is effective in segmenting images into meaningful rectangles; (2) using colour partitions for image retrieval is more effective than using whole images; and (3) the partitioned approach has the additional advantage of letting users select certain objects/colours as queries to find more relevant images/objects. These three advantages could be important in crime-scene investigation image indexing and retrieval. Moreover, the proposed technique is amenable to compressed-domain applications.
Analytical modeling of enhanced IEEE 802.11 with multiuser dynamic OFDMA under saturation load
- Authors: Ferdous, Hasan , Murshed, Manzur
- Date: 2011
- Type: Text , Conference paper
- Relation: 17th Asia-Pacific Conference on Communications, APCC 2011 p. 524-529
- Full Text: false
- Reviewed:
- Description: Multiuser dynamic OFDMA based IEEE 802.11 distributed coordination function (DCF) has received significant interest from researchers in recent times. Though several proposals have been made, to the best of our knowledge, none of them has presented an analytical model for this kind of medium access control protocol for IEEE 802.11. This paper provides a simple yet very accurate analytical model to estimate the performance characteristics of IEEE 802.11 DCF with OFDMA under the assumptions of ideal channel conditions and saturation load. Our model accounts for important system parameters such as throughput, collision rate, transmission delay, average contention window size, average retry count, and average time wasted in backoff. The analytical results are verified through extensive simulations.
Lossless depth map coding using binary tree based decomposition and context-based arithmetic coding
- Authors: Shahriyar, Shampa , Murshed, Manzur , Ali, Mortuza , Paul, Manoranjan
- Date: 2016
- Type: Text , Conference proceedings , Conference paper
- Relation: 2016 IEEE International Conference on Multimedia and Expo, ICME 2016; Seattle, United States; 11th-15th July 2016; published in Proceedings of the 2016 IEEE International Conference on Mulitmedia and Expo Vol. 2016-August, p. 1-6
- Full Text: false
- Reviewed:
- Description: Depth maps are becoming increasingly important in the context of emerging video coding and processing applications. Depth images represent the scene surface and are characterized by areas of smoothly varying grey levels separated by sharp edges at the positions of object boundaries. To enable high-quality view rendering at the receiver side, preservation of these characteristics is important. Lossless coding avoids rendering artifacts in synthesized views caused by depth compression. In this paper, we propose a binary tree based lossless depth coding scheme that arranges the residual frame into an integer or binary residual bitmap. High spatial correlation in the depth residual frame is exploited by creating large homogeneous blocks of adaptive size, which are then coded as a unit using context-based arithmetic coding. On the standard 3D video sequences, the proposed lossless depth coding achieves compression ratios in the range of 20 to 80. © 2016 IEEE.
- Description: Proceedings - IEEE International Conference on Multimedia and Expo
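The abstract above describes decomposing a residual bitmap into large homogeneous blocks via binary splitting. A minimal Python sketch of one plausible such decomposition (recursive binary split along the longer side until each block is uniform); the function name and split rule are assumptions, not the paper's exact scheme:

```python
def decompose(bitmap, r0, c0, h, w, out):
    """Recursively split a residual bitmap into maximal homogeneous blocks.

    Appends (row, col, height, width, value) leaves to `out`. A real codec
    would then entropy-code each leaf as a unit; here we only build the tree.
    """
    values = {bitmap[r][c] for r in range(r0, r0 + h) for c in range(c0, c0 + w)}
    if len(values) == 1:
        out.append((r0, c0, h, w, values.pop()))
        return
    if h >= w:
        half = h // 2                      # split the longer side first
        decompose(bitmap, r0, c0, half, w, out)
        decompose(bitmap, r0 + half, c0, h - half, w, out)
    else:
        half = w // 2
        decompose(bitmap, r0, c0, h, half, out)
        decompose(bitmap, r0, c0 + half, h, w - half, out)
```

For a 4x4 bitmap whose top half is all zeros and bottom half all ones, this yields just two 2x4 homogeneous leaves, illustrating how spatial correlation keeps the block count low.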
From Tf-Idf to learning-to-rank : An overview
- Authors: Ibrahim, Yousef , Murshed, Manzur
- Date: 2015
- Type: Text , Book chapter
- Relation: Handbook of research on innovations in information retrieval, analysis, and management Chapter 3 p. 62-109
- Full Text: false
- Reviewed:
- Description: Ranking a set of documents based on their relevance with respect to a given query is a central problem of information retrieval (IR). Traditionally, people have used unsupervised scoring methods such as tf-idf, BM25, and the Language Model, but recently the supervised machine learning framework has been used successfully to learn a ranking function; this is called the learning-to-rank (LtR) problem. There are a few surveys on LtR in the literature, but these reviews provide very little assistance to someone who, before delving into the technical details of different algorithms, wants a broad understanding of LtR systems and their evolution from, and relation to, the traditional IR methods. This chapter tries to address this gap in the literature. Mainly the following aspects are discussed: the fundamental concepts of IR, the motivation behind LtR, the evolution of LtR from and its relation to the traditional methods, the relationship between LtR and other supervised machine learning tasks, the general issues pertaining to an LtR algorithm, and the theory of LtR. © 2016 by IGI Global. All rights reserved.
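The unsupervised tf-idf scoring mentioned in the abstract above can be sketched in a few lines of Python. This uses the common raw-count tf and log(N/df) idf variant; the function name and the choice of variant are assumptions, since the chapter surveys several weighting schemes:

```python
import math
from collections import Counter

def tf_idf_score(query_terms, doc, corpus):
    """Sum of tf-idf weights of the query terms in one document.

    tf: raw term count in `doc`; idf: log(N / df), where N is the corpus size
    and df the number of documents containing the term.
    """
    n_docs = len(corpus)
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue  # unseen terms contribute nothing
        score += tf[term] * math.log(n_docs / df)
    return score
```

A term occurring in every document gets idf 0 and thus no weight, which is exactly the discriminative behaviour an LtR system must learn implicitly from such hand-crafted features.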
Video coding using arbitrarily shaped block partitions in globally optimal perspective
- Authors: Paul, Manoranjan , Murshed, Manzur
- Date: 2011
- Type: Text , Journal article
- Relation: EURASIP Journal on Advances in Signal Processing Vol. 16, no. (2011), p.
- Full Text:
- Reviewed:
- Description: Algorithms using content-based patterns to segment moving regions at the macroblock (MB) level have exhibited good potential for improved coding efficiency when embedded into the H.264 standard as an extra mode. The content-based pattern generation (CPG) algorithm provides a locally optimal result, as only one pattern can be optimally generated from a given set of moving regions; it fails to provide optimal results for multiple patterns over entire sets. Obviously, a globally optimal solution that clusters the set and then generates multiple patterns would enhance the performance further, but such a solution is not achievable due to the non-polynomial nature of the clustering problem. In this paper, we propose a near-optimal content-based pattern generation (OCPG) algorithm which outperforms the existing approach. Coupling OCPG, which generates a set of patterns after clustering the MBs into several disjoint sets, with a direct pattern selection algorithm that allows all the MBs in multiple pattern modes outperforms the existing pattern-based coding when embedded into H.264.
Fast mode decision in the HEVC Video coding standard by exploiting region with dominated motion and saliency features
- Authors: Podder, Pallab , Paul, Manoranjan , Murshed, Manzur
- Date: 2012
- Type: Text , Journal article
- Relation: PLoS ONE Vol. 11, no. 3 (2012), p. e0150673
- Full Text:
- Reviewed:
- Description: The emerging High Efficiency Video Coding (HEVC) standard introduces a number of innovative and powerful coding tools to achieve better compression efficiency compared to its predecessor H.264. However, encoding time complexity has also increased several-fold, which is not suitable for real-time video coding applications. To address this limitation, this paper employs a novel coding strategy to reduce the time complexity of the HEVC encoder by efficient selection of appropriate block-partitioning modes based on human visual features (HVF). The HVF in the proposed technique comprise a human visual attention modelling-based saliency feature and phase correlation-based motion features. The features are combined through a fusion process by developing a content-based adaptive weighted cost function to determine the region with dominated motion/saliency (RDMS)-based binary pattern for the current block. The generated binary pattern is then compared with a codebook of predefined binary pattern templates aligned to the HEVC-recommended block partitioning to estimate a subset of inter-prediction modes. Without exhaustive exploration of all modes available in the HEVC standard, only the selected subset of modes is motion estimated and motion compensated for a particular coding unit. The experimental evaluation reveals that the proposed technique notably reduces the average computational time of the latest HEVC reference encoder by 34% while providing similar rate-distortion (RD) performance for a wide range of video sequences.
Cuboid colour image segmentation using intuitive distance measure
- Authors: Tania, Sheikh , Murshed, Manzur , Teng, Shyh , Karmakar, Gour
- Date: 2018
- Type: Text , Conference proceedings
- Relation: 2018 International Conference on Image and Vision Computing New Zealand, IVCNZ 2018; Auckland, New Zealand; 19th-21st November 2018 Vol. 2018-November, p. 1-6
- Full Text:
- Reviewed:
- Description: In this paper, an improved algorithm for cuboid image segmentation is proposed. To address the two main limitations of the recently proposed cuboid segmentation algorithm, the improved algorithm substitutes colour quantization in the HCL colour space with an infinity-norm distance in the RGB colour space, along with a different way of imposing area thresholding. We also propose a new metric to evaluate the quality of segmentation. Experimental results show that the proposed cuboid segmentation algorithm significantly outperforms the existing one in terms of segmentation quality.
- Description: International Conference Image and Vision Computing New Zealand
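The infinity-norm RGB distance named in the abstract above is simply the Chebyshev distance between two colour vectors. A minimal sketch (the function name is hypothetical; the paper's thresholding around this distance is not reproduced):

```python
def linf_distance(c1, c2):
    """Infinity-norm (Chebyshev) distance between two RGB colours:
    the largest per-channel absolute difference."""
    return max(abs(a - b) for a, b in zip(c1, c2))
```

Because it bounds every channel's deviation at once, a single small threshold on this distance guarantees that no channel drifts far within a segment, which is cheaper than quantizing in a perceptual space such as HCL.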
Adaptive weighted non-parametric background model for efficient video coding
- Authors: Chakraborty, Subrata , Paul, Manoranjan , Murshed, Manzur , Ali, Mortuza
- Date: 2017
- Type: Text , Journal article
- Relation: Neurocomputing Vol. 226, no. (2017), p. 35-45
- Full Text:
- Reviewed:
- Description: Dynamic background frame based video coding using mixture of Gaussians (MoG) based background modelling has achieved better rate-distortion performance compared to the H.264 standard. However, such methods suffer from high computation time, low coding efficiency for dynamic videos, and the requirement of prior knowledge of video content. In this paper, we introduce the application of the non-parametric (NP) background modelling approach to the video coding domain. We present a novel background modelling technique, called weighted non-parametric (WNP), which balances the historical trend and the recent value of the pixel intensities adaptively, based on the content and characteristics of any particular video. WNP is successfully embedded into the latest HEVC video coding standard for better rate-distortion performance. Moreover, a novel scene adaptive non-parametric (SANP) technique is also developed to handle video sequences with highly dynamic backgrounds. Being non-parametric, the proposed techniques naturally exhibit superior performance in dynamic background modelling without a priori knowledge of the video data distribution.
Contextual action recognition in multi-sensor nighttime video sequences
- Authors: Anwaar-Ul, Haq , Gondal, Iqbal , Murshed, Manzur
- Date: 2011
- Type: Text , Conference paper
- Relation: Proceedings of the 2011 Digital Image Computing: Techniques and Applications (DICTA 2011), Noosa 6th-8th Dec, 2011 p. 256-261
- Full Text: false
- Reviewed:
- Description: Contextual information is important for interpreting human actions, especially when actions exhibit an interactive relationship with their context. Contextual clues become even more crucial when videos are captured in unfavorable conditions such as extremely low-light nighttime scenarios. These conditions encourage the use of multi-sensor imagery and context enhancement. In this paper, we explore the importance of contextual knowledge for recognizing human actions in multi-sensor nighttime videos. Information fusion is utilized for encapsulating visual information about actions and their context. Space-time action information is captured using the 3D Fourier transform of the fused action silhouette volume. In parallel, SIFT context images are extracted and fused using principal component analysis based feature fusion for each action class. Contextual dissimilarity is penalized by minimizing context SIFT flow energy. The action dataset comprises multi-sensor night-vision video data from the infra-red and visible spectrum. Experimental results show that fused contextual action information boosts action recognition performance compared to the baseline action recognition approach.
Range-free passive localization using static and mobile sensors
- Authors: Iqbal, Anindya , Murshed, Manzur
- Date: 2012
- Type: Text , Conference proceedings
- Relation: 2012 IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), San Francisco, CA, 25th-28th June, 2012 p. 1-6
- Full Text: false
- Reviewed:
- Description: In passive localization, sensors try to locate an event without any knowledge of the event's emitted power, making it a more challenging problem than active localization. Existing passive localization schemes use expensive and noise-vulnerable range-based techniques. In this paper, we propose, to the best of our knowledge for the first time, a cost-effective range-free passive localization scheme exploiting a hybrid sensor network model where mobile sensors are deployed on demand once an event is sensed by a static sensor. Efficient use of mobile sensors leads to two concomitant optimization problems: (1) positioning the mobile sensors so that the expected possible event location area is minimized; and (2) minimizing their overall traversed distance. To solve the first problem, we have developed a novel arc-coding based range-free localization technique that can accurately define the area of a possible event location from the feedback of arbitrarily placed sensors, without relying on expensive hardware to estimate the range of signals. We have achieved significantly high localization accuracy with a low number of mobile sensors, even after considering significant environmental noise. To solve the second problem, three alternative deployment strategies for the mobile sensors were simulated to recommend the best.
Action recognition using spatio-temporal distance classifier correlation filter
- Authors: Anwaar-Ul Haq , Gondal, Iqbal , Murshed, Manzur
- Date: 2011
- Type: Text , Conference proceedings
- Relation: 2011 International Conference on Digital Image Computing Techniques and Applications (DICTA), Noosa, QLD, 6th-8th Dec, 2011
- Full Text: false
- Reviewed:
- Description: The problem of recognizing human actions is characterized by complex dynamics and strong variations in their executions. Despite this, space-time correlations provide valuable clues for their discrimination. Therefore, space-time correlators such as Maximum Average Correlation Height (MACH) filters have successfully been used for action recognition with encouraging results. However, their utility is challenged by a number of factors: (i) these filters are trained only for one class at a time, and separate filters are required for each class, increasing computational overhead; (ii) these filters simply take the average of similar action instances and behave no better than average filters; and (iii) misaligned action datasets create problems for these filters as they are not shift-invariant. In this paper, we address these issues by posing action recognition as a multi-class discrimination problem and propose a single 3D frequency-domain filter, named Action ST-DCCF, for multiple action classes that mitigates the inherent discrepancies of correlation filters. It presents a different interpretation of correlation filters as a method of applying a spatio-temporal transformation to the data rather than simply minimizing correlation energy across all possible shifts. Experiments on a variety of action datasets are performed to evaluate our approach. Experimental results are comparable to the existing action recognition approaches.