On temporal order invariance for view-invariant action recognition
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2013
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 23, no. 2 (2013), p. 203-211
- Full Text: false
- Reviewed:
- Description: View-invariant action recognition is one of the most challenging problems in computer vision. Various representations are being devised for matching actions across different viewpoints to achieve view invariance. In this paper, we explore the invariance property of the temporal order of action instances during action execution and utilize it to devise a new view-invariant action recognition approach. To ensure temporal order during matching, we utilize spatiotemporal features, feature fusion and a temporal order consistency constraint. We start by extracting spatiotemporal cuboid features from video sequences and applying feature fusion to encapsulate within-class similarity for the same viewpoints. For each action class, we construct a feature fusion table to facilitate feature matching across different views. An action matching score is then calculated based on a global temporal order constraint and the number of matching features. Finally, the action label of the class with the maximum matching score is assigned to the query action. Experimentation is performed on the multiple-view Inria Xmas motion acquisition sequences and West Virginia University action datasets, with encouraging results that are comparable to existing view-invariant action recognition techniques.
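The scoring step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each matched feature is reduced to a hypothetical pair of frame indices `(t_query, t_ref)`, and it stands in for the global temporal order constraint by counting the largest set of matches whose order agrees in both sequences (a longest increasing subsequence). The function names and the input layout are assumptions made for illustration only.

```python
from bisect import bisect_left


def temporal_order_score(matches):
    """Count the largest subset of matches that keeps the same temporal order.

    `matches` is a list of (t_query, t_ref) frame-index pairs (a hypothetical
    input format, not taken from the paper).
    """
    # Sort by query time. Ties are broken by descending reference time so that
    # two matches sharing a query frame can never both be counted.
    ordered = sorted(matches, key=lambda m: (m[0], -m[1]))
    tails = []  # tails[i] = smallest last reference time of a chain of length i + 1
    for _, t_ref in ordered:
        i = bisect_left(tails, t_ref)
        if i == len(tails):
            tails.append(t_ref)
        else:
            tails[i] = t_ref
    return len(tails)


def classify(matches_by_class):
    """Return the class label whose matches give the highest score."""
    return max(matches_by_class, key=lambda c: temporal_order_score(matches_by_class[c]))
```

In this sketch a class whose matched features occur in reversed order scores low even if many features match, which is the intuition behind combining match count with an order constraint. `classify` raises `ValueError` on an empty dictionary.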
On dynamic scene geometry for view-invariant action matching
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2011
- Type: Text , Conference paper
- Relation: 24th IEEE Conference on Computer Vision and Pattern Recognition (CVPR) p. 3305-3312
- Full Text: false
- Reviewed:
- Description: Variation in viewpoints poses significant challenges to action recognition. One popular way of encoding a view-invariant action representation is based on exploiting the epipolar geometry between different views of the same action. Most representative work relies on detecting and tracking landmark points, assuming that motion trajectories for all landmark points on the human body are available throughout the course of an action. Unfortunately, due to occlusion and noise, detection and tracking of these landmarks is not always robust. To work around this, some work assumes that such trajectories are manually marked, which is a clear drawback and forgoes the automation that computer vision is meant to provide. In this paper, we address this problem by proposing a view-invariant action matching score based on the epipolar geometry between actor silhouettes, without tracking or explicit point correspondences. In addition, we explore the multi-body epipolar constraint, which makes it possible to work on the original action volumes without any pre-processing. We show that the multi-body fundamental matrix captures the geometry of dynamic action scenes and helps devise an action matching score across different views without any prior segmentation of actors. Extensive experimentation on challenging view-invariant action datasets shows that our approach not only removes long-standing assumptions but also achieves significant improvement in recognition accuracy and retrieval.
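For readers unfamiliar with the epipolar constraint this abstract builds on, the following minimal sketch evaluates the standard single-body form, x2^T F x1 = 0, for one pair of homogeneous image points. It does not reproduce the paper's silhouette-based or multi-body formulation; the function name and the plain nested-list matrix layout are assumptions chosen to keep the example dependency-free.

```python
def epipolar_residual(F, x1, x2):
    """Return x2^T F x1 for a 3x3 fundamental matrix and two homogeneous points.

    A value of zero means the pair satisfies the epipolar constraint exactly;
    with noisy data the magnitude indicates how badly it is violated.
    """
    # F x1 is the epipolar line in the second view induced by x1.
    line = [sum(F[r][c] * x1[c] for c in range(3)) for r in range(3)]
    # The residual is the dot product of x2 with that line.
    return sum(x2[r] * line[r] for r in range(3))


# Fundamental matrix for a pure horizontal translation between two cameras
# (the rectified stereo case): corresponding points share the same image row.
F_RECTIFIED = [[0, 0, 0],
               [0, 0, -1],
               [0, 1, 0]]
```

With `F_RECTIFIED`, two points on the same row give a residual of zero, while points on different rows give a non-zero value equal to their row difference.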
Panic-driven event detection from surveillance video stream without track and motion features
- Authors: Haque, Mohammad , Murshed, Manzur
- Date: 2010
- Type: Text , Conference paper
- Relation: 2010 IEEE International Conference on Multimedia & Expo p. 173-178
- Full Text: false
- Reviewed:
- Description: Modern surveillance systems are becoming highly automated in terms of scene understanding and event detection capabilities, and most existing methods rely on track- and motion-based features for event classification and anomaly detection. However, trajectory-based methods fail in public scenarios because object tracks are frequently lost, while motion-based methods are limited to detecting direction- and velocity-related anomalies. In this paper, a novel feature extraction and event detection method is presented that uses no track or motion features. Instead, event-discriminating characteristics are discovered from the dynamics of multiple temporal features extracted from foreground blobs and then captured in support vector machine based models for real-time event detection. Experimental results on benchmark datasets show that the proposed method can successfully discriminate panic-driven events such as sudden split, runaway, and fighting from usual events.
On stable dynamic background generation technique using Gaussian mixture models for robust object detection
- Authors: Haque, Mohammad , Murshed, Manzur , Paul, Manoranjan
- Date: 2008
- Type: Text , Conference paper
- Relation: 2008 IEEE Fifth International Conference on Advanced Video and Signal Based Surveillance p. 41-48
- Full Text: false
- Reviewed:
- Description: Gaussian mixture models (GMM) are used to represent the dynamic background in a surveillance video in order to detect moving objects automatically. All existing GMM-based techniques inherently use the proportion by which a pixel is expected to observe the background in a given operating environment. In this paper we first show that such a proportion not only varies widely across different scenarios but also prevents the use of a very fast learning rate. We then propose a dynamic background generation technique, used in conjunction with basic background subtraction, which detects moving objects with improved stability and superior detection quality across a wide range of operating environments in two sets of benchmark surveillance sequences.
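The background proportion mentioned in this abstract can be illustrated with a minimal sketch of how the widely used adaptive GMM formulation picks background components for a single pixel. It assumes that the proportion corresponds to the threshold T in that formulation: components are ranked by weight divided by standard deviation, and the smallest leading group whose cumulative weight exceeds T is treated as background. This shows the conventional step that the paper critiques, not the technique it proposes; the function and parameter names are hypothetical.

```python
def background_component_count(weights, sigmas, T):
    """Return how many top-ranked mixture components model the background.

    weights: mixture weights of one pixel's Gaussians (summing to about 1).
    sigmas:  matching standard deviations.
    T:       assumed minimum proportion of the data explained by the background.
    """
    # Rank components so that heavy, low-variance ones (stable background) come first.
    ranked = sorted(zip(weights, sigmas), key=lambda ws: ws[0] / ws[1], reverse=True)
    total = 0.0
    for count, (w, _) in enumerate(ranked, start=1):
        total += w
        if total > T:
            return count
    # T was never exceeded, so every component counts as background.
    return len(ranked)
```

For example, with weights `[0.6, 0.3, 0.1]` and standard deviations `[2, 4, 10]`, raising T from 0.5 to 0.7 grows the background set from one component to two, which shows how sensitive the segmentation is to a proportion that has to be fixed in advance.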