Scarf : Semi-automatic colorization and reliable image fusion
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2010
- Type: Text , Conference paper
- Relation: 2010 Digital Image Computing: Techniques and Applications p. 435-440
- Full Text: false
- Reviewed:
- Description: Nighttime imagery poses significant challenges for enhancement due to the loss of color information and the inability of a single sensor to capture complete visual information at night. To cope with this challenge, multiple sensors are used to capture reliable nighttime imagery, which in turn places additional demands on reliable visual information fusion. In this paper, we present a system, Scarf, which performs reliable image fusion using advanced feature extraction techniques and a novel semi-automatic colorization based on optimization consistent with the human visual system. Subjective and objective quality evaluation proves the effectiveness of the proposed system.
VSAMS : Video stabilization approach for multiple sensors
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2010
- Type: Text , Conference proceedings
- Relation: 2010 International Conference on Digital Image Computing: Techniques and Applications, Dec. 2010, pp.411-416
- Full Text: false
- Description: Video stabilization is often considered a largely solved problem, but several related issues still need research attention. One such issue arises when multiple unstable video streams come from multiple sensors that often carry complementary information. To enhance system performance, instability should be removed in a single pass rather than by stabilizing each sensor individually. This paper proposes a cooperative video stabilization framework, VSAMS, for multi-sensor aerial data based on robust boosting curves, which encapsulate the stability of high spatial frequency information as used by flying parakeets (budgerigars). A multistage smoothing approach is devised to reduce shake and jitter while preserving the actual camera path. Experiments are performed on multi-sensor UAV data containing infrared and electro-optical video streams. Subjective and objective quality evaluation proves the effectiveness of the proposed cooperative stabilization framework.
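The multistage smoothing idea in the VSAMS abstract can be illustrated with a minimal sketch: treat the camera path as a 1-D list of per-frame offsets and apply several passes of a moving-average filter, then compensate each frame by the residual. The function names, window size, and sample data below are illustrative assumptions, not details from the paper.

```python
def moving_average(path, window=3):
    """One smoothing pass: average each sample with its neighbours."""
    half = window // 2
    smoothed = []
    for i in range(len(path)):
        lo, hi = max(0, i - half), min(len(path), i + half + 1)
        smoothed.append(sum(path[lo:hi]) / (hi - lo))
    return smoothed

def multistage_smooth(path, stages=2, window=3):
    """Apply several smoothing stages in sequence; the residual
    (raw minus smoothed) is the per-frame shake/jitter to remove."""
    for _ in range(stages):
        path = moving_average(path, window)
    return path

# Hypothetical jittery horizontal camera offsets, one per frame.
raw = [0.0, 2.0, -1.0, 3.0, 0.0, 2.0, -1.0, 3.0]
smooth = multistage_smooth(raw)
jitter = [r - s for r, s in zip(raw, smooth)]
```

Each extra stage trades responsiveness to intentional camera motion for stronger jitter suppression, which is why a multistage design separates gentle path preservation from aggressive shake removal.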
On dynamic scene geometry for view-invariant action matching
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2011
- Type: Text , Conference paper
- Relation: 24th IEEE Conference on Computer Vision and Pattern Recognition (CVPR) p. 3305-3312
- Full Text: false
- Reviewed:
- Description: Variation in viewpoints poses significant challenges to action recognition. One popular way of encoding a view-invariant action representation is to exploit the epipolar geometry between different views of the same action. The majority of representative work relies on detecting landmark points and tracking them, assuming that motion trajectories for all landmark points on the human body are available throughout the course of an action. Unfortunately, due to occlusion and noise, detection and tracking of these landmarks are not always robust. To work around this, some approaches assume that such trajectories are manually marked, a clear drawback that forgoes the automation computer vision is meant to provide. In this paper, we address this problem by proposing a view-invariant action matching score based on the epipolar geometry between actor silhouettes, without tracking or explicit point correspondences. In addition, we explore a multi-body epipolar constraint that allows us to work on original action volumes without any pre-processing. We show that the multi-body fundamental matrix captures the geometry of dynamic action scenes and helps devise an action matching score across different views without any prior segmentation of actors. Extensive experimentation on challenging view-invariant action datasets shows that our approach not only removes long-standing assumptions but also achieves significant improvement in recognition accuracy and retrieval.
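The single-body epipolar constraint underlying such matching scores can be sketched briefly: for a fundamental matrix F relating two views, corresponding homogeneous points satisfy x'ᵀ F x = 0, so the residual of that product measures geometric consistency. This is only the classical two-view ingredient; the paper's multi-body formulation generalizes it, and the rectified-stereo F used in the demo below is an illustrative choice, not the paper's.

```python
def epipolar_residual(F, x, xp):
    """Residual of the epipolar constraint x'^T F x for homogeneous
    points x = (u, v, 1) in view 1 and xp = (u', v', 1) in view 2."""
    Fx = [sum(F[i][j] * x[j] for j in range(3)) for i in range(3)]
    return abs(sum(xp[i] * Fx[i] for i in range(3)))

def matching_score(F, pts1, pts2):
    """Total residual over paired point sets: lower means the two
    views are more geometrically consistent."""
    return sum(epipolar_residual(F, x, xp) for x, xp in zip(pts1, pts2))

# For a rectified stereo pair, corresponding points share the same row,
# and this F reduces the constraint to |v - v'|.
F_rect = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
score = matching_score(F_rect, [(3, 5, 1), (4, 8, 1)], [(9, 5, 1), (1, 8, 1)])  # score == 0
```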
On temporal order invariance for view-invariant action recognition
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2013
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 23, no. 2 (2013), p. 203-211
- Full Text: false
- Reviewed:
- Description: View-invariant action recognition is one of the most challenging problems in computer vision. Various representations have been devised for matching actions across different viewpoints to achieve view invariance. In this paper, we explore the invariance of the temporal order of action instances during action execution and utilize it to devise a new view-invariant action recognition approach. To ensure temporal order during matching, we utilize spatiotemporal features, feature fusion and a temporal order consistency constraint. We start by extracting spatiotemporal cuboid features from video sequences and applying feature fusion to encapsulate within-class similarity for the same viewpoints. For each action class, we construct a feature fusion table to facilitate feature matching across different views. An action matching score is then calculated based on the global temporal order constraint and the number of matching features. Finally, the action label of the class with the maximum matching score is assigned to the query action. Experimentation is performed on the multiple-view INRIA Xmas Motion Acquisition Sequences (IXMAS) and West Virginia University action datasets, with encouraging results that are comparable to existing view-invariant action recognition techniques.
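One plausible way to enforce a global temporal order constraint during feature matching is to keep only the matches whose query-side frame times form an increasing subsequence, i.e. to score a candidate class by the longest increasing subsequence of matched times. This is a sketch of the general idea, not the paper's exact algorithm; both function names are hypothetical.

```python
import bisect

def order_consistent_matches(times):
    """Length of the longest strictly increasing subsequence of the
    matched features' frame times: only matches respecting the global
    temporal order of the action are counted."""
    best = []  # best[k] = smallest tail value of an increasing subsequence of length k+1
    for t in times:
        i = bisect.bisect_left(best, t)
        if i == len(best):
            best.append(t)
        else:
            best[i] = t
    return len(best)

def action_score(times, total_matches):
    """Hypothetical matching score: fraction of feature matches that
    are temporally order-consistent."""
    return order_consistent_matches(times) / total_matches if total_matches else 0.0
```

For example, matched times [1, 3, 2, 4, 6, 5] contain at most four order-consistent matches (e.g. 1, 2, 4, 5), so a class whose matches arrive mostly in order scores higher than one with the same match count in scrambled order.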
A novel color image fusion QoS measure for multi-sensor night vision applications
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2010
- Type: Text , Conference proceedings
- Full Text: false
- Description: Color image fusion of visible and infra-red imagery can play an important role in multi-sensor night vision systems, which are an integral part of modern warfare. Image fusion minimizes the required bandwidth by transmitting the fused image rather than multiple sensor images. Color image fusion can be achieved by combining inputs from original colored sensors or by employing pseudo-colorization and color transfer to grayscale images. Various quality measures have been proposed for multi-sensor grayscale image fusion techniques, but no appropriate measure has been devised for the quality evaluation of multi-sensor color image fusion. In this paper, we propose a novel color image fusion quality measure, the Color Fusion Objective Index (CFOI), based on colorfulness, gradient similarity and mutual information. Experimental results show the effectiveness of CFOI in evaluating the color and salient feature extraction introduced by color fusion techniques into the final fused imagery, as well as its consistency with subjective evaluation.
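The colorfulness ingredient of such a measure can be sketched with the well-known Hasler-Suesstrunk colorfulness metric, computed from the red-green and yellow-blue opponent channels. Whether CFOI uses this exact formulation is an assumption; the gradient-similarity and mutual-information terms of the index are omitted here.

```python
from statistics import mean, pstdev

def colorfulness(pixels):
    """Hasler-Suesstrunk colorfulness over a list of (R, G, B) pixels:
    combines the spread and magnitude of the two opponent channels.
    Illustrates only the 'colorfulness' term of a CFOI-style measure."""
    rg = [r - g for r, g, b in pixels]               # red-green opponent
    yb = [0.5 * (r + g) - b for r, g, b in pixels]   # yellow-blue opponent
    sigma = (pstdev(rg) ** 2 + pstdev(yb) ** 2) ** 0.5
    mu = (mean(rg) ** 2 + mean(yb) ** 2) ** 0.5
    return sigma + 0.3 * mu
```

A grayscale image (R = G = B everywhere) scores exactly zero, while a fused image that injects strong color contrast scores high, which is what lets the index reward color transferred into the fused result.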
Automated multi-sensor color video fusion for nighttime video surveillance
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2010
- Type: Text , Conference proceedings
- Full Text: false
- Description: In this paper, we present an automated color-transfer-based video fusion method to attain real-time color night vision capability for nighttime video surveillance. We utilize a simple RGB color transfer technique to fuse pseudo-colored video frames without conversion to any uncorrelated color space. We found that the final color fusion results greatly depend on the selection of the target color image. Therefore, rather than using an arbitrary target color image chosen on general visual expectation, we automate target color image selection using structural similarity and color saturation. We further apply color enhancement to improve the final appearance of the color-fused images. Subjective and objective quality evaluations clearly indicate the effectiveness of our color video fusion method for nighttime video surveillance applications.
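Automated target selection by color saturation can be sketched as follows: score each candidate target image by its mean HSV-style saturation and pick the most saturated one. This covers only the saturation criterion; the structural-similarity term the abstract also mentions is omitted, and the helper names are hypothetical.

```python
def mean_saturation(pixels):
    """Mean HSV-style saturation of a list of (R, G, B) pixels in [0, 255]."""
    total = 0.0
    for r, g, b in pixels:
        mx, mn = max(r, g, b), min(r, g, b)
        total += (mx - mn) / mx if mx else 0.0
    return total / len(pixels)

def pick_target(candidates):
    """Pick the (name, pixels) candidate with the highest mean saturation.
    The full method would also weigh structural similarity to the scene;
    that term is left out of this sketch."""
    return max(candidates, key=lambda c: mean_saturation(c[1]))[0]
```

For example, a washed-out gray candidate loses to a vivid one, steering the color transfer toward targets that actually carry usable chromatic information.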
Contextual action recognition in multi-sensor nighttime video sequences
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2011
- Type: Text , Conference paper
- Relation: Proceedings of the 2011 Digital Image Computing: Techniques and Applications (DICTA 2011), Noosa 6th-8th Dec, 2011 p. 256-261
- Full Text: false
- Reviewed:
- Description: Contextual information is important for interpreting human actions, especially when actions exhibit an interactive relationship with their context. Contextual clues become even more crucial when videos are captured in unfavorable conditions such as extreme low-light nighttime scenarios. These conditions encourage the use of multi-sensor imagery and context enhancement. In this paper, we explore the importance of contextual knowledge for recognizing human actions in multi-sensor nighttime videos. Information fusion is utilized to encapsulate visual information about actions and their context. Space-time action information is captured using the 3D Fourier transform of the fused action silhouette volume. In parallel, SIFT context images are extracted and fused using principal component analysis based feature fusion for each action class. Contextual dissimilarity is penalized by minimizing context SIFT flow energy. The action dataset comprises multi-sensor night vision video data from the infrared and visible spectrum. Experimental results show that fused contextual action information boosts action recognition performance compared to the baseline action recognition approach.
Action recognition using spatio-temporal distance classifier correlation filter
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2011
- Type: Text , Conference proceedings
- Relation: 2011 International Conference on Digital Image Computing Techniques and Applications (DICTA), Noosa, QLD, 6th-8th Dec, 2011
- Full Text: false
- Reviewed:
- Description: The problem of recognizing human actions is characterized by complex dynamics and strong variations in their executions. Despite this, space-time correlations provide valuable clues for their discrimination. Therefore, space-time correlators such as Maximum Average Correlation Height (MACH) filters have successfully been used for action recognition with encouraging results. However, their utility is challenged by a number of factors: (i) these filters are trained for only one class at a time, and separate filters are required for each class, increasing computational overhead; (ii) these filters simply take the average of similar action instances and behave no better than average filters; and (iii) misaligned action datasets create problems for these filters as they are not shift-invariant. In this paper, we address these issues by posing action recognition as a multi-class discrimination problem and propose a single 3D frequency domain filter, named Action ST-DCCF, for multiple action classes that mitigates the inherent discrepancies of correlation filters. It presents a different interpretation of correlation filters as a method of applying a spatio-temporal transformation to the data rather than simply minimizing correlation energy across all possible shifts. Experiments on a variety of action datasets are performed to evaluate our approach. Experimental results are comparable to existing action recognition approaches.
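The core operation behind MACH/DCCF-style filters is correlation computed in the frequency domain: multiply the signal spectrum by the conjugate of the filter spectrum and transform back, then look for the correlation peak. A minimal 1-D sketch of that operation follows (the paper's Action ST-DCCF filter is 3-D and includes filter design on top of this; the DFT here is the naive O(N²) version for clarity).

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real sequence."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse DFT, keeping only the real part of each sample."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)).real / N
            for n in range(N)]

def circular_correlation(template, signal):
    """Circular cross-correlation via the frequency domain: the
    conjugate product of spectra, then an inverse transform. This is
    the core operation of correlation-filter recognition, shown in 1-D."""
    T, S = dft(template), dft(signal)
    return idft([t.conjugate() * s for t, s in zip(T, S)])

# The correlation peak locates the template's shift within the signal.
signal = [0, 0, 1, 2, 1, 0, 0, 0]
template = [1, 2, 1, 0, 0, 0, 0, 0]
corr = circular_correlation(template, signal)
peak = max(range(len(corr)), key=lambda i: corr[i])  # peak == 2: template sits 2 samples in
```

Evaluating all shifts at once this way is what makes frequency-domain filters attractive, and point (iii) in the abstract concerns exactly this: a filter that is not shift-invariant cannot rely on the peak location being meaningful for misaligned data.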