Automated opinion detection: Implications of the level of agreement between human raters
- Authors: Osman, Deanna, Yearwood, John, Vamplew, Peter
- Date: 2010
- Type: Text, Journal article
- Relation: Information Processing and Management Vol. 46, no. 3 (2010), p. 331-342
- Full Text: false
- Reviewed:
- Description: The ability to agree with the TREC Blog06 opinion assessments was measured for seven human assessors and compared with the submitted results of the Blog06 participants. The assessors achieved a fair level of agreement between their assessments, although the range between assessors was large. It is recommended that multiple assessors be used to assess opinion data, or that a pre-test be completed to remove the most dissenting assessors from a pool of assessors prior to the assessment process. The possibility of inconsistent assessments in a corpus also raises concerns about training data for an automated opinion detection system (AODS), so a further recommendation is that AODS training data be assembled from a variety of sources. This paper establishes an aspirational value for an AODS by determining the level of agreement achievable by human assessors when assessing the existence of an opinion on a given topic. Knowing the level of agreement amongst humans is important because it sets an upper bound on the expected performance of an AODS. While the AODSs surveyed achieved satisfactory results, none achieved a result close to the upper bound. © 2009 Elsevier Ltd. All rights reserved.
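The abstract above reports agreement among multiple assessors without naming a statistic here; a standard choice for more than two raters is Fleiss' kappa, whose "fair" band matches the wording used. The sketch below is illustrative only (the label names `"op"`/`"no"` are hypothetical, not from the paper):

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa for items each judged by the same number of raters.
    ratings[i] is the list of category labels assigned to item i."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({c for item in ratings for c in item})
    # counts[i][j] = number of raters who assigned category j to item i
    counts = [[Counter(item)[c] for c in categories] for item in ratings]
    # P_i: observed agreement on item i (pairs of raters who agree)
    p_i = [(sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_items
    # p_j: overall proportion of assignments falling in category j
    p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)  # chance agreement
    return (p_bar - p_e) / (1 - p_e)
```

On the Landis-Koch scale often used with kappa, values between 0.21 and 0.40 are read as "fair" agreement, the level the abstract reports for the seven assessors.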
Using psycholinguistic features for profiling first language of authors
- Authors: Torney, Rosemary, Vamplew, Peter, Yearwood, John
- Date: 2012
- Type: Text, Journal article
- Relation: Journal of the American Society for Information Science and Technology Vol. 63, no. 6 (2012), p. 1256-1269
- Full Text: false
- Reviewed:
- Description: This study empirically evaluates the effectiveness of different feature types for the classification of the first language of an author. In particular, it examines the utility of psycholinguistic features, extracted by the Linguistic Inquiry and Word Count (LIWC) tool, that have not previously been applied to the task of author profiling. As LIWC is a tool that was developed in the psycholinguistic field rather than the computational linguistics field, it was hypothesized that it would be effective, both as a single-type feature set because of its psycholinguistic basis, and in combination with other feature sets, because it should be sufficiently different to add insight rather than redundancy. It was found that LIWC features were competitive with previously used feature types in identifying the first language of an author, and that combined feature sets including LIWC features consistently showed better accuracy rates and average F measures than were achieved by the same feature sets without the LIWC features. As a secondary issue, this study also examined how effectively first-language classification scaled up to a larger number of possible languages. It was found that the classification scheme scaled up effectively to the entire 16-language collection from the International Corpus of Learner English, compared with results achieved on just 5 languages in previous research. © 2012 ASIS&T.
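The combined-feature-set approach described above amounts to concatenating per-document feature vectors from independent extractors (e.g. LIWC category proportions alongside other stylometric features) before classification. A minimal sketch, with hypothetical feature names not taken from the paper:

```python
def combine_features(*feature_sets):
    """Merge several named feature sets for one document into a single
    flat vector. Each argument is a (prefix, {feature: value}) pair;
    prefixing keeps features from different extractors from colliding."""
    combined = {}
    for prefix, features in feature_sets:
        for name, value in features.items():
            combined[f"{prefix}:{name}"] = value
    return combined

# Hypothetical example: function-word frequencies plus LIWC-style
# category proportions for a single learner essay.
vector = combine_features(
    ("func", {"the": 0.071, "of": 0.034}),
    ("liwc", {"posemo": 0.021, "cogmech": 0.118}),
)
```

The resulting dictionary can be fed to any vector-space classifier; the study's finding is that adding the LIWC-derived block to an existing feature set improved accuracy over that set alone.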