- Title
- Meaning-sensitive noisy text analytics in the low data regime
- Creator
- Kasthuriarachchy, Buddhika
- Date
- 2022
- Type
- Text; Thesis; PhD
- Identifier
- http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/192238
- Identifier
- vital:17963
- Abstract
- Digital connectivity is revolutionising people’s quality of life. As broadband and mobile services become faster and more prevalent globally than before, people have started to frequently express their wants and desires on social media platforms. Thus, deriving insights from text data has become a popular approach, both in the industry and academia, to provide social media analytics solutions across a range of disciplines, including consumer behaviour, sales, sports and sociology. Businesses can harness the data shared on social networks to improve their organisations’ strategic business decisions by leveraging advanced Natural Language Processing (NLP) techniques, such as context-aware representations. Specifically, SportsHosts, our industry partner, will be able to launch digital marketing solutions that optimise audience targeting and personalisation using NLP-powered solutions. However, social media data are often noisy and diverse, making the task very challenging. Further, real-world NLP tasks often suffer from insufficient labelled data due to the costly and time-consuming nature of manual annotation. Nevertheless, businesses are keen on maximising the return on investment by boosting the performance of these NLP models in the real world, particularly with social media data. In this thesis, we make several contributions to address these challenges. Firstly, we propose to improve the NLP model’s ability to comprehend noisy text in a low data regime by leveraging prior knowledge from pre-trained language models. Secondly, we analyse the impact of text augmentation and the quality of synthetic sentences in a context-aware NLP setting and propose a meaning-sensitive text augmentation technique using a Masked Language Model. Thirdly, we offer a cost-efficient text data annotation methodology and an end-to-end framework to deploy efficient and effective social media analytics solutions in the real world.; Doctor of Philosophy
- Publisher
- Federation University Australia
- Rights
- All metadata describing materials held in, or linked to, the repository is freely available under a CC0 licence
- Rights
- Copyright @ Buddhika Kasthuriarachchy
- Rights
- Open Access
- Subject
- Noisy Text Comprehension; Noisy Text Analytics; Pre-trained Language Models; Text Augmentation; Social Media Analytics
- Full Text
- Thesis Supervisor
- Chetty, Madhu
- Hits: 552
- Visitors: 495
- Downloads: 47
Thumbnail | File | Description | Size | Format | |||
---|---|---|---|---|---|---|---|
View Details Download | SOURCE2 | Australian Digital Thesis | 3 MB | Adobe Acrobat PDF | View Details Download |