Machine learning-based modelling for museum visitations prediction
- Authors: Yap, Norman , Gong, Mingwei , Naha, Ranesh , Mahanti, Aniket
- Date: 2020
- Type: Text , Conference proceedings
- Relation: 2020 International Symposium on Networks, Computers and Communications (ISNCC); Montreal, Canada; 20-22 October 2020, p. 1-7
- Full Text: false
- Reviewed:
- Description: Cultural venues like museums increasingly seek to harness the value of data analytics to make data-driven decisions about exhibition duration, marketing campaigns, resource planning, and revenue optimization. One key priority is the need to understand the factors that influence visitor attendance. Using data collected from a large museum, we investigated whether the weather has a significant impact on visitor attendance or whether other factors are more important. We applied the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology to perform the research, and built four different types of regression models using R and its machine learning packages to model visitor attendance. The models were trained and evaluated. Predictions of visitor attendance were then generated from each of the four models and forecast accuracy was measured. The extreme gradient boosting model was the best model, with the highest average forecast accuracy (93%) and the lowest forecast variability when benchmarked against the actual visitor attendance from the test data set. The weather was not considered to be as significant in predicting visitor trends and numbers at the museum as factors like time of day, day of the week, and school holidays. However, it still had a slight measurable impact, as excluding weather variables resulted in a model with a poorer fit. Weather could have a more marked impact on cultural attractions in more extreme weather environments and at outdoor venues.
Piracy on the internet: Publisher-side analysis on file hosting services
- Authors: Chan, Marcus , Gong, Mingwei , Naha, Ranesh , Mahanti, Aniket
- Date: 2020
- Type: Text , Conference proceedings
- Relation: 2020 International Symposium on Networks, Computers and Communications (ISNCC); Montreal, QC, Canada; 20-22 October 2020, p. 1-7
- Full Text: false
- Reviewed:
- Description: In the file sharing ecosystem, One-Click File Hosting Services (FHS) such as Rapidgator and Uploaded, and previously Rapidshare and Megaupload, provide a platform for users to share copyrighted content. We present a publisher-side analysis of FHS file sharing dynamics using data collected through active measurement by crawling Warez-BB. The website is essentially a forum where publishers can share links to content they have uploaded to file hosting services. Consumers can use the website to gain access to the shared content, often free of charge. We primarily analyse various characteristics of file sharing, using view count as the evaluation metric.
Characterisation and comparative analysis of thematic video portals
- Authors: Adib, Saif , Mahanti, Aniket , Naha, Ranesh
- Date: 2021
- Type: Text , Journal article
- Relation: Technology in Society Vol. 67, no. (2021), p. 101690
- Full Text: false
- Reviewed:
- Description: This paper provides a comprehensive measurement study of three video streaming websites with social media features: ‘TED Talks’, ‘xHamster’ and ‘XVideos’. We analysed 2685 TED videos from 2006 to 2018 to characterise the service. For xHamster and XVideos, active measurements were used to collect unique metadata on almost 3405 and 6721 channels, respectively, from 2012 to 2019, which were then analysed. Through these characterisations we gained insight into the main players of the websites: viewers, uploaders and website owners. Our analysis studied video streaming characteristics such as views, number of uploads, ratings and tags. Through this, we aim to give an overview of the services' current state and compare them with other traditional video streaming services. Our results showed some similar trends across all three websites, such as TED videos and adult channels getting a high number of views despite a low injection rate, maintaining power-law behaviour due to front-page recommendations, and ratings being underutilised as a feature. Other observations include adult streaming services having a higher number of subscribers per channel. The characterisation results obtained are of value to network operators, content providers, and protocol designers. These results can also be used by content providers to measure what type of content is being watched on their websites. Our study provides a glimpse of how video streaming services function today and the trends they seem to follow. • Measurements and detailed characterisation study based on TED Talks, xHamster, and XVideos. • Comprehensive understanding of the online video streaming domain. • Insights on video streaming services and how they utilise their online social network of users.