Sentiment Dynamics and Topic Analysis of Greece’s Premier Summer Islands Destinations
- Kwnstantinos Lambrou
- Nov 29, 2024
Updated: Jul 15
Project Overview
In a short-term consulting project conducted in collaboration with Mentionlytics and the Big Blue Data Academy, we analyzed social media posts related to five major Greek tourism destinations:
Santorini
Mykonos
Rhodes
Crete
Corfu
The primary goals of this project were to:
Collect, aggregate, and analyze data on Greek tourism to enrich the study with quantitative information.
Conduct sentiment analysis on social media posts concerning Greek tourism destinations.
Identify prevalent topics and trends using natural language processing techniques, such as Latent Dirichlet Allocation (LDA).
Derive insights into user behavior and preferences by examining sentiment dynamics.
Develop an interactive Power BI dashboard to visualize and facilitate the understanding of the analysis results.
Data Sources
Mentionlytics: Utilized for comprehensive data collection from various social media platforms, capturing all mentions related to Greek tourism destinations.
Bank of Greece: Provided comprehensive tourism data for Greece, covering the period from 2005 to 2023, including visitor arrivals, revenues and other key metrics.
Tools Used
Python: Served as the primary language for scripting and data analysis, leveraging its extensive ecosystem of data science libraries.
Libraries:
Pandas
NumPy
Matplotlib
Seaborn
spaCy
Gensim
VADER
EmoLex
Power BI: Utilized to develop interactive dashboards, enabling dynamic visualization and exploration of the analysis results.
Translation Service
Google Cloud Translate API: Employed to automatically translate non-English social media posts into English.
Methodology
Our approach centered on Mentionlytics, a platform for monitoring social media interactions, which gave us real-time insight into online discourse. The methodology was structured into four stages covering data collection, cleaning, translation, and analysis. Each stage is detailed below:
Data Collection:
We began by collecting an extensive dataset of social media posts using Mentionlytics. The platform allowed us to extract posts in real time using predefined keywords relevant to our research objectives. The data spanned multiple platforms, including Twitter, Facebook, Instagram, and various blogs, giving a diverse dataset that covered a wide range of topics and discussions. We also incorporated data from the Bank of Greece on Greece's tourism activity for the period 2005–2023, enriching the analysis with official statistics on visitor arrivals and related metrics.
Data Cleaning:
The cleaning process prepared the text for analysis. It involved removing extraneous characters, numbers, and irrelevant symbols, and applying Unicode normalization to standardize text across posts written in various languages. A significant part of this step was filtering out irrelevant content through extensive exploratory data analysis (EDA). We repeated this process iteratively so that only pertinent posts were retained while minimizing the risk of removing relevant entries, a real challenge given the multilingual and diverse nature of the dataset.
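The character-level part of this step can be sketched with Python's standard library alone. This is a minimal illustration, not the project's actual cleaning code: the function name clean_post, the choice of NFKC as the normalization form, and the exact set of characters kept (letters in any script plus basic punctuation) are all assumptions made for the example.

```python
import re
import unicodedata

# Hypothetical patterns chosen for illustration; the project's real rules are not documented.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Drop anything that is not a letter, whitespace or basic punctuation, plus digits and underscores.
NOISE_RE = re.compile(r"[^\w\s.,!?'-]|[\d_]")
SPACE_RE = re.compile(r"\s+")


def clean_post(text: str) -> str:
    """Normalize Unicode, then strip URLs, numbers and stray symbols."""
    # NFKC folds compatibility forms (e.g. full-width letters) into canonical ones.
    text = unicodedata.normalize("NFKC", text)
    text = URL_RE.sub(" ", text)
    text = NOISE_RE.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return SPACE_RE.sub(" ", text).strip()


example = clean_post("Ｓantorini   sunset!!! 😍 https://t.co/abc #travel2024")
```

Note that this sketch also removes emoji and hashtag markers; whether that is desirable depends on the downstream sentiment tool, so treat the character set as a tunable choice.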
Content Translation:
Because the data was multilingual, we used Google's Cloud Translation API to translate non-English posts into English. This step was critical for consistency, since it let us apply analytical tools optimized for English text. The translation aimed to preserve the original meaning of the posts while producing a unified corpus for analysis.
Sentiment and Topic Analysis:
Once the data was prepared, aspect-based sentiment analysis was performed. We combined VADER, a sentiment analysis tool tailored for social media, with spaCy's Named Entity Recognition (NER) to assign sentiment scores and extract entities such as locations, organizations, and individuals. This enabled a detailed analysis of sentiments linked to specific entities. Additionally, topic modeling was conducted using Latent Dirichlet Allocation (LDA) to uncover prevalent themes within the data. By examining sentiment trends across these themes, we gained deeper insights into audience perceptions.
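To illustrate how sentiment can be linked to entities, here is a minimal sketch of the aggregation step only. It assumes each post has already been given a compound score in the range -1 to 1 (the range VADER's compound score uses) and a list of (text, label) entity pairs (the kind of output spaCy's NER yields). The function name, the dictionary keys, and the sample posts are hypothetical, and attributing a whole post's score to every entity it mentions is a simplification rather than the project's confirmed method.

```python
from collections import defaultdict
from statistics import mean


def average_sentiment_by_entity(posts):
    """Return {(entity_text, entity_label): mean compound score}.

    posts: iterable of dicts with 'compound' (float in [-1, 1]) and
    'entities' (list of (text, label) tuples).
    """
    scores = defaultdict(list)
    for post in posts:
        # Count each entity once per post so repeated mentions do not dominate.
        for entity in set(post["entities"]):
            scores[entity].append(post["compound"])
    return {entity: round(mean(vals), 3) for entity, vals in scores.items()}


# Made-up sample data, for illustration only.
sample_posts = [
    {"compound": 0.8, "entities": [("Oia", "GPE"), ("Santorini", "GPE")]},
    {"compound": -0.4, "entities": [("Santorini", "GPE")]},
    {"compound": 0.2, "entities": [("Ryanair", "ORG"), ("Santorini", "GPE")]},
]
entity_scores = average_sentiment_by_entity(sample_posts)
```

A table of this shape (entity, entity type, average score) is what the recognized entities page of the dashboard displays.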
Dashboard Overview
This dashboard provides a dynamic representation of key metrics, trends, and insights, allowing stakeholders to easily interpret the results and make informed decisions. Below are snapshots showcasing the main features and visualizations of the dashboard.
The Power BI dashboard also incorporates a custom Page Navigator, allowing seamless navigation between different sections of the report.
General info: The general info page includes annual data on arrivals, revenue, and average cost per overnight stay, providing a clear overview of Greek tourism trends from 2005 to 2023.

All platforms: The all platforms page provides an analysis across all social media platforms, focusing on the sources of mentions, daily mentions during the examined period, and the overall sentiment distribution within the mentions. Additionally, it includes the daily engagement rate index, which aggregates likes, comments, and shares of a post and divides them by the poster's follower count at the time of posting, offering insights into the influence levels of individual posts.
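The engagement rate described above can be written as a short calculation. The sketch below follows the stated definition (likes plus comments plus shares, divided by follower count); the function names, the field names, and the choice to average per-post rates within each day are assumptions, since the write-up does not say how posts are combined into the daily index.

```python
from collections import defaultdict


def engagement_rate(likes, comments, shares, followers):
    """Interactions divided by follower count; None when followers is not positive."""
    if followers <= 0:
        return None  # the rate is undefined without a follower count
    return (likes + comments + shares) / followers


def daily_engagement_index(posts):
    """Average the per-post engagement rate for each day (assumed aggregation)."""
    by_day = defaultdict(list)
    for post in posts:
        rate = engagement_rate(
            post["likes"], post["comments"], post["shares"], post["followers"]
        )
        if rate is not None:
            by_day[post["date"]].append(rate)
    return {day: sum(rates) / len(rates) for day, rates in by_day.items()}


# Made-up sample data, for illustration only.
sample = [
    {"date": "2024-07-01", "likes": 50, "comments": 30, "shares": 20, "followers": 1000},
    {"date": "2024-07-01", "likes": 20, "comments": 5, "shares": 5, "followers": 100},
    {"date": "2024-07-02", "likes": 10, "comments": 0, "shares": 0, "followers": 0},
]
index = daily_engagement_index(sample)
```

Posts with no recorded followers are skipped rather than counted as zero, so they do not drag the daily value down.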

Sentiment Analysis: The sentiment analysis page examines sentiment distribution, emotion analysis, the sentiment score index (ranging from -1 to 1), and a word cloud visualization. Users can filter the analysis by destination, platform, entity type, and topic, allowing for a more targeted exploration of the data.
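A sentiment distribution of this kind can be derived from the score index by bucketing scores into labels. The sketch below uses cut-offs of 0.05 and -0.05, the convention commonly suggested for VADER's compound score; the write-up does not state which thresholds the dashboard uses, so they are an assumption, as are the function names.

```python
from collections import Counter


def label_sentiment(score, threshold=0.05):
    """Map a score in [-1, 1] to a label; the threshold is an assumed convention."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def sentiment_distribution(scores):
    """Return the share of positive, neutral and negative scores."""
    counts = Counter(label_sentiment(s) for s in scores)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        label: counts[label] / total
        for label in ("positive", "neutral", "negative")
    }


# Made-up scores, for illustration only.
distribution = sentiment_distribution([0.6, 0.0, -0.3, 0.9])
```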

Recognized Entities: The recognized entities page features a table displaying entities and their average sentiment scores. As on the previous page, users can apply filters by platform, entity type, and topic. Selecting an entity in the table updates the interactive chart to show that entity's average sentiment score across the five examined destinations.

Presentation
Collaborators
Kyriakos Papadopoulos
Efrosini Pagkali