Twitter Sentiment Analysis

Understanding user sentiment to aid mental health diagnosis

August 1st, 2016

Motivation

A pipeline that gathers tweets on two opposing topics and measures users’ sentiment towards them
A demonstration that identifies users whose tweets contain depression-indicative language
Useful to mental health professionals for identifying long-term trends in a user’s mental health

Data collection: Collected nearly 4,000 tweets through the Twitter Developer API and labelled them based on the hashtags present. For example, tweets containing “depressed” (or related hashtags) were labelled as belonging to the “depressive-indicative” class; tweets containing “happy” (or related hashtags) were labelled as belonging to the “non-depressive-indicative” class.
Understand the user’s position in the Twitter community: Call the Twitter API to gain information about the user’s followers, followees, average retweet counts, and more.
Data analysis: Send each of the 4,000 tweets through IBM Watson’s Tone Analyzer API to obtain additional dimensions of sentiment information for each tweet.
Classification model: Use the labelled data to discriminate between tweets that are “depressive-indicative” and those that are not, based on their language characteristics. The classification model was trained with scikit-learn’s k-Nearest Neighbors implementation.
Classify an unknown user: Given an unknown user, generate visualizations and an overall classification of the language in their tweets.
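
The labelling and classification steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's code: the hashtag sets, the two-dimensional (sadness, joy) feature vectors and the toy training values are all assumptions made for the example. The project itself used scikit-learn's k-Nearest Neighbors implementation; the small `knn_predict` function here is a hand-written stand-in that shows the same idea (majority vote among the k closest labelled tweets).

```python
import math
from collections import Counter

# Hypothetical hashtag sets; the project's actual lists are not given in the text.
DEPRESSIVE_TAGS = {"depressed", "depression"}
NON_DEPRESSIVE_TAGS = {"happy", "happiness"}


def label_tweet(text):
    """Label a tweet from its hashtags; return None if it matches neither or both classes."""
    tags = {
        word.lstrip("#").lower().strip(".,!?")
        for word in text.split()
        if word.startswith("#")
    }
    has_depressive = bool(tags & DEPRESSIVE_TAGS)
    has_non_depressive = bool(tags & NON_DEPRESSIVE_TAGS)
    if has_depressive == has_non_depressive:
        return None  # no usable hashtag, or conflicting hashtags
    return "depressive-indicative" if has_depressive else "non-depressive-indicative"


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to `query` (Euclidean distance).

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training data: assumed (sadness, joy) tone scores in [0, 1], invented for illustration.
TOY_TRAIN = [
    ((0.9, 0.1), "depressive-indicative"),
    ((0.8, 0.2), "depressive-indicative"),
    ((0.7, 0.1), "depressive-indicative"),
    ((0.1, 0.9), "non-depressive-indicative"),
    ((0.2, 0.8), "non-depressive-indicative"),
    ((0.1, 0.7), "non-depressive-indicative"),
]

if __name__ == "__main__":
    print(label_tweet("Feeling so #depressed today"))
    print(knn_predict(TOY_TRAIN, (0.85, 0.15)))
```

In a real run, the feature vectors would come from the tone scores and user statistics gathered in the earlier steps, and the value of k would need to be chosen by validation on the labelled tweets.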
