- The United States, in particular, features extremely high costs for healthcare
- Public awareness of and support for mental health care are increasing
- Pipeline to gather tweets on two polar topics to understand users’ sentiment towards them
- Demonstration that identifies those users who use depression-indicative language
- Useful to mental health professionals for identifying long-term trends in a user’s mental health
- Data collection: Collected nearly 4,000 tweets from the Twitter Developer API and labelled them based on the hashtags present. For example, tweets containing “depressed” (or related hashtags) were labelled as belonging to the “depressive-indicative” class; tweets containing “happy” (or related hashtags) were labelled as part of the “non-depressive-indicative” class.
- Understand the user’s position in the Twitter community: Call the Twitter API to gain information about the user’s followers, followees, average retweet counts, and more.
- Data analysis: Send each of the 4,000 tweets through IBM Watson’s Tone Analyzer API to gain more dimensions of sentiment information about each tweet.
- Classification model: Use the labelled data to discriminate between tweets that are “depressive-indicative” and those that are not, based on their language characteristics. The model was trained with scikit-learn’s k-Nearest Neighbors implementation.
- Classify an unknown user: Given an unknown user, generate visualizations and an overall classification of the language in their tweets.
- Python + Django web framework
- scikit-learn & IBM Watson intelligence APIs
- chart.js & Material Bootstrap
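
The labelling and classification steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the hashtag sets, the `label_tweet` helper, and the two-element feature vectors (standing in for tone scores such as sadness and joy) are assumptions, and the numeric values are invented placeholders. Only scikit-learn's `KNeighborsClassifier` is taken from the description above; the choice of `n_neighbors=3` is also an assumption.

```python
import re

from sklearn.neighbors import KNeighborsClassifier

# Hypothetical hashtag sets; the description names only "depressed" and "happy".
DEPRESSIVE_TAGS = {"depressed"}
NON_DEPRESSIVE_TAGS = {"happy"}


def label_tweet(text):
    """Return 1 (depressive-indicative), 0 (non-depressive-indicative),
    or None when the tweet has no matching hashtag or matches both sets."""
    tags = set(re.findall(r"#(\w+)", text.lower()))
    has_depressive = bool(tags & DEPRESSIVE_TAGS)
    has_non_depressive = bool(tags & NON_DEPRESSIVE_TAGS)
    if has_depressive == has_non_depressive:
        return None  # unlabelled or ambiguous, so leave it out of training
    return 1 if has_depressive else 0


# Placeholder feature vectors [sadness, joy], invented for illustration only.
# In the pipeline described above these would come from a tone-analysis step.
X_train = [
    [0.9, 0.1], [0.8, 0.2], [0.7, 0.1],  # labelled depressive-indicative
    [0.1, 0.9], [0.2, 0.8], [0.1, 0.7],  # labelled non-depressive-indicative
]
y_train = [1, 1, 1, 0, 0, 0]

# n_neighbors=3 is an assumed setting, not one stated in the description.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Classify one unseen tweet from its (placeholder) feature vector.
prediction = model.predict([[0.85, 0.15]])[0]
```

Classifying a whole user could then, for example, aggregate per-tweet predictions into a single score, though how the original demo combines them is not stated here.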
© Pramod Kotipalli 2016