Stock Return Prediction with Sentiment Analysis of Twitter Data
We attempt to improve the accuracy of stock return prediction through sentiment analysis of Twitter data. Our hypothesis is that Twitter users mainly consist of retail investors, implying that aggregated sentiment will influence stocks with lower levels of institutional ownership. Our analysis involves three sentiment approaches. The first labels tweets based on the magnitude and direction of changes in the stock's price. The second is a manual labelling approach, in which the authors went through tweets and determined whether each had a positive, negative or neutral sentiment. The last uses a dictionary created from financial tweets. For the first two approaches, we utilised three text classification methods: Naïve Bayes, Logistic Regression and SVM. The sentiment features were used in tandem with common financial features (momentum, liquidity and volatility) to compare predictive power across three supervised regression models: Random Forest, Gradient Boosting and a neural network model, LSTM. We find that including sentiment in the models decreases accuracy slightly across all models, and that including the level of institutional ownership of a stock has limited effect on improving predictions in our selected sample. We argue that a larger data set may be beneficial for creating an accurate market sentiment proxy, and that sentiment analysis should be more useful when focusing on special cases, such as peak volumes and the number of followers.
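The first labelling approach in the abstract, which assigns a sentiment label to each tweet from the magnitude and direction of the stock's price change, might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the 1% threshold, the use of the same-day return, the three label names, and the helper names `label_by_return` and `label_tweets` are all hypothetical, since the abstract does not specify them.

```python
def label_by_return(daily_return, threshold=0.01):
    """Map a stock return to a sentiment label.

    The threshold (1% here) is an assumed value for illustration only.
    """
    if daily_return > threshold:
        return "positive"
    if daily_return < -threshold:
        return "negative"
    return "neutral"


def label_tweets(tweets, returns_by_day, threshold=0.01):
    """Attach a price-based label to each tweet.

    tweets: list of dicts with "ticker", "date" and "text" keys (assumed layout).
    returns_by_day: dict mapping (ticker, date) to that day's return.
    Tweets without a matching return are skipped.
    """
    labelled = []
    for tweet in tweets:
        key = (tweet["ticker"], tweet["date"])
        if key not in returns_by_day:
            continue
        label = label_by_return(returns_by_day[key], threshold)
        labelled.append({**tweet, "label": label})
    return labelled


# Toy data, invented purely to exercise the functions.
tweets = [
    {"ticker": "AAA", "date": "2021-03-01", "text": "great quarter"},
    {"ticker": "AAA", "date": "2021-03-02", "text": "not sure about this"},
    {"ticker": "BBB", "date": "2021-03-01", "text": "selling everything"},
    {"ticker": "CCC", "date": "2021-03-01", "text": "no price data"},
]
returns_by_day = {
    ("AAA", "2021-03-01"): 0.025,
    ("AAA", "2021-03-02"): 0.004,
    ("BBB", "2021-03-01"): -0.031,
}
labelled = label_tweets(tweets, returns_by_day)
```

Labels produced this way could then serve as training targets for a text classifier such as the Naïve Bayes, Logistic Regression or SVM models named above.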
Master's thesis (MSc) in Master of Science in Business Analytics - Handelshøyskolen BI, 2022