Mining news articles to predict a stock trend | |
Author | Khan, Salih |
Call Number | AIT RSPR no.IM-14-03 |
Subject(s) | Stocks; Data mining |
Note | A research study submitted in partial fulfillment of the requirements for the degree of Master of Science in Information Management, School of Engineering and Technology |
Publisher | Asian Institute of Technology |
Series Statement | Research studies project report ; no. IM-14-03 |
Abstract | Predicting stock trends with mining techniques is an important research problem. In the text and data mining community in particular, predicting stock price movements from the content of news articles is an emerging topic. Prior studies have shown a strong relationship between the time at which stock prices move and the time at which news stories are released. In this research, we present a model that predicts the stock trend by analyzing the impact of textual information such as news articles, which we regard as superior to quantitative data. We investigate the immediate influence of news stories on the price time series, based on the Efficient Market Hypothesis. This is a classification problem that requires several text and data mining tools and techniques. We use daily stock prices and time-stamped news articles related to Apple to build the prediction model. The news articles are preprocessed and labeled as either positive (up) or negative (down) by aligning them with the identified trends. News articles are selected using predefined dictionaries of positive and negative sentiment words. The selected articles are represented using the vector space model and a term weighting scheme. Finally, the relationship between news article content and stock price trends is learned by either the kNN or the Naive Bayes machine learning algorithm. Various experiments are conducted to evaluate different aspects of the proposed model, and strong results are obtained in all of them. The prediction model reaches 70% accuracy with the kNN classifier and 76% with the Naive Bayes classifier, so Naive Bayes outperforms kNN. We compare the overall accuracy of both models with the 51% accuracy obtained by labeling news articles at random; both the kNN and the Naive Bayes models perform better than this random baseline. |
Year | 2014 |
Corresponding Series Added Entry | Asian Institute of Technology. Research studies project report ; no. IM-14-03 |
Type | Research Study Project Report (RSPR) |
School | School of Engineering and Technology (SET) |
Department | Department of Information and Communications Technologies (DICT) |
Academic Program/FoS | Information Management (IM) |
Chairperson(s) | Guha, Sumanta; |
Examination Committee(s) | Dailey, Matthew N.;Duboz, Raphael; |
Scholarship Donor(s) | Ministry of Higher Education (MoHE), Afghanistan;Partnership Project; |
Degree | Research Studies Project Report (M.Sc.) - Asian Institute of Technology, 2014 |
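
Illustrative sketch (not part of the report): the abstract outlines a pipeline in which labeled news articles are represented in a vector space with term weighting and then classified by kNN or Naive Bayes. The Python below is a minimal, standard-library sketch of that idea under stated assumptions. The toy headlines, the choice of TF-IDF as the weighting scheme, cosine similarity, k = 3, and Laplace smoothing are all assumptions made for illustration; the record does not say which variants the author used.

```python
import math
from collections import Counter

# Hypothetical toy data: (headline, label), where the label stands in for
# the price trend the article was aligned with. Not taken from the report.
TRAIN = [
    ("apple shares surge on record iphone sales", "up"),
    ("strong earnings lift apple stock", "up"),
    ("analysts praise apple growth and profit gain", "up"),
    ("apple stock falls after weak sales forecast", "down"),
    ("lawsuit and recall hurt apple shares", "down"),
    ("apple profit drops amid supply loss", "down"),
]

def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real preprocessing)."""
    return text.lower().split()

def idf_table(docs):
    """Smoothed inverse document frequency for every training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(text, idf):
    """Sparse, L2-normalised TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}

def cosine(a, b):
    """Dot product of two unit-length sparse vectors, i.e. cosine similarity."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def knn_predict(text, train, idf, k=3):
    """Majority label among the k training articles most similar to `text`."""
    query = tfidf(text, idf)
    scored = sorted(((cosine(query, tfidf(doc, idf)), label) for doc, label in train),
                    reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

def nb_train(train):
    """Collect class priors and per-class term counts for multinomial Naive Bayes."""
    class_docs = Counter(label for _, label in train)
    term_counts = {label: Counter() for label in class_docs}
    for doc, label in train:
        term_counts[label].update(tokenize(doc))
    vocab = {t for counts in term_counts.values() for t in counts}
    return class_docs, term_counts, vocab

def nb_predict(text, model):
    """Pick the class with the highest log-posterior, using Laplace smoothing."""
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_logp = None, -math.inf
    for label in sorted(class_docs):
        logp = math.log(class_docs[label] / total_docs)
        denom = sum(term_counts[label].values()) + len(vocab)
        for t in tokenize(text):
            if t in vocab:  # ignore terms never seen in training
                logp += math.log((term_counts[label][t] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

if __name__ == "__main__":
    idf = idf_table([doc for doc, _ in TRAIN])
    model = nb_train(TRAIN)
    headline = "weak forecast and lawsuit hurt stock"
    print("kNN:", knn_predict(headline, TRAIN, idf))
    print("Naive Bayes:", nb_predict(headline, model))
```

In a real study the labels would come from aligning each time-stamped article with the price trend that follows it, and accuracy would be measured on held-out articles and compared against a random-labeling baseline, as the abstract describes.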