AIT Asian Institute of Technology

Mining news articles to predict a stock trend

Author: Khan, Salih
Call Number: AIT RSPR no. IM-14-03
Subject(s): Stocks; Data mining

Note: A research submitted in partial fulfillment of the requirements for the degree of Master of Science in Information Management, School of Engineering and Technology
Publisher: Asian Institute of Technology
Series Statement: Research studies project report ; no. IM-14-03
Abstract: Predicting stock trends with mining techniques is an important problem. In the text and data mining community in particular, predicting stock price movements from the content of news articles is an emerging topic. Prior studies have shown a strong relationship between the time at which stock prices move and the time at which news stories are released. In this research, we present a model that predicts the stock trend by analyzing the impact of textual information such as news articles, which we regard as superior to quantitative data. We investigate the immediate influence of news stories on the price time series, based on the Efficient Market Hypothesis. The task is a classification problem that requires several text and data mining tools and techniques. We use daily stock prices and time-stamped news articles related to Apple to build the prediction model. The news articles are preprocessed and labeled as either positive (up) or negative (down) by aligning them with the identified price trends. Articles are selected using predefined dictionaries of positive and negative sentiment words. The selected articles are represented using the vector space model and a term weighting scheme. Finally, the relationship between news article content and stock price trends is learned with either the kNN or the Naive Bayes machine learning algorithm. Various experiments evaluate different aspects of the proposed model, and all of them yield good results. The prediction model reaches 70% accuracy with the kNN classifier and 76% with the Naive Bayes classifier, which therefore outperforms kNN. We compared the overall accuracy of both models against random labeling of the news articles, which yields 51% accuracy; both kNN and Naive Bayes are more accurate than this random prediction.
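The representation and classification steps named in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the headlines and labels are hypothetical toy data, and TF-IDF weighting, cosine similarity, and k = 3 are assumptions, since the abstract does not name the term weighting scheme, the distance measure, or the value of k.

```python
import math
from collections import Counter

# Hypothetical toy data (not from the study): each headline is labeled with
# the direction of the stock price trend it is aligned to.
TRAIN = [
    ("apple shares surge on record iphone sales", "up"),
    ("strong profit growth lifts apple stock", "up"),
    ("apple gains after upbeat earnings report", "up"),
    ("apple shares fall on weak ipad demand", "down"),
    ("lawsuit loss drags apple stock lower", "down"),
    ("apple drops after profit warning", "down"),
]

def tokenize(text):
    """Lowercase whitespace tokenizer; real preprocessing would do more."""
    return text.lower().split()

def fit_idf(docs):
    """Inverse document frequency of every term in the training documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}

def tfidf(text, idf):
    """Sparse TF-IDF vector; terms unseen during training are ignored."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(text, train, idf, k=3):
    """Majority vote over the k training articles most similar to `text`."""
    query = tfidf(text, idf)
    scored = sorted(
        ((cosine(query, tfidf(doc, idf)), label) for doc, label in train),
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

idf = fit_idf([doc for doc, _ in TRAIN])
prediction = knn_predict("apple stock surge after strong earnings", TRAIN, idf)
print(prediction)
```

A Naive Bayes classifier could be swapped in for `knn_predict` over the same term counts; the abstract reports that it was the more accurate of the two in the study.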
Year: 2014
Corresponding Series Added Entry: Asian Institute of Technology. Research studies project report ; no. IM-14-03
Type: Research Study Project Report (RSPR)
School: School of Engineering and Technology (SET)
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Information Management (IM)
Chairperson(s): Guha, Sumanta
Examination Committee(s): Dailey, Matthew N.; Duboz, Raphael
Scholarship Donor(s): Ministry of Higher Education (MoHE), Afghanistan; Partnership Project
Degree: Research Studies Project Report (M.Sc.) - Asian Institute of Technology, 2014

