Machine Learning for Trading Strategies

From EU COST Fin-AI
Data source:
* yfinance S&P500
* [https://www.kaggle.com/datasets/miguelaenlle/massive-stock-news-analysis-db-for-nlpbacktests Kaggle]

== Contact ==
* [mailto:oste@zhaw.ch Prof. Dr. Jörg Osterrieder]
* [mailto:vliep1@bfh.ch Pieter-Jan Vliegen]

Revision as of 17:31, 24 May 2023

Details

  • Authors: Pieter-Jan Vliegen
  • Title: Machine Learning for Trading Strategies
  • Supervisor: Prof. Dr. Jörg Osterrieder
  • Degree: Bachelor of Science
  • University: BFH
  • Year: 2023
  • Status: Working Paper

Summary

This bachelor's thesis explored the integration of news sentiment analysis with financial data for predicting daily S&P500 index prices. By employing natural language processing and machine learning techniques, we found a positive correlation between sentiment scores derived from news articles and movements in financial markets.

Using a RandomForest classifier, we achieved superior performance in predicting price changes, demonstrating the potential of advanced computational techniques in financial analytics. The study encountered challenges such as sourcing a suitable news dataset and selecting appropriate S&P500 tickers, which were addressed using effective data filtering and refining methodologies.
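To make the classification setup concrete, the following is a minimal sketch of training a random forest on sentiment-style features. It assumes scikit-learn's `RandomForestClassifier` and uses purely synthetic data; the feature names, the label construction, the hyperparameters, and the 80/20 chronological split are illustrative assumptions, not the configuration used in the thesis.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_days = 300

# Synthetic stand-ins for the features (NOT thesis data):
# daily proportions of positive / neutral / negative news, plus a lagged return.
sentiment = rng.dirichlet([2.0, 3.0, 2.0], size=n_days)  # each row sums to 1
prev_return = rng.normal(0.0, 0.01, size=n_days)
X = np.column_stack([sentiment, prev_return])
feature_names = ["positive", "neutral", "negative", "prev_return"]

# Synthetic label: 1 = price moves up, 0 = price moves down.
# The link to sentiment is built in artificially so the example has a signal.
signal = sentiment[:, 0] - sentiment[:, 2] + rng.normal(0.0, 0.3, size=n_days)
y = (signal > 0).astype(int)

# Chronological split: train on the earlier days, test on the later ones,
# so that no future information leaks into training.
split = int(n_days * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)  # fraction of correct up/down calls
importances = dict(zip(feature_names, model.feature_importances_))

print(f"test accuracy: {accuracy:.3f}")
for name, value in importances.items():
    print(f"{name}: {value:.3f}")
```

A chronological split is used here instead of a random shuffle because shuffling daily market data can let the model see days that come after the ones it is tested on.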

Despite possible biases in news sentiment analysis and the inherent complexity of integrating diverse data sources, the study points to the potential of sentiment data for improving the accuracy of financial forecasting models. The research contributes to the evolving field of sentiment analysis in finance and suggests further exploration of alternative data types and advanced analytical methods for more accurate and robust predictive models.

Abstract

This bachelor's thesis explores the integration of news sentiment analysis with financial data to improve the accuracy of daily S&P500 index price predictions. Given the growing influence of public sentiment on financial markets, this study employs natural language processing and machine learning techniques to build a model that combines sentiment scores extracted from news articles with traditional financial data.

The thesis presents the creation of a novel dataset that combines sentiment scores (the proportions of positive, neutral, and negative news) for selected S&P500 tickers with their respective financial data. Machine learning, specifically the RandomForest classifier, forms the core of the forecasting model.
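The dataset construction described above can be sketched as follows. This is a minimal illustration using only the Python standard library. It assumes each headline already carries a sentiment label, and it defines the target as "close above open", which is an assumption made for this example. The ticker `AAA`, the dates, and the prices are made-up toy values, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

LABELS = ("positive", "neutral", "negative")


def daily_sentiment_proportions(headlines):
    """Turn (date, ticker, label) records into per-day label proportions."""
    counts = defaultdict(Counter)
    for date, ticker, label in headlines:
        if label not in LABELS:
            raise ValueError(f"unknown sentiment label: {label!r}")
        counts[(date, ticker)][label] += 1

    proportions = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        proportions[key] = {label: counter[label] / total for label in LABELS}
    return proportions


def join_with_prices(proportions, prices):
    """Inner-join sentiment proportions with (open, close) prices per day."""
    rows = []
    for key in sorted(proportions.keys() & prices.keys()):
        date, ticker = key
        open_price, close_price = prices[key]
        rows.append({
            "date": date,
            "ticker": ticker,
            **proportions[key],
            # Assumed target: 1 if the day closed above its open, else 0.
            "up": int(close_price > open_price),
        })
    return rows


# Toy input (made up for illustration only).
headlines = [
    ("2023-01-03", "AAA", "positive"),
    ("2023-01-03", "AAA", "negative"),
    ("2023-01-03", "AAA", "positive"),
    ("2023-01-03", "AAA", "neutral"),
    ("2023-01-04", "AAA", "neutral"),
]
prices = {
    ("2023-01-03", "AAA"): (100.0, 101.5),
    ("2023-01-04", "AAA"): (101.0, 100.0),
}

rows = join_with_prices(daily_sentiment_proportions(headlines), prices)
for row in rows:
    print(row)
```

Days that have news but no price record (or the reverse) are dropped by the inner join, which is one simple way to keep the two sources aligned.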

Despite challenges related to data acquisition and the selection of appropriate tickers, the research underscores the potential of sentiment analysis in financial forecasting. The study finds that integrating sentiment analysis contributes significantly to the predictability of S&P500 index prices, affirming the correlation between market sentiment and financial market movements.

For future research, the study recommends expanding the scope of data sources and exploring other machine learning algorithms to further enhance prediction accuracy and robustness. This thesis contributes to the growing literature on sentiment analysis in finance and underlines the significance of integrating alternative data types for better financial forecasting.

Important links

Data

Data source:

Contact