By Cooper Chasse
Faculty Mentor: Dr. Jeff Solka
Abstract
This project presents a machine learning pipeline for predicting daily stock returns of sports-sector companies, using Nike (NKE) as the primary case study. The pipeline combines traditional price-based technical features with four natural language processing (NLP) sentiment signals: general news sentiment sourced from the GDELT news database, sports-specific sentiment derived from keyword-filtered headlines, earnings call sentiment extracted from SEC EDGAR 10-Q and 10-K filings using FinBERT, and a Google Trends sports attention proxy capturing public engagement with the brand. A Random Forest model is trained using the Darts time series library and evaluated through a walk-forward backtest using VectorBT.
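The walk-forward evaluation mentioned above can be illustrated with a minimal sketch in plain Python. It is not the project's Darts or VectorBT code. The function name `walk_forward_splits` and its parameters (`n_samples`, `initial_train`, `test_size`) are hypothetical, and an expanding training window is assumed; the abstract does not say whether the window expands or rolls.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, initial_train: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for an expanding-window walk-forward scheme.

    Hypothetical helper for illustration only; each test block lies strictly
    after every observation in its training block.
    """
    if initial_train <= 0 or test_size <= 0:
        raise ValueError("initial_train and test_size must be positive")
    start = initial_train
    while start < n_samples:
        end = min(start + test_size, n_samples)
        # Train on all observations before `start`, test on the next block.
        yield list(range(0, start)), list(range(start, end))
        start = end


# Example: 10 trading days, first 4 used for the initial fit, 3-day test blocks.
splits = list(walk_forward_splits(n_samples=10, initial_train=4, test_size=3))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Because each test block follows its training block in time, the model is never scored on days that precede the data it was fit on, which is the property a walk-forward backtest is meant to guarantee.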
