This project focuses on sentiment analysis in finance, using machine learning and deep learning models to predict stock price movements based on financial news and market sentiment.
- Collects stock market data and financial news
- Performs sentiment analysis using NLP techniques
- Implements various machine learning models (LSTM, RNN, Logistic Regression, XGBoost)
- Compares traditional models with sentiment-based predictions
- Programming Languages: Python
- Libraries & Tools: TensorFlow, Scikit-Learn, NLTK, BeautifulSoup, Pandas, NumPy
- Data Sources: Yahoo Finance, News APIs
This dataset is designed to analyze stock market trends and predict stock prices using machine learning models like LSTM, RNN, Logistic Regression, and XGBoost. The dataset contains historical stock price data, including key financial indicators.
This file includes daily stock price information with multiple financial attributes.
| Column Name | Description |
|---|---|
| Date | The trading date of the stock. |
| Open | The stock price at market opening. |
| High | The highest price reached during the trading session. |
| Low | The lowest price reached during the trading session. |
| Close | The stock price at market closing. |
| Adj Close | The adjusted closing price accounts for corporate actions like dividends and stock splits. |
| Volume | The number of shares traded during the session. |
A Jupyter Notebook that contains:
- Data Preprocessing – Cleaning and preparing the dataset.
- Feature Engineering – Extracting key financial indicators.
- Model Training – Implementing LSTM, RNN, Logistic Regression, and XGBoost for prediction.
- Performance Evaluation – Comparing accuracy, RMSE, and stock price trends.
After running the stock prediction models, you can expect the following outcomes:
- The models will generate future stock price predictions based on historical data.
- Predictions will be visualized using line charts to compare actual vs. predicted prices.
- Evaluation metrics such as RMSE (Root Mean Square Error) and Mean Absolute Error (MAE) will be used to compare different models.
- Expected Trends:
- LSTM & RNN → Expected to capture long-term dependencies and perform well on time-series data.
- XGBoost → Expected to perform well with engineered features but may struggle with sequential dependencies.
- Logistic Regression → Suitable for trend classification but may not be as precise for exact price forecasting.
- Identifying key factors (e.g., past prices, trading volume) that influence stock movements.
- Sentiment analysis (if included) will show how news and financial sentiment affect predictions.
- Graphical representations such as:
- 📈 Stock price trends (Actual vs. Predicted)
- 📊 Feature importance for predictions
- 📉 Error analysis across models