8000 GitHub - mvladk/predictshare: ML on data set to predict share price
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

mvladk/predictshare

Repository files navigation

Creating a Python web application that uses a neural network to predict the price of Yahoo stock involves several steps, including data collection, preprocessing, model creation, training, and finally, integration into a web framework. For this example, we'll use Flask as the web framework, scikit-learn for data preprocessing, and Keras for building the neural network model. This example assumes you have basic knowledge of Python, Flask, and machine learning concepts.

Step 1: Install Required Libraries

First, ensure you have Python installed on your system. Then, install the required libraries using pip:

pip install flask numpy pandas scikit-learn keras yfinance

yfinance is used to fetch historical stock data from Yahoo Finance.

Step 2: Fetch and Preprocess the Data

Create a Python script (e.g., fetch_data.py) to fetch historical stock data and preprocess it for the neural network.

import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler

# Fetch historical stock data
def fetch_stock_data(symbol):
    stock_data = yf.download(symbol, start="2020-01-01", end="2023-01-01")
    stock_data = stock_data[['Close']]  # We'll use only the 'Close' prices
    return stock_data

# Preprocess data
def preprocess_data(stock_data):
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled_data = scaler.fit_transform(stock_data)
    
    # Create a dataset where X is the number of past days' stock prices
    # and y is the next day's stock price
    X, y = [], []
    for i in range(60, len(scaled_data)):
        X.append(scaled_data[i-60:i, 0])
        y.append(scaled_data[i, 0])
    X, y = np.array(X), np.array(y)
    X = np.reshape(X, (X.shape[0], X.shape[1], 1))
    
    return X, y, scaler

# Example usage
if __name__ == "__main__":
    data = fetch_stock_data('YHOO')
    X, y, scaler = preprocess_data(data)

Step 3: Build and Train the Neural Network

Create another script (e.g., train_model.py) to define and train your neural network model.

from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
from fetch_data import fetch_stock_data, preprocess_data

# Fetch and preprocess data
data = fetch_stock_data('YHOO')
X, y, scaler = preprocess_data(data)

# Build the model
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=(X.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(units=1))  # Prediction of the next closing value

model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(X, y, epochs=100, batch_size=32)

# Save the model and scaler
model.save('stock_prediction_model.h5')

Step 4: Create a Flask Web Application

Now, create a Flask application (app.py) that uses the trained model to predict stock prices.

from flask import Flask, request, jsonify
import numpy as np
import pandas as pd
from keras.models import load_model
from fetch_data import fetch_stock_data, preprocess_data
import yfinance as yf

app = Flask(__name__)

# Load the trained model and scaler
model = load_model('stock_prediction_model.h5')

@app.route('/predict', methods=['POST'])
def predict():
    json_input = request.json
    symbol = json_input['symbol']
    # Fetch recent stock data
    recent_data = yf.download(symbol, period="2mo", interval="1d")
    recent_data = recent_data[['Close']]
    _, _, scaler = preprocess_data(recent_data)
    
    # Preprocess the data
    last_60_days = recent_data[-60:].values
    last_60_days_scaled = scaler.transform(last_60_days)
    X_test = np.array([last_60_days_scaled])
    X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
    
    # Make prediction
    predicted_price = model.predict(X_test)
    predicted_price = scaler.inverse_transform(predicted_price)
    
    return jsonify(predicted_price=predicted_price[0][0])

if __name__ == '__main__':
    app.run(debug=True)

Step 5: Test Your Application

Run your Flask application:

python app.py

Then, you can test your prediction endpoint by sending a POST request with a JSON body containing the stock symbol to predict. You can use tools like Postman or curl for this:

curl -X POST -H "Content-Type: application/json" -d '{"symbol":"YHOO"}' http://localhost:5000/predict

This setup provides a basic framework. Note that the actual performance of the model in predicting stock prices can vary greatly and should be used with caution in real investment decisions. Additionally, the symbol for Yahoo might have changed or be different depending on when you're trying this example, so make sure to use the correct stock symbol.

The error message OSError: No file or directory found at stock_prediction_model.h5 indicates that the Flask application is trying to load a model from a file named stock_prediction_model.h5, but it cannot find the file in the expected location. This can happen for a few reasons:

  1. The model has not been saved correctly: Ensure that the code to save the model in train_model.py has been executed successfully and that there were no errors during the saving process. The line responsible for saving the model is model.save('stock_prediction_model.h5'). Make sure this line of code is executed without errors.

  2. The model is saved in a different directory: If you ran the script that saves the model in a different directory from where your Flask application (app.py) is located, the Flask application might not be able to find the model file. Ensure that the model file is in the same directory as your Flask application or provide an absolute path to the model file when loading it.

  3. File permissions: There might be a file permission issue preventing the Flask application from accessing the model file. Ensure that the file permissions allow your application to read the file.

To resolve the issue, you can follow these steps:

  • Verify Model Saving: First, ensure that the model is being saved correctly by checking the output of the script train_model.py and confirming that there are no errors during the saving process.

  • Check the File Location: Ensure that stock_prediction_model.h5 is in the same directory as app.py. If it's not, move it to the correct directory or modify the load_model path in app.py to point to the correct location. For example, if your model is in a subdirectory named models, you would load it like this:

    model = load_model('models/stock_prediction_model.h5')
  • Use an Absolute Path: If the above steps don't work, try using an absolute path to the model file when loading it. An absolute path specifies the complete directory path from the root of the filesystem to the file, which can help avoid issues with relative paths.

    model = load_model('/path/to/your/model/stock_prediction_model.h5')

Replace /path/to/your/model/ with the actual path to the directory containing your model file.

After ensuring the model file is correctly saved and located in the right directory, try running your Flask application again.

The warning you're seeing is due to a change in the recommended file format for saving models in Keras. The newer versions of Keras prefer the .keras file extension for saving models, as it uses the SavedModel format which is more versatile and supports more features compared to the older HDF5 format. To address both the warning and the OSError you encountered, let's update the model saving and loading process accordingly.

Updating Model Saving Process

In your train_model.py script, when you save the model, change the file extension from .h5 to .keras:

# Save the model
model.save('stock_prediction_model.keras')

Updating Model Loading Process

In your Flask application (app.py), update the model loading line to reflect the new file extension:

# Load the trained model
model = load_model('stock_prediction_model.keras')

Addressing the OSError

The OSError: No file or directory found at stock_prediction_model.h5 indicates that the Flask application is unable to find the model file. This could be due to several reasons:

  1. Model File Not Saved Properly: Ensure that the model training and saving process (train_model.py) is completed without errors before you try to load the model in your Flask application.

  2. Incorrect File Path: Make sure that the Flask application and the model file are in the same directory, or provide the correct relative or absolute path to the model file when loading it.

  3. File Naming Mismatch: Ensure that the filename used to save the model matches the filename used when loading the model. After updating to the .keras extension, make sure this change is reflected everywhere the file is referenced.

After making these changes, your model saving and loading process should be updated to the recommended format, and the OSError should be resolved provided the file path and naming are correct. Remember to rerun your model training script to save the model in the new format before attempting to load it in your Flask application.

The error you're encountering, ImportError: cannot import name 'EVENT_TYPE_OPENED' from 'watchdog.events', suggests there's an issue with the watchdog package, which is a dependency used by Flask's reloader in debug mode. This could be due to an incompatible version of watchdog or a problem with the installation.

Here are a few steps to resolve this issue:

1. Update watchdog

First, try updating watchdog to the latest version, as the issue might have been fixed in a newer release:

pip install --upgrade watchdog

2. Reinstall watchdog

If updating doesn't solve the problem, try uninstalling and then reinstalling watchdog:

pip uninstall watchdog
pip install watchdog

3. Check for Known Issues

Look for any known issues on the watchdog GitHub issues page. Sometimes, specific versions of libraries have compatibility issues with certain systems or Python versions, and these are usually documented in the project's issues or release notes.

4. Run Flask Without Debug Mode

If you're still encountering the issue and need an immediate workaround, you can run the Flask application without debug mode. This isn't a solution to the problem but can help you proceed with development until you find a fix. Note that running without debug mode disables automatic reloading and debug information:

if __name__ == '__main__':
    app.run(debug=False)

5. Use a Virtual Environment

If you're not already doing so, consider using a Python virtual environment for your project. This can help manage dependencies and avoid conflicts between different projects:

# Create a virtual environment
python3 -m venv myenv

# Activate the virtual environment
# On Windows
myenv\Scripts\activate
# On Unix or MacOS
source myenv/bin/activate

# Install dependencies in the virtual environment
pip install flask numpy pandas scikit-learn keras yfinance watchdog

6. Check Your Python and Flask Versions

Ensure that your Python version is compatible with the versions of Flask and watchdog you're using. Sometimes, newer versions of libraries may not support older versions of Python, and vice versa.

If none of these steps resolve the issue, consider seeking help from the Flask or watchdog communities. You might find others who have encountered and solved the same problem.

curl -X POST -H "Content-Type: application/json" -d '{"symbol":"YHOO"}' http://localhost:5000/predict

curl -X POST -H "Content-Type: application/json" -d '{"symbol":"AAPL"}' http://localhost:5000/predict

To modify your fetch_data.py script to fetch 24 months (2 years) of historical stock data instead of the shorter period previously specified, you'll need to adjust the yf.download function call within the fetch_stock_data function. Here's how you can do it:

import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler

# Fetch historical stock data
def fetch_stock_data(symbol):
    # Adjusted to fetch 24 months of data
    stock_data = yf.download(symbol, period="24mo", interval="1d")
    stock_data = stock_data[['Close']]  # We'll use only the 'Close' prices
    return stock_data

# Preprocess data
def preprocess_data(stock_data):
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled_data = scaler.fit_transform(stock_data)
    
    # Create a dataset where X is the number of past days' stock prices
    # and y is the next day's stock price
    X, y = [], []
    for i in range(60, len(scaled_data)):
        X.append(scaled_data[i-60:i, 0])
        y.append(scaled_data[i, 0])
    X, y = np.array(X), np.array(y)
    X = np.reshape(X, (X.shape[0], X.shape[1], 1))
    
    return X, y, scaler

# Example usage
if __name__ == "__main__":
    data = fetch_stock_data('AAPL')  # Example symbol, replace with your target
    X, y, scaler = preprocess_data(data)

This change will ensure that when you call fetch_stock_data, it will attempt to retrieve 24 months of closing price data for the specified stock symbol. Fetching more data can help address issues related to insufficient historical data for model training or prediction. However, remember that more data also means more processing time and potentially more noise in the data, depending on your model and prediction goals.

After making this change, ensure that any script or function that calls fetch_stock_data or expects data from it is aware of the potentially larger dataset and adjusts accordingly. This might include checking the size of the returned data before attempting to preprocess or use it, especially if your model relies on a specific window size or data structure.

If you're still encountering the "error": "Not enough data to make a prediction" message after modifying fetch_data.py to fetch 24 months of data, it suggests that the issue might not be solely due to the amount of historical data being fetched. Here are a few steps to troubleshoot and resolve this issue:

1. Verify Data Availability

First, ensure that the stock symbol you're querying (AAPL in the example) actually has 24 months of data available on Yahoo Finance. Some stocks, especially those of recently listed companies, might not have 24 months of historical data.

2. Check the Data Fetching Process

Ensure that the yf.download function is successfully fetching the data. You can add print statements or log the shape of the stock_data DataFrame after fetching to ensure it's not empty and contains a sufficient number of rows.

def fetch_stock_data(symbol):
    stock_data = yf.download(symbol, period="24mo", interval="1d")
    stock_data = stock_data[['Close']]  # Focus on 'Close' prices
    print(f"Fetched {len(stock_data)} rows for {symbol}.")  # Debugging line
    return stock_data

3. Adjust the Preprocessing Function

The preprocessing function assumes there's enough data to create sequences of 60 days for training. If, after fetching 24 months of data, you're still getting this error, it might be due to the preprocessing step not handling the data correctly. Ensure that the preprocessing logic correctly forms the input sequences for your model. If your dataset is slightly short (e.g., due to market holidays or weekends), consider lowering the window size temporarily to test the functionality.

4. Modify the Flask Route to Handle Edge Cases

In your Flask application, ensure that the route which handles predictions is correctly checking the length of the fetched data before attempting to preprocess and predict. If the data length is insufficient even after fetching 24 months of data, the issue might lie in how the data is being processed or in an incorrect assumption about the data's shape.

5. Debugging the Flask Application

Add more detailed error handling and logging around the data fetching and preprocessing steps in your Flask application. This can help identify exactly where the process is failing.

@app.route('/predict', methods=['POST'])
def predict():
    json_input = request.json
    symbol = json_input['symbol']
    recent_data = yf.download(symbol, period="24mo", interval="1d")
    recent_data = recent_data[['Close']]

    if recent_data.empty or len(recent_data) < 60:
        return jsonify(error="Not enough data to make a prediction"), 400

    try:
        X, y, scaler = preprocess_data(recent_data)
        # Further processing and prediction...
    except Exception as e:
        return jsonify(error=str(e)), 500

6. Ensure Correct Symbol and Data Range

Double-check the stock symbol and ensure it's correct and actively traded. Also, verify that Yahoo Finance provides data for the requested period for this symbol.

If you've gone through these steps and are still facing issues, consider providing more context or specific error messages you're encountering for further assistance.

after modifying in app.py: recent_data = yf.download(symbol, period="24mo", interval="1d")

got a result:

"predicted_price": 411.7828369140625

Great! Modifying the yf.download call in your app.py to fetch 24 months of data and successfully receiving a predicted price indicates that the adjustment resolved the issue related to insufficient historical data for making a prediction. With more data, your model had enough input to generate a forecast for the stock price.

Next Steps

Now that you've successfully obtained a predicted stock price, here are a few considerations and potential next steps to enhance your project:

  1. Model Evaluation and Tuning: Continuously evaluate the performance of your model using various metrics (e.g., RMSE, MAE) and consider tuning your model or experimenting with different architectures to improve accuracy.

  2. Feature Engineering: Explore the possibility of including more features in your model beyond the closing price, such as opening price, high, low, volume, and technical indicators (e.g., moving averages, RSI). More relevant features can potentially improve the model's predictive capabilities.

  3. Data Splitting: Ensure you're splitting your data into training and testing sets to evaluate your model's performance on unseen data. This practice helps gauge how well your model generalizes to new data.

  4. Regular Updates: Stock market conditions change over time, so regularly retraining your model with the most recent data can help maintain its relevance and accuracy.

  5. User Interface Enhancements: If you plan to make your application accessible to others, consider enhancing the user interface with more detailed predictions, historical data visualization, and explanations of how predictions are made.

  6. Legal and Ethical Considerations: Be mindful of the legal and ethical considerations when providing financial advice or predictions. Clearly state the limitations of your model and that it should not be used as the sole basis for investment decisions.

  7. Deployment: If you're considering deploying your application, research and choose an appropriate production-grade server and consider the security aspects of your application, especially if you're dealing with sensitive user data.

  8. Feedback Loop: Implement a mechanism to collect feedback on your model's predictions to continuously improve its performance. This could be as simple as comparing predicted prices with actual prices over time and adjusting your model accordingly.

Congratulations on reaching this milestone in your project! Remember, machine learning and financial modeling are iterative processes that involve continuous learning and improvement.

curl -X POST -H "Content-Type: application/json" -d '{"symbol":"AAPL"}' http://localhost:5000/predict

About

ML on data set to predict share price

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

0