A machine learning system for identifying recurring financial transactions from bank data.
Recur-scan analyzes transaction history to automatically detect recurring payments, subscriptions, and other regular financial commitments. It uses a combination of pattern recognition and machine learning to identify:
- Monthly subscriptions (streaming services, memberships, etc.)
- Regular bill payments (utilities, rent, etc.)
- Periodic deposits (paychecks, dividends, etc.)
- Varying-amount recurring transactions
The system processes transaction data from multiple sources, extracts relevant features, and trains models to classify transactions as recurring or non-recurring with high accuracy.
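To make the feature-extraction step concrete, here is a minimal sketch of the kind of signal that separates recurring from one-off transactions: how regular the gaps between payments are, and how stable the amounts are. The function `interval_features` and its output keys are hypothetical names chosen for illustration; this is not the project's `recur_scan.features.get_features()` implementation.

```python
from datetime import date
from statistics import mean, pstdev


def interval_features(dates: list[date], amounts: list[float]) -> dict[str, float]:
    """Hypothetical features for all transactions from a single merchant.

    Assumption: the caller has already grouped transactions by merchant.
    """
    ordered = sorted(dates)
    # Day gaps between consecutive transactions; a monthly bill gives ~30.
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    avg_amount = mean(amounts) if amounts else 0.0
    return {
        "n_transactions": float(len(dates)),
        "mean_gap_days": mean(gaps) if gaps else 0.0,
        # Low spread in the gaps suggests a fixed schedule.
        "gap_std_days": pstdev(gaps) if gaps else 0.0,
        # Coefficient of variation: 0 for fixed amounts, larger for varying ones.
        "amount_cv": pstdev(amounts) / abs(avg_amount) if avg_amount else 0.0,
    }


# Example: a fixed-price subscription billed on the 15th of each month.
subscription = interval_features(
    [date(2024, 1, 15), date(2024, 2, 15), date(2024, 3, 15), date(2024, 4, 15)],
    [9.99, 9.99, 9.99, 9.99],
)
```

A small `gap_std_days` combined with a small `amount_cv` would point to a fixed subscription, while a small `gap_std_days` with a larger `amount_cv` would be more typical of a varying-amount bill such as a utility.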
- Requires uv. Follow the uv installation instructions to install it.
make install
Re-run make install whenever new dependencies have been added.
uv add <package>
uv remove <package>
make check
make test
uv run scripts/<script_name>.py
source .venv/bin/activate  # activate the virtual environment
python scripts/<script_name>.py  # run the script
- src/recur_scan/ - Core library for feature extraction and model implementation
- scripts/ - Data processing, training, and evaluation scripts
  - 10_create_training_data.py - Prepares transaction data for labeling
  - 15_gather_questions.py - Identifies ambiguous labels for review
  - 30_train.py - Trains the recurring transaction detection model
- tests/ - Unit and integration tests
- Raw transaction data is processed by 10_create_training_data.py into balanced datasets
- Labelers mark transactions as recurring or non-recurring
- 15_gather_questions.py identifies ambiguous cases for review
- Features are extracted using recur_scan.features.get_features()
- Model is trained using 30_train.py
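As an illustration of the review step in this workflow, the sketch below shows one plausible way to flag ambiguous cases: the same merchant and amount labeled both recurring and non-recurring. The function `find_ambiguous` and the tuple layout are assumptions made for this example; the actual logic in 15_gather_questions.py is not shown here and may differ.

```python
from collections import defaultdict


def find_ambiguous(labeled: list[tuple[str, float, bool]]) -> list[tuple[str, float]]:
    """Return (name, amount) pairs that received conflicting labels.

    Assumption: each row is (merchant_name, amount, is_recurring).
    """
    labels_by_key: dict[tuple[str, float], set[bool]] = defaultdict(set)
    for name, amount, is_recurring in labeled:
        labels_by_key[(name, amount)].add(is_recurring)
    # A key is ambiguous when labelers disagreed about it.
    return sorted(key for key, labels in labels_by_key.items() if len(labels) > 1)


rows = [
    ("Netflix", 15.49, True),
    ("Netflix", 15.49, False),  # conflicts with the row above
    ("Corner Cafe", 4.50, False),
]
questions = find_ambiguous(rows)
```

Cases returned this way could be sent back to labelers before features are extracted and the model is trained.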
This project is licensed under the Apache License. See the LICENSE file for details.