Categorise transactions automatically

data-analysis | intermediate | proven

The problem

Your accounting requires transactions to be coded to cost centres, projects, or categories, but manual categorisation is time-consuming and inconsistent. Different staff members code the same supplier differently. It takes hours each month and delays management reports.

The solution

Train a simple text classifier on your historical categorised transactions to suggest categories for new ones. The model learns from transaction descriptions, amounts, and dates to predict the right category. A human always reviews and approves each suggestion. Accuracy improves as the model is retrained on those corrections.

What you get

An automated system that suggests transaction categories with confidence scores. Staff review each suggestion and approve or correct it. Over time, the model learns from corrections and accuracy improves. This reduces categorisation time by 60-80% while maintaining human oversight.

Before you start

At least 12 months of correctly categorised transaction history (1000+ transactions)
Stable category structure (categories haven't changed significantly)
Access to transaction data exports from your accounting system
Basic Python knowledge or willingness to adapt the example code
Process for staff to review and correct suggestions
DATA PROTECTION: Before uploading transaction data to cloud platforms like Google Colab, check your data protection policy. Transaction descriptions may contain donor or beneficiary names - anonymise or remove these before processing. Consider whether local processing (Jupyter Notebook on your own machine) is more appropriate for financial data.
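As a rough illustration of the anonymisation point above, the sketch below replaces names from a list you supply with a placeholder before the data leaves your machine. The function name `redact_names`, the `known_names` list, and the `[REDACTED]` placeholder are all assumptions made for this example. It only catches names you already know about (for instance, exported from your donor database), so treat it as a first pass rather than a guarantee.

```python
import re

import pandas as pd


def redact_names(descriptions, known_names, placeholder="[REDACTED]"):
    """Replace each known name (whole words, any case) with a placeholder.

    `known_names` is a hypothetical list you would build yourself,
    e.g. from your donor or beneficiary records.
    """
    cleaned = descriptions.fillna("")
    if not known_names:
        return cleaned
    # Longest names first so a full name is matched before a shorter one
    ordered = sorted(known_names, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in ordered) + r")\b",
        flags=re.IGNORECASE,
    )
    return cleaned.str.replace(pattern, placeholder, regex=True)
```

You would apply this to the description column before uploading or training, for example `df['description'] = redact_names(df['description'], donor_names)`, where `donor_names` is your own list.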

When to use this

Processing 50+ transactions per month that need categorising
Consistent category structure over time
Staff time on manual categorisation is significant (4+ hours per month)
Categories are reasonably predictable from transaction description and amount
You have enough historical data to train on (1000+ categorised transactions)

When not to use this

Category structure changes frequently (makes training data obsolete)
Very small transaction volumes (manual is faster than maintaining automation)
High accuracy is legally required with full audit trail (model suggestions might not meet compliance needs)
Transaction descriptions are too vague to be predictive
No one can spend time on initial setup and monthly retraining
Your accounting software already has bank rules or auto-categorisation features - try these first before building a custom ML model (Xero, QuickBooks, Sage all have built-in rules engines)

Steps

1
Export and clean historical transaction data
Export 12-24 months of categorised transactions from your accounting system. Include: date, description, amount, existing category, and any other relevant fields. Remove any uncategorised transactions. Clean the data: standardise category names (remove typos, merge similar categories), remove test transactions or errors. If you have multiple currencies, either filter to one currency or add a currency column and convert amounts to a base currency (e.g., GBP) to ensure consistent amount-based patterns.
2
Prepare features for the model
Create features the model can learn from: extract keywords from transaction descriptions, include the amount as a feature, extract day of week and month from the date (some categories have time patterns), and extract the vendor name if it is not already a field. Features that capture real differences between categories help the model; features unrelated to the category only add noise.
3
Split data into training and test sets
Hold back the most recent 2-3 months as a test set. Use the older data for training. This simulates how the model will perform on future transactions. Check that all your categories appear in both training and test sets (important for rare categories).
4
Train classifier model
Start with a simple model: RandomForestClassifier or LogisticRegression from scikit-learn. Train on your features and categories. The model learns patterns like 'transactions from [grant body] with amount ~£10,000 quarterly → Restricted Funds - Grant Income category' or 'volunteer expenses around £30-50 → Volunteer Costs category'. Don't over-complicate initially - simple models often work well.
5
Evaluate accuracy on test set
Run the model on your test set and check accuracy. Aim for 70%+ overall accuracy. Check per-category accuracy - some categories might be easy (100% accurate) while others are hard (50%). Identify which categories the model struggles with. Review misclassifications to understand patterns.
6
Build review interface
Create a simple spreadsheet or web form where staff review suggestions. Show the transaction details, suggested category, and confidence score. Include a simple 'Approve' or 'Correct to:' workflow. Flag low-confidence suggestions (below 70%) for closer review. Log all corrections for retraining. For non-technical staff: export predictions to CSV (using df.to_csv('predictions.csv')) so they can review in Excel or Google Sheets.
7
Set up monthly retraining process
Each month: collect approved/corrected transactions, add them to the training data, retrain the model, and evaluate accuracy on a new test set. Track accuracy over time - it should improve as training data grows. Adjust features if you spot patterns the model misses. This maintenance is crucial for sustained performance.
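Steps 1 to 3 can be sketched in pandas roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes your export has 'date', 'description', 'amount' and 'category' columns, and the function name `prepare_and_split`, the `aliases` mapping, and the `test_months` parameter are hypothetical names chosen for this example.

```python
import pandas as pd


def prepare_and_split(df, aliases=None, test_months=3):
    """Clean categories, add date features, and hold back recent months.

    `aliases` is a hypothetical mapping from misspelt or duplicate
    category names to the name you want to keep.
    """
    aliases = aliases or {}

    # Step 1: drop uncategorised rows, tidy whitespace, merge aliases
    df = df.dropna(subset=["category"]).copy()
    df["category"] = (
        df["category"]
        .str.strip()
        .str.replace(r"\s+", " ", regex=True)
        .map(lambda name: aliases.get(name, name))
    )
    df = df[df["category"] != ""].copy()

    # Step 2: simple date features
    # (for dd/mm/yyyy exports you may need pd.to_datetime(..., dayfirst=True))
    df["date"] = pd.to_datetime(df["date"])
    df["day_of_week"] = df["date"].dt.dayofweek  # 0 = Monday
    df["month"] = df["date"].dt.month

    # Step 3: the most recent `test_months` months become the test set
    cutoff = df["date"].max() - pd.DateOffset(months=test_months)
    train = df[df["date"] <= cutoff]
    test = df[df["date"] > cutoff]

    # Report categories that appear on only one side of the split
    missing_from_train = sorted(set(test["category"]) - set(train["category"]))
    missing_from_test = sorted(set(train["category"]) - set(test["category"]))
    return train, test, missing_from_train, missing_from_test
```

If either of the two "missing" lists is non-empty, the model has either never seen a category it will be tested on, or cannot be evaluated on one it was trained on. In that case you might lengthen the test window or merge very rare categories.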

Example code

Basic transaction classifier with scikit-learn

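Train a RandomForest classifier on transaction descriptions and amounts to predict categories. Note: the raw amount feature is used without scaling. Tree-based models such as RandomForest split on thresholds, so rescaling the amount usually has little effect on their predictions. If you switch to LogisticRegression, or you notice very large or very small transactions being systematically misclassified, try normalising the amount (e.g., using np.log1p(np.abs(amount)), which also copes with zero and negative amounts, or StandardScaler) so that it does not dominate the text features.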

import numpy as np
import pandas as pd
import joblib
from scipy.sparse import hstack
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Load historical categorised transactions
# (expects 'description', 'amount' and 'category' columns)
df = pd.read_csv('categorised_transactions.csv')

# Prepare features: description text, numeric amount, and target category
X_text = df['description'].fillna('')
X_amount = df['amount']
y = df['category']

# Split data (80% train, 20% test)
# This is a random split for simplicity; to follow step 3, hold back
# the most recent 2-3 months by date instead
X_text_train, X_text_test, X_amount_train, X_amount_test, y_train, y_test = train_test_split(
    X_text, X_amount, y, test_size=0.2, random_state=42
)

# Vectorize text descriptions (convert text to numbers)
# Fit on the training text only, then reuse the same vocabulary for the test text
vectorizer = TfidfVectorizer(max_features=500, ngram_range=(1, 2))
X_text_train_vec = vectorizer.fit_transform(X_text_train)
X_text_test_vec = vectorizer.transform(X_text_test)

# Combine text features with amount (amount becomes the last column)
X_train = hstack([X_text_train_vec, X_amount_train.values.reshape(-1, 1)])
X_test = hstack([X_text_test_vec, X_amount_test.values.reshape(-1, 1)])

# Train classifier
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Evaluate
y_pred = clf.predict(X_test)
print(classification_report(y_test, y_pred))

# Save model for future use
joblib.dump(clf, 'transaction_classifier.pkl')
joblib.dump(vectorizer, 'vectorizer.pkl')

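# Predict the category of a single new transaction
def predict_category(description, amount):
    """Return (predicted category, confidence) for one transaction."""
    text_vec = vectorizer.transform([description])
    # Append the amount as an explicit 1x1 array, matching the training layout
    features = hstack([text_vec, np.array([[amount]])])
    prediction = clf.predict(features)[0]
    # Confidence is the highest class probability reported by the forest
    probability = clf.predict_proba(features)[0].max()
    return prediction, probability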

# Example - charity transactions
category, confidence = predict_category("National Lottery Community Fund Q3 Grant", 15000.00)
print(f"Predicted category: {category} (confidence: {confidence:.1%})")

category, confidence = predict_category("Volunteer mileage claim - home visits", 34.50)
print(f"Predicted category: {category} (confidence: {confidence:.1%})")
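For the review workflow in step 6, one possible approach is to score a whole batch of new transactions at once and export a sheet for staff. The sketch below assumes a fitted classifier and vectorizer like the ones above, and a DataFrame with 'description' and 'amount' columns. The function name `build_review_sheet` and the output column names are hypothetical; the 70% threshold comes from step 6.

```python
import pandas as pd
from scipy.sparse import hstack


def build_review_sheet(new_df, clf, vectorizer, threshold=0.70):
    """Add suggested category, confidence, and a review flag to new_df."""
    # Build features in the same layout used for training: text, then amount
    text_vec = vectorizer.transform(new_df["description"].fillna(""))
    features = hstack([text_vec, new_df["amount"].values.reshape(-1, 1)])

    # One row of class probabilities per transaction
    proba = clf.predict_proba(features)

    sheet = new_df.copy()
    sheet["suggested_category"] = clf.classes_[proba.argmax(axis=1)]
    sheet["confidence"] = proba.max(axis=1)
    sheet["needs_close_review"] = sheet["confidence"] < threshold
    sheet["approved_category"] = ""  # left blank for staff to fill in

    # Least confident suggestions first, so they are reviewed first
    return sheet.sort_values("confidence")
```

You could then write the result out with `build_review_sheet(new_df, clf, vectorizer).to_csv('predictions.csv', index=False)` and, once staff have filled in the approved category, append those rows to your training data for the monthly retraining in step 7.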