Spot donors who might stop giving

fundraising, intermediate, proven

The problem

You've got hundreds or thousands of regular donors, but you don't know which ones are thinking about stopping. By the time someone cancels their direct debit, it's too late. You need a way to spot at-risk donors early so you can reach out before they lapse.

The solution

You'll use Google Colab (a free online notebook for running Python code) with Gemini to help you build a simple prediction model. Don't worry if that sounds technical. Colab lets you run code in your browser without installing anything, and Gemini can explain what each bit does and help you adapt it to your data. You'll export your donation history as a CSV, upload it to Colab, and use a recipe of code that looks at patterns like: when did each donor last give? How often do they give? Has their giving dropped off? The model learns what "about to lapse" looks like from donors who stopped in the past, then flags current donors showing similar patterns.

What you get

A spreadsheet of your donors ranked by risk score (0-100%). For each donor, you'll see the reasons behind their score, things like "giving frequency dropped 40%" or "no gift in 8 months". You can filter by risk level and export lists for your retention campaigns. The whole thing runs in your browser and you keep your data.

Before you start

At least 2 years of donation history (more is better)
A CSV export from your CRM with donor IDs, dates, and amounts (no names or contact details needed)
A Google account (for Colab and Gemini)
About 2-3 hours to work through it the first time, less once you've done it before
A quick check with whoever handles data protection at your organisation (see the data protection step below)

When to use this

You've got at least 500 donors with a couple of years of history
You have a regular giving or Direct Debit programme. Churn models work best on recurring gifts rather than one-off donations
Retention matters to you and you'd rather target outreach than email everyone
You want to understand why someone's flagged as at-risk, not just get a score
Your CRM doesn't have prediction built in, or charges extra for it

When not to use this

You've got fewer than 100 donors. There just isn't enough data for patterns to emerge
Your CRM already does this and you're happy with it
You don't have time to act on the results. No point predicting if you can't follow up

Steps

1
Check your data protection basis
Before you start, have a quick word with whoever handles data protection at your organisation. The good news: you don't need names, emails, or addresses for this analysis, just donor IDs, dates, and amounts. You'll likely rely on "legitimate interests" as your lawful basis under UK GDPR, since retaining donors is clearly in the charity's interest and donors would reasonably expect you to try to keep them engaged. Your organisation should document this with a short Legitimate Interests Assessment (LIA). If you're already doing any kind of donor analysis, you've probably covered this. The ICO has guidance on legitimate interests if you need it.
2
Get your data out of your CRM
Export a CSV with donor ID, donation date, and donation amount. You don't need names or contact details. If you want to be extra cautious, replace the real donor IDs with random numbers before uploading (you can match them back later using a lookup table you keep locally). Include everyone, not just current donors. You need to see who lapsed in the past so the model can learn what that looks like.
3
Open Google Colab and upload your CSV
Go to colab.research.google.com and create a new notebook. You can upload your CSV directly. Your data stays in your Colab session and isn't used by Google for training. Colab gives you a Python environment in your browser with all the libraries you need already installed.
4
Use Gemini to help you understand the starter code
Copy the example code below into Colab. If anything's confusing, paste the code (not your data) into Gemini and ask "what does this bit do?" or "how do I change this for my data?". Gemini's good at explaining Python code in plain English. Keep your actual donor data in Colab, not in Gemini.
5
Calculate the key numbers for each donor
The code calculates things like: how long since they last gave, how often they give, whether their giving is trending up or down. These patterns are what the model uses to spot risk. Run the code and check the output makes sense for donors you know.
6
Train the model on your historical data
The model looks at donors who lapsed in the past and learns what their patterns looked like before they stopped. It's like showing it examples and saying "this is what at-risk looks like". Training takes a few seconds.
7
Check how well it works
The code will show you how accurate the model is on data it hasn't seen before. You're looking for a balance: catching most of the real risks without flagging so many people your team can't follow up.
8
Get your risk scores
Run the model on your current donors and export the results. You'll get a spreadsheet with risk scores and the top reasons for each score. Sort by risk and you've got your priority list.
9
Plan what you will actually do (optional)
The prediction is only useful if you act on it. High-risk major donors might get a phone call. Medium-risk might get a personal email. Low-risk might just go into a re-engagement campaign. Decide this before you run it.
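Step 2 suggests swapping real donor IDs for random ones and keeping a lookup table locally. One way to do that is sketched below. The function name pseudonymise_ids, the pseudo_id column and the sample rows are made up for illustration, and the column names assume the same CSV layout as the example code. Because the point is to keep real IDs off any online service, this would be run on your own computer before uploading anything to Colab.

```python
import secrets

import pandas as pd


def pseudonymise_ids(donations: pd.DataFrame, id_col: str = "donor_id"):
    """Replace real donor IDs with random tokens.

    Returns the pseudonymised data and a lookup table (real ID -> token).
    """
    real_ids = donations[id_col].unique()

    # Draw one random 16-character hex token per donor, retrying on the
    # (very unlikely) chance of a duplicate
    tokens = set()
    while len(tokens) < len(real_ids):
        tokens.add(secrets.token_hex(8))

    lookup = pd.DataFrame({id_col: real_ids, "pseudo_id": list(tokens)})
    mapping = dict(zip(lookup[id_col], lookup["pseudo_id"]))

    out = donations.copy()
    out[id_col] = out[id_col].map(mapping)
    return out, lookup


# Tiny made-up example, not real donor data
sample = pd.DataFrame({
    "donor_id": ["D001", "D002", "D001"],
    "donation_date": ["2023-01-05", "2023-02-10", "2023-03-05"],
    "amount": [10.0, 25.0, 10.0],
})
pseudo_sample, lookup_table = pseudonymise_ids(sample)
print(pseudo_sample)
```

Save lookup_table somewhere only you can reach (for example with lookup_table.to_csv('id_lookup.csv', index=False)) and upload only the pseudonymised file. When the risk scores come back, merge them with the lookup table on pseudo_id to recover the real IDs.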

Example code

Calculate the key numbers for each donor

This turns your raw donation list into useful numbers: how recently each donor gave, how often they give, how much they give. Paste this into Colab and ask Gemini if you need help adapting it to your column names.

import pandas as pd
from datetime import datetime

# Load your CSV (change the filename to match yours)
donations = pd.read_csv('donations.csv')
donations['donation_date'] = pd.to_datetime(donations['donation_date'])

today = datetime.now()

# Calculate key features for each donor
features = donations.groupby('donor_id').agg(
    recency_days=('donation_date', lambda x: (today - x.max()).days),
    tenure_days=('donation_date', lambda x: (today - x.min()).days),
    frequency=('donation_date', 'count'),
    total_given=('amount', 'sum'),
    avg_gift=('amount', 'mean'),
).reset_index()

# Have a look at what you've got
print(features.head(10))
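The steps mention checking whether giving is trending up or down, which the features above don't capture. A minimal sketch of one possible trend feature follows: it compares the number of gifts in the last 12 months with the 12 months before that. The function name trend_features, the new column names and the sample rows are assumptions for illustration, and treating "no gifts in the earlier year" as zero change is a simplification you may want to handle differently.

```python
import pandas as pd


def trend_features(donations: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Count gifts in the last 12 months vs the 12 months before that."""
    days_ago = (as_of - donations["donation_date"]).dt.days
    recent = donations[(days_ago >= 0) & (days_ago < 365)]
    previous = donations[(days_ago >= 365) & (days_ago < 730)]

    out = pd.DataFrame({"donor_id": donations["donor_id"].unique()})
    out["gifts_last_12m"] = (
        out["donor_id"].map(recent.groupby("donor_id").size()).fillna(0).astype(int)
    )
    out["gifts_prev_12m"] = (
        out["donor_id"].map(previous.groupby("donor_id").size()).fillna(0).astype(int)
    )

    # Relative change in gift count, e.g. -0.5 means frequency halved.
    # Donors with no gifts in the earlier year get 0.0 (an assumption).
    prev_nonzero = out["gifts_prev_12m"].where(out["gifts_prev_12m"] > 0)
    change = (out["gifts_last_12m"] - out["gifts_prev_12m"]) / prev_nonzero
    out["frequency_change"] = change.fillna(0.0)
    return out


# Made-up example: donor A slows down, donor B is steady, donor C is new
sample = pd.DataFrame({
    "donor_id": ["A"] * 6 + ["B"] * 2 + ["C"],
    "donation_date": pd.to_datetime([
        "2022-03-01", "2022-06-01", "2022-08-01", "2022-10-01",
        "2023-06-01", "2023-10-01",
        "2022-11-01", "2023-11-01",
        "2023-12-01",
    ]),
    "amount": [10.0] * 9,
})
trends = trend_features(sample, pd.Timestamp("2024-01-01"))
print(trends)
```

To use it alongside the features above, something like features = features.merge(trend_features(donations, pd.Timestamp(today)), on='donor_id') should add the three new columns before training.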

Train the model and get predictions

This trains the model on past data and then scores your donors. The output ranks donors by risk and shows which features drive the scores overall.

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# First, define who is 'lapsed': donors who haven't given in 12+ months
# This creates the target variable the model will learn to predict
features['lapsed'] = (features['recency_days'] > 365).astype(int)

# WARNING: data leakage
# 'lapsed' is defined directly from 'recency_days', and 'recency_days' is also a
# predictor below, so the model can read the answer straight off that one column.
# The accuracy figures printed further down will therefore look near-perfect and
# say little about real-world performance. For production use, you should:
# 1. Calculate features from a historical snapshot (e.g., as of 12 months ago)
# 2. Predict who lapsed between then and now
# Treat this version as a learning example only.

X = features.drop(['donor_id', 'lapsed'], axis=1)
y = features['lapsed']

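# Split data: 80% to train, 20% to test
# random_state makes the split repeatable; stratify keeps the lapsed share equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train the model
model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42)
model.fit(X_train, y_train)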

# See how well it works
print(classification_report(y_test, model.predict(X_test)))

# Get risk scores for everyone (probability of lapsing)
features['risk_score'] = model.predict_proba(X)[:, 1] * 100

# See which features matter most
for name, importance in zip(X.columns, model.feature_importances_):
    print(f"{name}: {importance:.1%}")

# Export your results (index=False leaves out pandas' internal row numbers)
features.sort_values('risk_score', ascending=False).to_csv('donor_risk_scores.csv', index=False)
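The warning in the code above recommends training on a historical snapshot rather than on today's recency. A rough sketch of that approach follows, under some assumptions: the helper names features_as_of and lapsed_after are made up, "active" is taken to mean "gave in the previous 365 days", and the donation data is synthetic so the block runs on its own. Features are calculated as they would have looked 12 months before the latest gift in the file, the label is whether the donor gave nothing in the following 12 months, and the trained model then scores donors who are active now.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

FEATURE_COLS = ["recency_days", "tenure_days", "frequency", "total_given", "avg_gift"]


def features_as_of(donations, as_of, window_days=365):
    """Features using only gifts up to as_of, for donors active at that date."""
    history = donations[donations["donation_date"] <= as_of]
    feats = history.groupby("donor_id").agg(
        last_gift=("donation_date", "max"),
        first_gift=("donation_date", "min"),
        frequency=("donation_date", "count"),
        total_given=("amount", "sum"),
        avg_gift=("amount", "mean"),
    ).reset_index()
    feats["recency_days"] = (as_of - feats["last_gift"]).dt.days
    feats["tenure_days"] = (as_of - feats["first_gift"]).dt.days
    # Donors who had already stopped by as_of are not "at risk", so drop them
    return feats[feats["recency_days"] <= window_days].copy()


def lapsed_after(donations, donor_ids, as_of, window_days=365):
    """1 if the donor gave nothing in the window after as_of, else 0."""
    end = as_of + pd.Timedelta(days=window_days)
    later = donations[
        (donations["donation_date"] > as_of) & (donations["donation_date"] <= end)
    ]
    return (~donor_ids.isin(later["donor_id"])).astype(int)


# Synthetic data: 120 monthly donors who all start in January 2021 and
# stop after between 1 and 40 gifts. Replace with your own CSV.
rows = []
for donor in range(120):
    n_gifts = 1 + donor % 40
    for date in pd.date_range("2021-01-01", periods=n_gifts, freq="MS"):
        rows.append({"donor_id": donor, "donation_date": date, "amount": 10.0})
donations = pd.DataFrame(rows)

latest = donations["donation_date"].max()
cutoff = latest - pd.Timedelta(days=365)

# Training set: features at the cutoff, label from what happened afterwards
train = features_as_of(donations, cutoff)
train["lapsed"] = lapsed_after(donations, train["donor_id"], cutoff)

X_train, X_test, y_train, y_test = train_test_split(
    train[FEATURE_COLS], train["lapsed"],
    test_size=0.25, random_state=42, stratify=train["lapsed"],
)
model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))

# Score donors who are active today, using features as of the latest date
current = features_as_of(donations, latest)
current["risk_score"] = model.predict_proba(current[FEATURE_COLS])[:, 1] * 100
print(current.sort_values("risk_score", ascending=False).head())
```

Here recency_days is safe to keep as a predictor because it is measured at the cutoff, while the label comes from the period after it. The synthetic donors are deliberately simple, so the printed scores only show that the pipeline runs; how well it works has to be judged on your own data.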