Extract key facts from case notes
The problem
You've got years of case notes in free-text format: 'Spoke to client about housing situation. Referred to partner org for debt advice. Client mentioned feeling stressed.' There's valuable information buried in there, but you can't query it. You can't answer questions like 'how many people with housing issues also have debt problems?' or 'which risk factors appear most often before a crisis?'
The solution
Use an LLM to read your case notes and extract structured facts: dates, interventions provided, services mentioned, risk indicators flagged, needs identified, outcomes noted. What was narrative becomes a database you can use to query for patterns, track intervention pathways, and spot early warning signs across your caseload.
What you get
A structured database with rows for each case note and columns for: client ID, date, interventions mentioned, needs identified, risk factors present, outcomes recorded, services accessed. You can now query 'show me all cases with housing + mental health needs' or 'what interventions precede positive outcomes?'
Before you start
- Case notes exported as text (CSV with one note per row)
- Defined categories for what you want to extract (your services, risk factors, needs taxonomy)
- An OpenAI or Anthropic API key for batch processing
- Data protection approval to use AI tools with case data
When to use this
- You've got hundreds or thousands of unstructured case notes
- You need to answer questions that require querying across all cases
- You want to identify patterns in service pathways or risk factors
- Manual coding of notes would take longer than you have
When not to use this
- Your case notes are already structured data in your CRM
- You have very few cases; manual coding might be quicker
- The notes are too brief or inconsistent for extraction to work
- Data protection policy doesn't permit AI processing of case notes
Steps
1. Check data protection permissions
Before extracting anything, check with your data protection lead whether you can process case notes through AI tools. You may need to anonymise heavily first (removing names, specific locations, etc.). Some organisations will say no - respect that boundary.
2. Define your extraction categories
List what you want to pull out: service types you offer, common needs categories, risk indicators you track, intervention types, outcome measures. Be specific. 'Mental health mentioned' is better than 'wellbeing'. Create a controlled vocabulary so extraction is consistent.
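One way to keep extraction consistent is a controlled vocabulary that maps each canonical category to the phrasings your caseworkers actually write. A minimal sketch; the terms below are illustrative, not a recommended taxonomy:

# Canonical category names mapped to the synonyms you expect in notes
# (all illustrative - replace with your own taxonomy).
controlled_vocabulary = {
    "mental health": ["stressed", "anxious", "low mood", "depression"],
    "housing": ["eviction", "temporary accommodation", "sofa surfing", "landlord dispute"],
    "debt": ["arrears", "bailiffs", "payday loan", "council tax debt"],
}

# Feeding the synonyms into your extraction prompt helps the model map
# varied wording onto your fixed category names.
for canonical, synonyms in controlled_vocabulary.items():
    print(f"{canonical}: also watch for {', '.join(synonyms)}")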
3. Anonymise your notes
Remove or replace client names, addresses, specific identifying details. You want the factual content (what was discussed, what services offered) not the personal identifiers. The extraction works on anonymised text just as well.
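A regex pass can strip the most obvious identifiers before anything leaves your systems. This is a rough first-pass sketch with illustrative patterns; it won't catch names, so pair it with manual review or a dedicated anonymisation tool:

import re

def anonymise(text):
    # UK-style postcodes, e.g. SW1A 1AA
    text = re.sub(r'\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b', '[POSTCODE]', text)
    # Phone numbers (deliberately loose pattern)
    text = re.sub(r'(?:\+44|0)[\d\s]{8,12}\d', '[PHONE]', text)
    # Email addresses
    text = re.sub(r'\b[\w.+-]+@[\w-]+\.[\w.]+\b', '[EMAIL]', text)
    return text

print(anonymise("Client at SW1A 1AA, call 07700 900123 or jo@example.com"))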
4. Test extraction on a sample
Take 20-30 notes and run them through the extraction prompt manually in Claude or ChatGPT. Does it identify the right categories? Does it miss important details? Does it hallucinate things that aren't there? Refine your prompt based on what you learn.
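If you would rather test from code than paste notes into a chat window, a short loop does the same job. This sketch assumes the extract_from_note() function and notes_df dataframe defined in the example code further down:

import json

# Run the extraction on a small random sample and eyeball the output
# alongside the original notes. Assumes extract_from_note() and notes_df
# from the example code below.
sample = notes_df.sample(20, random_state=1)
for _, row in sample.iterrows():
    print(row['note_text'][:120])
    print(json.dumps(extract_from_note(row['note_text']), indent=2))
    print('-' * 60)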
5. Build your extraction prompt
Create a prompt that includes your categories and asks for structured JSON output. Tell it to only extract facts present in the note, mark confidence for uncertain extractions, and flag when something's mentioned but details are vague. Include 2-3 example notes with their correct extractions.
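One way to include those worked examples is to append note/extraction pairs to the prompt text. The pair below is invented for illustration; swap in real (anonymised) notes with extractions you have checked by hand:

import json

# Invented few-shot example - replace with checked real extractions.
few_shot = [
    {
        "note": "Client called about rent arrears. Referred to debt advice. Seemed anxious.",
        "extraction": {
            "services_mentioned": ["debt advice"],
            "needs_identified": ["financial hardship"],
            "risk_factors": [],
            "outcomes_noted": ["referred to specialist"],
            "confidence": 85,
            "notes": "Anxiety mentioned but no explicit mental health need recorded",
        },
    },
]

examples_text = "\n\n".join(
    f"Note: {ex['note']}\nCorrect extraction: {json.dumps(ex['extraction'])}"
    for ex in few_shot
)
# Insert examples_text into your prompt just before the case note itself.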
6. Process in batches
Use the API and example code to process all your notes. The code loops through each note, extracts the structured data, and builds a CSV. This might take a few hours for thousands of notes. Monitor a sample to check quality stays consistent.
7. Validate the results
Spot-check extractions against original notes. Do the extracted facts match what's written? Are categories applied consistently? Check a random sample of at least 100 notes. If accuracy is below 85%, revise your prompt and re-run.
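To organise the spot-check, export a random sample that pairs each extraction with its original note, have a reviewer mark each row in a spreadsheet, then compute accuracy from the verdicts. A sketch, assuming both CSVs share a note_id column as in the example code:

import pandas as pd

notes_df = pd.read_csv('case_notes.csv')
output_df = pd.read_csv('extracted_facts.csv')

# Pair extractions with the source text so a reviewer can check them side by side.
review = output_df.merge(notes_df[['note_id', 'note_text']], on='note_id')
review.sample(100, random_state=1).assign(reviewer_verdict='').to_csv(
    'validation_sample.csv', index=False)

# After a reviewer fills in 'correct' / 'incorrect' verdicts, read them back:
checked = pd.read_csv('validation_sample.csv')
accuracy = (checked['reviewer_verdict'] == 'correct').mean()
print(f"Accuracy: {accuracy:.0%} - revise the prompt and re-run if below 85%")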
8. Start querying your new database (optional)
Now you can ask questions like: 'How many cases mentioned housing + debt together?', 'What's the typical pathway for mental health referrals?', 'Which risk factors most commonly precede crisis?'. Use spreadsheet filters or simple queries to find patterns.
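With the flattened CSV, pandas string filters answer these questions directly. The column names below match the output of the example code:

import pandas as pd

df = pd.read_csv('extracted_facts.csv')

# Cases mentioning both housing and debt (list columns were flattened
# to comma-separated strings, so substring matching works).
both = df[df['services'].str.contains('housing', case=False, na=False)
          & df['services'].str.contains('debt', case=False, na=False)]
print(f"Housing + debt cases: {len(both)}")

# Most common risk factors across the whole caseload.
risks = df['risks'].dropna().str.split(', ').explode()
print(risks[risks != ''].value_counts().head(10))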
Example code
Extract structured facts from case notes
This processes case notes to extract structured information. Adapt the categories to match what you track.
from openai import OpenAI
import pandas as pd
import json
import time

client = OpenAI()

# Your categories to extract - adapt these
categories = {
    "services": ["housing support", "debt advice", "mental health", "food parcels", "benefits advice"],
    "needs": ["emergency accommodation", "financial hardship", "mental health crisis", "addiction support"],
    "risk_factors": ["eviction notice", "domestic abuse", "suicide ideation", "child protection concerns"],
    "outcomes": ["crisis averted", "referred to specialist", "ongoing support", "case closed"]
}

def extract_from_note(note_text):
    prompt = f"""Extract structured information from this case note.

Categories to look for:
{json.dumps(categories, indent=2)}

Return JSON with:
- services_mentioned: list of services from the categories (only if explicitly mentioned)
- needs_identified: list of needs from the categories
- risk_factors: list of risk factors present
- outcomes_noted: list of outcomes mentioned
- confidence: overall confidence in extraction (0-100)
- notes: any important context or uncertainty

Only extract information explicitly present in the note. Don't infer or guess.

Case note:
{note_text}

Return only valid JSON."""

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)

# Load case notes
notes_df = pd.read_csv('case_notes.csv')
print(f"Processing {len(notes_df)} case notes...")

results = []
for idx, row in notes_df.iterrows():
    if idx % 50 == 0:
        print(f"Progress: {idx}/{len(notes_df)}")
    try:
        extraction = extract_from_note(row['note_text'])
        # Flatten list fields into comma-separated strings for CSV
        results.append({
            'note_id': row.get('note_id', idx),
            'client_id': row.get('client_id'),
            'date': row.get('date'),
            'services': ', '.join(extraction.get('services_mentioned', [])),
            'needs': ', '.join(extraction.get('needs_identified', [])),
            'risks': ', '.join(extraction.get('risk_factors', [])),
            'outcomes': ', '.join(extraction.get('outcomes_noted', [])),
            'confidence': extraction.get('confidence'),
            'extraction_notes': extraction.get('notes')
        })
        time.sleep(0.2)  # Rate limiting
    except Exception as e:
        print(f"Error processing note {idx}: {e}")
        results.append({
            'note_id': row.get('note_id', idx),
            'error': str(e)
        })

# Save results
output_df = pd.DataFrame(results)
output_df.to_csv('extracted_facts.csv', index=False)
print(f"\nExtraction complete. Processed {len(results)} notes")
print(f"Average confidence: {output_df['confidence'].mean():.1f}%")
print(f"\nLow confidence notes (< 70%) for review: {len(output_df[output_df['confidence'] < 70])}")

# Summary statistics
print("\nMost common services mentioned:")
all_services = [s for services in output_df['services'].dropna() for s in services.split(', ') if s]
print(pd.Series(all_services).value_counts().head(10))
At a glance
- Time to implement: weeks
- Setup cost: low
- Ongoing cost: low
- Cost trend: decreasing
- Organisation size: medium, large
- Target audience: data-analyst, operations-manager, program-delivery
API costs are ~£0.001-0.01 per note depending on length. For 10,000 notes that's £10-100. Main cost is setup and validation time.