
Process documents in bulk with LLM APIs

operations · intermediate · proven

The problem

You've got 100 grant applications to review, 50 case notes to summarise, or 200 beneficiary feedback forms to extract themes from. Each is a separate document (PDF, Word doc, text file). Reading and analysing them manually would take weeks. You need to apply consistent AI analysis across all documents, but copying each into Claude.ai is still too slow.

The solution

Write a script that loops through a folder of documents, extracts the text, sends it to an LLM API with a consistent prompt (e.g., 'score this grant application' or 'summarise this case note'), and saves the structured results to a spreadsheet. This is the batch processing pattern for documents instead of CSV rows.

What you get

A CSV file with one row per document containing AI-generated analysis. For grant applications: score, strengths, weaknesses, recommendation. For case notes: summary, key issues, follow-up actions. For feedback forms: themes, sentiment, key quotes. A typical run processes 50-200 documents in 20-60 minutes.

Before you start

  • Folder of documents to process (PDF, DOCX, or TXT files)
  • API key from OpenAI or Anthropic (budget £10-50 depending on volume)
  • Clear rubric: what should the AI extract or assess from each document?
  • Python environment (Google Colab works, or local installation)
  • Documents are in English or another language supported by the LLM

When to use this

  • You have 20+ documents that need the same analysis applied
  • Documents are in standard formats (PDF, Word, plain text)
  • The analysis is well-defined (score against rubric, extract specific information, summarise)
  • Manual reading would take days or weeks
  • You can tolerate 85-95% accuracy with human review of edge cases

When not to use this

  • Fewer than 20 documents (manual processing is faster)
  • Documents are scanned images without OCR (need to extract text first)
  • Each document needs highly customised assessment (not a standard template)
  • Documents contain highly sensitive information you can't send to external APIs
  • You need 100% accuracy (always budget for human review)
  • Documents are very long (>100 pages), which may exceed API context limits

Steps

  1. Create your assessment rubric or extraction template

    Define exactly what you want from each document. For grant applications: 'Score 1-10 on: alignment with our priorities, feasibility, budget reasonableness. List 3 strengths and 3 weaknesses. Recommend: fund, maybe, reject.' For case notes: 'Summarise in 2-3 sentences, list key issues flagged, suggest follow-up actions.' Test this rubric manually on 3 documents to ensure it works.

  2. Prepare your documents in one folder

    Put all documents to process in a single folder. Use clear filenames (e.g., 'grant-app-001.pdf', 'case-note-2024-01-15.docx'). If documents are scanned PDFs without text, you'll need to OCR them first (separate step). Organise subfolders if you have different document types needing different prompts.
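    Before going further, it can help to inventory the folder by file type so you know what the pipeline will encounter - a minimal sketch (the folder path is a placeholder):

    import os
    from collections import Counter

    DOCS_FOLDER = './grant-applications'  # placeholder path

    # Count files by extension to see what the pipeline will need to handle
    counts = Counter(os.path.splitext(f)[1].lower() for f in os.listdir(DOCS_FOLDER))
    for ext, n in sorted(counts.items()):
        print(f"{ext or '(no extension)'}: {n} file(s)")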

  3. Set up code with text extraction

    Use the example code below. It handles PDF, DOCX, and TXT files. For PDFs: uses PyPDF2 to extract text. For DOCX: uses python-docx. For TXT: reads directly. Test text extraction on 3 documents first - check the extracted text looks right (no garbled characters, structure preserved). Some complex PDFs extract poorly.
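    Once the example code's DOCS_FOLDER and extract_text_from_pdf are defined, a spot-check like this sketch catches garbled extractions before you spend API credits:

    # Spot-check extraction quality on the first 3 PDFs before calling the API
    sample = [f for f in os.listdir(DOCS_FOLDER) if f.lower().endswith('.pdf')][:3]
    for pdf_file in sample:
        text = extract_text_from_pdf(os.path.join(DOCS_FOLDER, pdf_file))
        print(f"--- {pdf_file} ({len(text or '')} characters) ---")
        print((text or 'EXTRACTION FAILED')[:500])  # eyeball for garbled text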

  4. Test your prompt on 5 documents

    Modify the code to process only 5 documents (the example code already includes this limit). Run it. Check outputs carefully: does the AI follow your rubric? Are scores reasonable? Is extracted information accurate? Common issues: the AI misses key details in long documents, scores are too generous, or formatting is inconsistent. Refine the prompt and re-test.
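    A quick validation pass over the test outputs makes problems easier to spot - a sketch, assuming the results list and JSON keys from the grant-scoring example below:

    # Check scores are in range and recommendations come from the allowed set
    ALLOWED = {'FUND', 'MAYBE', 'REJECT'}
    for row in results:
        for key in ('alignment_score', 'feasibility_score', 'budget_score'):
            score = row.get(key)
            if not (isinstance(score, (int, float)) and 1 <= score <= 10):
                print(f"{row['filename']}: suspicious {key} = {score}")
        if row.get('recommendation') not in ALLOWED:
            print(f"{row['filename']}: unexpected recommendation = {row.get('recommendation')}")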

  5. Run the full batch

    Remove the document limit and run on all documents. For 100 documents this typically takes 30-60 minutes depending on length and API speed. The code shows progress and handles rate limits. When complete, you'll have a CSV with one row per document containing the AI analysis.
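    The fixed one-second pause in the example code is usually enough, but if a large batch still hits rate-limit errors, a retry wrapper with exponential backoff is more robust - a sketch (call_api is a placeholder for either API call):

    import time

    def with_retries(call_api, max_attempts=4):
        """Retry a flaky API call with exponential backoff (1s, 2s, 4s...)."""
        for attempt in range(max_attempts):
            try:
                return call_api()
            except Exception as e:
                if attempt == max_attempts - 1:
                    raise
                wait = 2 ** attempt
                print(f"Attempt {attempt + 1} failed ({e}); retrying in {wait}s")
                time.sleep(wait)

    # Usage: assessment = with_retries(lambda: assess_document(client, text, pdf_file))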

  6. Review outputs and flag anomalies

    Open the results CSV. Sort by AI scores or key fields. Spot-check 15-20 documents against the original: is the analysis accurate? Flag any documents where the AI clearly failed (extracted nonsense, missed key information, contradicted itself). Review these manually.
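    pandas makes the review quicker - a sketch, assuming the grant-scores CSV produced by the example code:

    import pandas as pd

    df = pd.read_csv('grant_scores.csv')

    # Documents that errored out - review these manually
    print(df[df['error'].notna()][['filename', 'error']])

    # Extreme scores are where the AI is most likely to have gone wrong
    ranked = df.sort_values('alignment_score')
    print(ranked.head(5)[['filename', 'alignment_score', 'recommendation']])
    print(ranked.tail(5)[['filename', 'alignment_score', 'recommendation']])

    # Random sample of filenames to spot-check against the originals
    print(df.sample(min(15, len(df)))['filename'].tolist())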

  7. Handle failed extractions (optional)

    Some documents might have failed text extraction (complex PDFs, protected documents, corrupt files). The code logs these. For failed documents: try opening in Word and saving as plain text, or use OCR if they're scanned images. Re-run just the failed documents.
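    To re-run only the failures, filter the results CSV for rows with errors and point the main loop at those filenames - a sketch, assuming the output columns from the grant-scoring example:

    import pandas as pd

    df = pd.read_csv('grant_scores.csv')

    # Filenames that failed extraction or assessment on the first pass
    failed = df.loc[df['error'].notna(), 'filename'].tolist()
    print(f"{len(failed)} documents to retry: {failed}")

    # In the main script, use this list instead of os.listdir(DOCS_FOLDER)
    pdf_files = failed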

Example code

Process grant applications in bulk

Batch process grant applications against a scoring rubric. Install: pip install openai PyPDF2 pandas tqdm

import os
import json
import time

import pandas as pd
from openai import OpenAI
from PyPDF2 import PdfReader
from tqdm import tqdm

# Configuration
API_KEY = 'your-openai-api-key'  # better: load from an environment variable
DOCS_FOLDER = './grant-applications'  # Folder with PDFs
OUTPUT_CSV = 'grant_scores.csv'

# Your assessment rubric
ASSESSMENT_PROMPT = """You are reviewing a grant application. Assess it using this rubric:

1. Alignment with our priorities (1-10 score): How well does it match our funding criteria?
2. Feasibility (1-10 score): Is the project realistic and achievable?
3. Budget quality (1-10 score): Is the budget clear, reasonable, and well-justified?

Also provide:
- 3 key strengths (bullet points)
- 3 key weaknesses (bullet points)
- Overall recommendation: FUND, MAYBE, or REJECT
- Brief rationale (2-3 sentences)

Return as JSON:
{
  "alignment_score": 7,
  "feasibility_score": 8,
  "budget_score": 6,
  "strengths": ["...", "...", "..."],
  "weaknesses": ["...", "...", "..."],
  "recommendation": "MAYBE",
  "rationale": "..."
}"""

def extract_text_from_pdf(pdf_path):
    """Extract text from a PDF file; returns None if extraction fails."""
    try:
        reader = PdfReader(pdf_path)
        text = ""
        for page in reader.pages:
            # extract_text() can return None for image-only pages
            text += page.extract_text() or ""
        return text.strip() or None
    except Exception:
        return None

def assess_document(client, text, filename):
    """Send document text to OpenAI for assessment; returns a parsed JSON dict."""
    try:
        response = client.chat.completions.create(
            model='gpt-4o-mini',
            messages=[
                {'role': 'system', 'content': ASSESSMENT_PROMPT},
                {'role': 'user', 'content': f"Document: {filename}\n\n{text}"}
            ],
            temperature=0.3,
            max_tokens=1000,
            # Ask for a JSON object so json.loads() below is reliable
            response_format={'type': 'json_object'}
        )
        return json.loads(response.choices[0].message.content)

    except Exception as e:
        return {'error': str(e)}

# Initialize the OpenAI client
client = OpenAI(api_key=API_KEY)
results = []

# Get all PDF files
pdf_files = sorted(f for f in os.listdir(DOCS_FOLDER) if f.lower().endswith('.pdf'))

# For testing: process first 5 only
# Remove this line when ready for full batch
pdf_files = pdf_files[:5]

print(f"Processing {len(pdf_files)} documents...")

# Process each document
for pdf_file in tqdm(pdf_files):
    pdf_path = os.path.join(DOCS_FOLDER, pdf_file)

    # Extract text
    print(f"\nExtracting text from {pdf_file}...")
    text = extract_text_from_pdf(pdf_path)

    if not text:
        print(f"  Failed to extract text from {pdf_file}")
        results.append({
            'filename': pdf_file,
            'error': 'Text extraction failed'
        })
        continue

    # Assess document
    print(f"  Assessing {pdf_file}...")
    assessment = assess_document(client, text, pdf_file)

    # Store results
    results.append({
        'filename': pdf_file,
        'alignment_score': assessment.get('alignment_score'),
        'feasibility_score': assessment.get('feasibility_score'),
        'budget_score': assessment.get('budget_score'),
        'strengths': ' | '.join(assessment.get('strengths', [])),
        'weaknesses': ' | '.join(assessment.get('weaknesses', [])),
        'recommendation': assessment.get('recommendation'),
        'rationale': assessment.get('rationale'),
        'error': assessment.get('error')
    })

    # Rate limiting
    time.sleep(1)

# Save results
df = pd.DataFrame(results)
df.to_csv(OUTPUT_CSV, index=False)

print(f"\nDone! Results saved to {OUTPUT_CSV}")
print(f"Successfully processed: {df['error'].isna().sum()} documents")
print(f"Errors: {df['error'].notna().sum()} documents")

# Show summary statistics
if df['error'].isna().sum() > 0:
    print(f"\nAverage scores:")
    print(f"  Alignment: {df['alignment_score'].mean():.1f}/10")
    print(f"  Feasibility: {df['feasibility_score'].mean():.1f}/10")
    print(f"  Budget: {df['budget_score'].mean():.1f}/10")
    print(f"\nRecommendations:")
    print(df['recommendation'].value_counts())

Summarise case notes in bulk using Claude

Batch summarise case notes with a structured template. Install: pip install anthropic python-docx pandas tqdm

import os
import json
import time

import pandas as pd
import anthropic
from docx import Document
from tqdm import tqdm

# Configuration
API_KEY = 'your-anthropic-api-key'  # better: load from an environment variable
DOCS_FOLDER = './case-notes'  # Folder with DOCX files
OUTPUT_CSV = 'case_summaries.csv'

SUMMARY_PROMPT = """Summarise this case note following this template:

1. Brief summary (2-3 sentences): What happened in this interaction?
2. Key issues identified (bullet list): What concerns or needs were flagged?
3. Actions taken (bullet list): What was done or arranged?
4. Follow-up required (bullet list): What needs to happen next?
5. Urgency (LOW, MEDIUM, HIGH): How urgent is follow-up?

Return only a JSON object with these keys:
{"summary": "...", "key_issues": ["..."], "actions_taken": ["..."], "follow_up": ["..."], "urgency": "LOW"}"""

def extract_text_from_docx(docx_path):
    """Extract text from Word document"""
    try:
        doc = Document(docx_path)
        return '\n'.join([para.text for para in doc.paragraphs])
    except Exception as e:
        return None

def summarise_case_note(text, filename, client):
    """Send case note to Claude for summarisation; returns a parsed JSON dict."""
    try:
        message = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1000,
            temperature=0.3,
            messages=[
                {
                    "role": "user",
                    "content": f"{SUMMARY_PROMPT}\n\nCase note: {filename}\n\n{text}"
                }
            ]
        )

        # Claude sometimes wraps JSON in markdown fences; strip them before parsing
        raw = message.content[0].text.strip()
        if raw.startswith('```'):
            raw = raw.strip('`').removeprefix('json').strip()
        return json.loads(raw)

    except Exception as e:
        return {'error': str(e)}

# Initialize
client = anthropic.Anthropic(api_key=API_KEY)
results = []

# Get all Word docs (skip Word's ~$ lock files, which aren't real documents)
docx_files = sorted(f for f in os.listdir(DOCS_FOLDER)
                    if f.endswith('.docx') and not f.startswith('~$'))

# Test mode: first 5 only. Remove this line when ready for the full batch.
docx_files = docx_files[:5]

print(f"Processing {len(docx_files)} case notes...")

for docx_file in tqdm(docx_files):
    docx_path = os.path.join(DOCS_FOLDER, docx_file)

    # Extract text
    text = extract_text_from_docx(docx_path)

    if not text:
        results.append({'filename': docx_file, 'error': 'Text extraction failed'})
        continue

    # Summarise
    summary = summarise_case_note(text, docx_file, client)

    results.append({
        'filename': docx_file,
        'summary': summary.get('summary'),
        'key_issues': ' | '.join(summary.get('key_issues', [])),
        'actions_taken': ' | '.join(summary.get('actions_taken', [])),
        'follow_up': ' | '.join(summary.get('follow_up', [])),
        'urgency': summary.get('urgency'),
        'error': summary.get('error')
    })

    time.sleep(1)

# Save
df = pd.DataFrame(results)
df.to_csv(OUTPUT_CSV, index=False)
print(f"\nSaved {len(df)} summaries to {OUTPUT_CSV}")

Tools

OpenAI API · service · paid
Anthropic Claude API · service · paid
Python · platform · free · open source
PyPDF2 or pypdf · library · free · open source


At a glance

  • Time to implement: hours
  • Setup cost: low
  • Ongoing cost: low
  • Cost trend: decreasing
  • Organisation size: small, medium, large
  • Target audience: operations-manager, program-delivery, fundraising, data-analyst

Cost depends on document length and model used. Typical grant application (3-5 pages): £0.05-0.10 with GPT-4o-mini, £0.20-0.40 with GPT-4. Claude Opus handles longer documents better but costs more. For 100 documents averaging 4 pages: £5-40 depending on model. Always test with 5 documents first to estimate costs.
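As a sanity check before committing to a full batch, you can estimate token volume and cost with a back-of-envelope calculation - a sketch in which the words-per-page, tokens-per-word, and per-token price figures are all assumptions to replace with your documents and your provider's current pricing:

def estimate_cost(num_docs, avg_pages, price_per_1m_input_tokens,
                  words_per_page=500, tokens_per_word=1.3):
    """Back-of-envelope input cost for a batch; output tokens add a little more."""
    input_tokens = num_docs * avg_pages * words_per_page * tokens_per_word
    return input_tokens * price_per_1m_input_tokens / 1_000_000

# e.g. 100 four-page documents at a hypothetical £0.50 per million input tokens
print(f"~£{estimate_cost(100, 4, 0.50):.2f} for input tokens alone")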


Written by AI Recipes for Charities

Last updated: 2024-12-23